Title
Mining Twitter Data with Resource Constraints
Abstract
Social media analysis is a scientific field that is rapidly gaining ground due to its numerous research challenges and practical applications, as well as the unprecedented availability of data in real time. Several of these applications have significant social and economic impact in areas such as journalism, crisis management, and advertising. However, two issues regarding these applications have to be confronted. The first is the financial cost. Despite the abundance of information, it typically comes at a premium price, and only a fraction is provided free of charge. For example, Twitter, a predominant social media online service, grants researchers and practitioners free access to only a small proportion (1%) of its publicly available stream. The second issue is the computational cost. Even when the full stream is available, off-the-shelf approaches are unable to operate in such settings due to the real-time computational demands. Consequently, real-world applications as well as research efforts that exploit such information are limited to utilizing only a subset of the available data. In this paper, we are interested in evaluating the extent to which analytical processes are affected by this limitation. In particular, we apply a wide range of analysis processes to two subsets of Twitter public data, obtained through the service's sampling APIs. The first is the default 1% sample, whereas the second is the Garden hose sample that our research group has access to, which returns 10% of all public data. We extensively evaluate their relative performance in numerous scenarios.
Year
2014
DOI
10.1109/WI-IAT.2014.29
Venue
IAT), 2014 IEEE/WIC/ACM International Joint Conferences
Keywords
application program interfaces, data mining, pricing, social networking (online), Garden hose sample, Twitter public data, computational cost, financial cost, mining Twitter data, premium price, resource constraints, service sampling API, social media analysis, social media online service
Field
Ontology, World Wide Web, Knowledge representation and reasoning, Social media, Journalism, Sentiment analysis, Computer science, Exploit, Crisis management, Ontology learning
DocType
Conference
Volume
1
ISBN
978-1-4799-4143-8-01
Citations
1
PageRank
0.35
References
23
Authors
4
Name
Order
Citations
PageRank
George Valkanas 1 14811.70
Ioannis Katakis 2 234.24
Dimitrios Gunopulos 3 7171715.85
Anthony Stefanidis 4 41964.37