Title
Tornado: A Distributed Spatio-Textual Stream Processing System.
Abstract
The widespread use of location-aware devices, together with the increased popularity of micro-blogging applications (e.g., Twitter), has led to the creation of large streams of spatio-textual data. To serve real-time applications, the processing of these large-scale spatio-textual streams needs to be distributed. However, existing distributed stream processing systems (e.g., Spark and Storm) are not optimized for spatial/textual content. In this demonstration, we introduce Tornado, a distributed in-memory spatio-textual stream processing server that extends Storm. To efficiently process spatio-textual streams, Tornado introduces a spatio-textual indexing layer to the architecture of Storm. The indexing layer is adaptive, i.e., it dynamically redistributes the processing across the system according to changes in the data distribution and/or query workload. In addition to keywords, higher-level textual concepts are identified and semantically matched against spatio-textual queries. Tornado provides data deduplication and fusion to eliminate redundant textual data. We demonstrate a prototype of Tornado running against real Twitter streams, where users can register continuous or snapshot spatio-textual queries using a map-assisted query interface.
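To make the idea of a continuous spatio-textual query concrete, the following is a minimal single-process sketch. It is not Tornado's implementation or API: the abstract does not describe the index structure, so the uniform grid, the names `Query` and `GridIndex`, the unit-square coordinate space, and the rule that all query keywords must appear in the text are assumptions made purely for illustration. It omits distribution, adaptivity, semantic matching, and deduplication.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    """Hypothetical continuous query: a rectangle plus keywords that must all appear."""
    qid: str
    min_x: float
    min_y: float
    max_x: float
    max_y: float
    keywords: frozenset


class GridIndex:
    """Illustrative uniform grid over the unit square (an assumption, not Tornado's index).

    Each grid cell keeps the list of registered queries whose rectangle overlaps it.
    """

    def __init__(self, cells_per_side=4):
        self.n = cells_per_side
        self.cells = defaultdict(list)

    def _cell(self, v):
        # Clamp so that a coordinate of exactly 1.0 falls in the last cell.
        return min(int(v * self.n), self.n - 1)

    def register(self, q):
        # Attach the query to every cell its rectangle overlaps.
        for cx in range(self._cell(q.min_x), self._cell(q.max_x) + 1):
            for cy in range(self._cell(q.min_y), self._cell(q.max_y) + 1):
                self.cells[(cx, cy)].append(q)

    def match(self, x, y, text):
        # Only queries registered in the object's cell are candidates.
        words = set(text.lower().split())
        hits = []
        for q in self.cells.get((self._cell(x), self._cell(y)), []):
            inside = q.min_x <= x <= q.max_x and q.min_y <= y <= q.max_y
            if inside and q.keywords <= words:
                hits.append(q.qid)
        return hits


index = GridIndex()
index.register(Query("q1", 0.0, 0.0, 0.5, 0.5, frozenset({"coffee"})))
index.register(Query("q2", 0.4, 0.4, 0.9, 0.9, frozenset({"coffee", "free"})))

# A geotagged post that falls inside both rectangles and contains both keywords.
print(index.match(0.45, 0.45, "Free coffee downtown"))  # ['q1', 'q2']
```

The grid keeps matching cheap because each incoming object is checked only against the queries registered in its own cell. In a distributed setting, cells or groups of cells could be assigned to different workers, which is one way to picture the redistribution the abstract mentions; the abstract itself does not specify the mechanism.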
Year
2015
DOI
10.14778/2824032.2824126
Venue
PVLDB
Field
Data deduplication, Data mining, Architecture, Tornado, Spark (mathematics), Computer science, Storm, Search engine indexing, Stream processing, Snapshot (computer storage), Database
DocType
Journal
Volume
8
Issue
12
ISSN
2150-8097
Citations
15
PageRank
0.57
References
6
Authors
10
Name                   Order  Citations  PageRank
Ahmed R. Mahmood       1      54         7.37
Ahmed M. Aly           2      114        12.62
Thamir Qadah           3      18         3.98
El Kindi Rezig         4      22         5.75
Anas Daghistani        5      20         4.37
Amgad Madkour          6      52         6.76
Ahmed S. Abdelhamid    7      21         3.40
Mohamed A. S. Hassan   8      79         19.44
Walid G. Aref          9      4502       419.49
Saleh M. Basalamah     10     89         14.27