Parallel index and query for large scale data analysis - Citegraph

Paper Info

Title
Parallel index and query for large scale data analysis

Abstract
Modern scientific datasets present numerous data management and analysis challenges. State-of-the-art index and query technologies are critical for facilitating interactive exploration of large datasets, but numerous challenges remain in terms of designing a system for processing general scientific datasets. The system needs to be able to run on distributed multi-core platforms, efficiently utilize underlying I/O infrastructure, and scale to massive datasets. We present FastQuery, a novel software framework that addresses these challenges. FastQuery utilizes a state-of-the-art index and query technology (FastBit) and is designed to process massive datasets on modern supercomputing platforms. We apply FastQuery to the processing of a massive 50TB dataset generated by a large-scale accelerator modeling code. We demonstrate the scalability of the tool to 11,520 cores. Motivated by the scientific need to search for interesting particles in this dataset, we use our framework to reduce search time from hours to tens of seconds.

Year	DOI	Venue
2011	10.1145/2063384.2063424	SC
Keywords	Field	DocType
large scale data analysis,query technology,massive datasets,modern scientific datasets,state-of-the-art index,parallel index,modern supercomputing platform,scientific need,large scale accelerator modeling,novel software framework,large datasets,general scientific datasets,indexing,indexation,software framework,parallel processing,data management,data analysis	Supercomputer,Computer science,Parallel computing,Parallel processing,Search engine indexing,Data management,Software framework,Distributed computing,Scalability	Conference
Citations	PageRank	References
38	1.68	14
Authors
10

Authors (10 rows)

Name	Order	Citations	PageRank
Jerry Chou	1	55	3.69
Mark Howison	2	106	9.21
Brian Austin	3	48	4.12
Kesheng Wu	4	1231	108.30
Ji Qiang	5	79	10.07
E. Wes Bethel	6	438	39.76
Arie Shoshani	7	1701	675.01
Oliver Rübel	8	103	11.78
Prabhat	9	456	34.79
Rob D. Ryne	10	38	1.68
