Title
Three Case Studies of Large-Scale Data Flows
Abstract
We survey three examples of large-scale scientific workflows that we are working with at Cornell: the Arecibo sky survey, the CLEO high-energy particle physics experiment, and the Web Lab project for enabling social science studies of the Internet. All three projects face the same general challenges: massive amounts of raw data, expensive processing steps, and the requirement to make raw data or data products available to users nation- or world-wide. However, there are several differences that prevent a one-size-fits-all approach to handling their data flows. Instead, current implementations are heavily tuned by domain and data management experts. We describe the three projects, and we outline research issues and opportunities to integrate Grid technology into these workflows.
Year
2006
DOI
10.1109/ICDEW.2006.148
Venue
ICDE Workshops
Keywords
large-scale data flows, grid technology, data management expert, data product, web lab project, large-scale scientific workflows, case studies, data flow, cleo high-energy particle physic, raw data, arecibo sky survey, current implementation, space technology, physics, computer aided software engineering, face, internet, arm, astronomy, computer science
Field
Data science, Data mining, Grid computing, Space technology, Computer science, Raw data, Implementation, Computer-aided software engineering, Data management, Workflow, Database, The Internet
DocType
Conference
ISBN
0-7695-2571-7
Citations
2
PageRank
0.38
References
3
Authors
16
Name               Order  Citations  PageRank
William Y. Arms    1      184        25.03
Selcuk Aya         2      16         1.56
Manuel Calimlim    3      30         3.21
James M. Cordes    4      2          0.38
J. S. Deneva       5      7          0.92
P. P. Dmitriev     6      139        11.77
Johannes Gehrke    7      13362      1055.06
Gehrke             8      2          0.38
L. Gibbons         9      9          0.90
C. D. Jones        10     9          0.90
V. E. Kuznetsov    11     2          0.72
David Lifka        12     2          0.38
Mirek Riedewald    13     1136       84.31
D. Riley           14     2          0.38
A. Ryd             15     2          0.38
G. J. Sharp        16     125        16.48