Title
Parallel and Distributed Search for Structure in Multivariate Time Series
Abstract
Efficient data mining algorithms are crucial for effective knowledge discovery. We present the Multi-Stream Dependency Detection (MSDD) data mining algorithm, which performs a systematic search for structure in multivariate time series of categorical data. The systematicity of MSDD's search makes implementation of both parallel and distributed versions straightforward. Distributing the search for structure over multiple processors or networked machines makes mining of large numbers of databases or very large databases feasible. We present results showing that MSDD efficiently finds complex structure in multivariate time series, and that the distributed version finds the same structure in approximately $1/n$ of the time required by MSDD, where $n$ is the number of machines across which the search is distributed. MSDD differs from other data mining algorithms in the complexity of the structure that it can find. MSDD also requires no domain knowledge to focus or limit its search, although such knowledge is easily incorporated when it is available.
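The abstract's claim that a systematic search distributes easily can be illustrated with a small sketch. This is not the MSDD algorithm itself: the pattern form (one token or a wildcard per stream), the support-count scoring, the round-robin partition, and all names (`candidate_patterns`, `search`, `distributed_search`, `min_support`) are assumptions made for illustration. The only point shown is that when every candidate is generated exactly once in a fixed order, the candidates can be split across $n$ workers, each worker scores roughly $1/n$ of them, and the merged result matches the sequential one.

```python
from itertools import product

WILDCARD = "*"  # hypothetical "matches anything" token


def candidate_patterns(alphabets):
    """Enumerate every pattern exactly once, in a fixed order."""
    choices = [[WILDCARD] + sorted(a) for a in alphabets]
    return product(*choices)


def matches(pattern, row):
    """True if each pattern slot is a wildcard or equals the row's value."""
    return all(p == WILDCARD or p == v for p, v in zip(pattern, row))


def support(pattern, series):
    """Number of time steps at which the pattern occurs."""
    return sum(matches(pattern, row) for row in series)


def search(series, alphabets, worker=0, n_workers=1, min_support=2):
    """Score only the candidates assigned to one worker (round-robin)."""
    found = {}
    for i, pattern in enumerate(candidate_patterns(alphabets)):
        if i % n_workers != worker:
            continue  # this candidate belongs to another worker
        if all(p == WILDCARD for p in pattern):
            continue  # skip the trivial all-wildcard pattern
        s = support(pattern, series)
        if s >= min_support:
            found[pattern] = s
    return found


def distributed_search(series, alphabets, n_workers):
    """Merge per-worker results; the loop stands in for separate machines."""
    merged = {}
    for w in range(n_workers):
        merged.update(search(series, alphabets, w, n_workers))
    return merged


# Toy series: two categorical streams observed over four time steps.
series = [("a", "x"), ("b", "y"), ("a", "x"), ("a", "y")]
alphabets = [{"a", "b"}, {"x", "y"}]

sequential = search(series, alphabets)
distributed = distributed_search(series, alphabets, n_workers=3)
```

Because the partition depends only on each candidate's position in the enumeration, the workers need no communication until the final merge. That independence is what makes a speedup close to $1/n$ plausible in a sketch like this one.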
Year
1997
DOI
10.1007/3-540-62858-4_84
Venue
ECML
Keywords
multivariate time series, systematic search, complex structure, efficient data mining algorithm, large databases, effective knowledge discovery, large number, categorical data, domain knowledge, data mining algorithm
DocType
Conference
Volume
1224
ISBN
3-540-62858-4
Citations
5
PageRank
1.81
References
12
Authors
3
Name, Order, Citations, PageRank
Tim Oates, 1, 1069, 190.77
Matthew D. Schmill, 2, 98, 14.67
Paul R. Cohen, 3, 1927, 460.49