Title
Evaluating cluster preservation in frequent itemset integration for distributed databases.
Abstract
Medical sciences are rapidly emerging as a data-rich discipline where the number of databases and their dimensionality increase exponentially with time. Data integration algorithms often rely upon discovering embedded, useful, and novel relationships between feature attributes that describe the data. Such algorithms require data integration prior to knowledge discovery, which can compromise the timeliness, scalability, robustness, and reliability of the discovered knowledge. Knowledge integration algorithms offer pattern discovery on segmented and distributed databases but require sophisticated methods for pattern merging and for evaluating integration quality. We propose a unique computational framework for discovering and integrating frequent sets of features from distributed databases and then exploiting them for unsupervised learning from the integrated space. Assorted indices of cluster quality are used to assess the accuracy of knowledge merging. The approach preserves significant cluster quality under various cluster distributions and noise conditions. Exhaustive experimentation is performed to further evaluate the scalability and robustness of the proposed methodology.
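The abstract does not spell out how frequent itemsets are merged across sites, so the following is only a minimal sketch of the general idea, not the authors' framework. It assumes each site mines itemsets locally against a relative support threshold and reports raw counts, which a coordinator then sums and re-filters against a global threshold. The function names (`local_frequent_itemsets`, `merge_site_patterns`), the `max_size` cap, and the toy transactions are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def local_frequent_itemsets(transactions, min_support, max_size=2):
    """Count itemsets of up to max_size items and keep the locally frequent ones."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    threshold = min_support * len(transactions)
    return {itemset: n for itemset, n in counts.items() if n >= threshold}


def merge_site_patterns(site_patterns, site_sizes, min_support):
    """Sum reported counts across sites and keep globally frequent itemsets.

    Assumption of this sketch: a site reports only its locally frequent
    itemsets, so each merged count is a lower bound on the true global count.
    """
    merged = Counter()
    for patterns in site_patterns:
        merged.update(patterns)  # Counter.update adds counts key by key
    threshold = min_support * sum(site_sizes)
    return {itemset: n for itemset, n in merged.items() if n >= threshold}


# Hypothetical toy data: two sites holding symptom records.
site_a = [{"fever", "cough"}, {"fever", "cough", "rash"}, {"fever"}, {"cough"}]
site_b = [{"fever", "cough"}, {"rash"}, {"fever", "cough"}, {"fever", "rash"}]

patterns_a = local_frequent_itemsets(site_a, min_support=0.5)
patterns_b = local_frequent_itemsets(site_b, min_support=0.5)
merged = merge_site_patterns(
    [patterns_a, patterns_b], [len(site_a), len(site_b)], min_support=0.5
)
```

Because locally infrequent itemsets go unreported, a merge of this kind can undercount or miss patterns that are frequent only in aggregate, which is one reason integration quality needs separate evaluation, for example by comparing cluster quality before and after merging.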
Year
2011
DOI
10.1007/s10916-010-9512-1
Venue
J. Medical Systems
Keywords
knowledge integration algorithms offer,pattern discovery,cluster preservation,assorted index,cluster quality,knowledge merging,frequent patterns,clustering,quality indices,distributed databases,integration quality,significant cluster quality,various cluster distribution,frequent itemset integration,knowledge discovery,data rich discipline,data integration
Field
Data integration,Data mining,Knowledge integration,Computer science,Robustness (computer science),Unsupervised learning,Knowledge extraction,Distributed database,Cluster analysis,Scalability
DocType
Journal
Volume
35
Issue
5
ISSN
0148-5598
Citations
0
PageRank
0.34
References
11
Authors
3
Name, Order, Citations, PageRank
Sumeet Dua, 1, 275, 24.31
Michael P Dessauer, 2, 0, 0.34
Prerna Sethi, 3, 4, 1.14