Title
Scalability achievements for enumerative biclustering with online partitioning: Case studies involving mixed-attribute datasets
Abstract
Biclustering is a powerful data analysis technique, and its concept is appealing in many domains, such as natural sciences and market basket analysis. To exemplify the wide range of biclustering applications, we can also mention recommender systems, educational data mining, emerging topic detection and counterfeit product detection. In this paper, we further extend RIn-Close_CVC, a biclustering algorithm capable of performing, in numerical datasets, an efficient, complete, correct and non-redundant enumeration of maximal biclusters with constant values on columns. By avoiding a priori partitioning and itemization of the dataset, RIn-Close_CVC implements an online partitioning, which is demonstrated here to lead to more informative biclustering results. The improved algorithm, called RIn-Close_CVC3, is characterized by: a drastic reduction in memory usage; a consistent gain in runtime; the additional ability to handle datasets with missing values; and the new ability to handle attributes characterized by distinct distributions or even mixed data types. Moreover, RIn-Close_CVC3 retains the four attractive properties of RIn-Close_CVC (efficiency, completeness, correctness and non-redundancy), as formally proved here. The experimental results include synthetic and real-world datasets used to perform scalability and sensitivity analyses, as well as a comparison between a priori and online partitioning. As a practical case study, a parsimonious set of relevant and interpretable mixed-attribute-type rules is obtained in the context of supervised descriptive pattern mining.
Year
2021
DOI
10.1016/j.engappai.2020.104147
Venue
Engineering Applications of Artificial Intelligence
Keywords
Enumerative biclustering, Online partitioning of numerical datasets, Efficient enumeration, Quantitative class association rules, Supervised descriptive pattern mining, Mixed-attribute datasets
DocType
Journal
Volume
100
ISSN
0952-1976
Citations
0
PageRank
0.34
References
0
Authors
2
Name
Order
Citations
PageRank
Rosana Veroneze162.51
Fernando J. Von Zuben283181.83