Abstract |
---|
Over the past decade, ever more applications have come with large data sets. However, existing algorithms are not guaranteed to scale well to large data. Averaged n-Dependence Estimators (AnDE) allows flexible learning from out-of-core data by varying the value of $n$ (the number of super-parents), which makes it especially appropriate for learning from large data. In this paper, we propose a sample-based attribute selection technique for AnDE. It requires one additional pass through the training data, during which a multitude of approximate AnDE models are built and efficiently assessed by leave-one-out cross validation. The use of a sample reduces the training time. Experiments on 15 large data sets demonstrate that the proposed technique significantly reduces AnDE's error at the cost of a modest increase in training time. This efficient and scalable out-of-core approach delivers performance superior or comparable to that of typical in-core Bayesian network classifiers. |
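The efficiency claim in the abstract rests on a property of count-based Bayesian classifiers such as AnDE: leave-one-out cross validation needs no retraining, because a held-out instance can be removed from a trained model simply by subtracting its counts. The sketch below is illustrative only, not the authors' code: it uses naive Bayes (AnDE with $n = 0$) as a stand-in for full AnDE, and all function names and the synthetic data are hypothetical.

```python
import numpy as np

def fit_counts(X, y, n_classes, n_vals):
    """One pass over (discretized) data: class counts and per-attribute
    class-conditional counts. AnDE would additionally count super-parent
    value combinations; naive Bayes (n = 0) is used here for brevity."""
    n, d = X.shape
    class_counts = np.zeros(n_classes)
    joint = [np.zeros((n_classes, v)) for v in n_vals]  # joint[a][c, value]
    for i in range(n):
        c = y[i]
        class_counts[c] += 1
        for a in range(d):
            joint[a][c, X[i, a]] += 1
    return class_counts, joint

def loo_log_posterior(xi, yi, class_counts, joint, attrs, n_vals):
    """Score one instance with its own counts subtracted out, so no
    retraining is needed for leave-one-out cross validation."""
    n_classes = len(class_counts)
    n = class_counts.sum() - 1  # training size minus the held-out instance
    scores = np.empty(n_classes)
    for c in range(n_classes):
        cc = class_counts[c] - (c == yi)            # remove instance from its class
        scores[c] = np.log((cc + 1) / (n + n_classes))  # smoothed class prior
        for a in attrs:
            cnt = joint[a][c, xi[a]] - (c == yi)    # remove its attribute counts
            scores[c] += np.log((cnt + 1) / (cc + n_vals[a]))
    return scores

def loo_error(X, y, class_counts, joint, attrs, n_vals):
    """Zero-one leave-one-out error of the model restricted to `attrs`."""
    wrong = sum(loo_log_posterior(X[i], y[i], class_counts, joint,
                                  attrs, n_vals).argmax() != y[i]
                for i in range(len(X)))
    return wrong / len(X)

# Usage: greedy forward selection of attributes on a synthetic sample.
# Counts are gathered once; every candidate attribute subset is then
# scored by the cheap leave-one-out error above.
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(500, 8))
y = (X[:, 0] + X[:, 1] > 2).astype(int)
n_vals = [3] * 8
class_counts, joint = fit_counts(X, y, 2, n_vals)

selected, remaining, best_err = [], list(range(8)), 1.0
while remaining:
    err, a = min((loo_error(X, y, class_counts, joint, selected + [a], n_vals), a)
                 for a in remaining)
    if err >= best_err:
        break
    best_err, selected = err, selected + [a]
    remaining.remove(a)
print("selected attributes:", selected, "LOOCV error:", best_err)
```

Full AnDE would replace the per-attribute counts with joint counts over each combination of $n$ super-parents and average the resulting sub-models, but the leave-one-out count subtraction works identically; running the selection loop on a sample rather than the whole data set is what keeps the extra pass cheap.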
Year | DOI | Venue |
---|---|---|
2017 | 10.1109/TKDE.2016.2608881 | IEEE Transactions on Knowledge and Data Engineering |
Keywords | Field | DocType |
Bayes methods,Training,Training data,Information technology,Australia,Memory management | Training set,Data mining,Data set,Feature selection,Computer science,Memory management,Bayesian network,Artificial intelligence,Cross-validation,Machine learning,Estimator,Scalability | Journal
Volume | Issue | ISSN
---|---|---|
29 | 1 | 1041-4347
Citations | PageRank | References
---|---|---|
1 | 0.35 | 25
Authors |
---|
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Shenglei Chen | 1 | 18 | 4.05 |
Ana M. Martínez | 2 | 47 | 5.78 |
Geoffrey I. Webb | 3 | 99 | 12.05 |
LiMin Wang | 4 | 816 | 48.41 |