Title
Probabilistic modeling of transaction data with applications to profiling, visualization, and prediction
Abstract
Transaction data is ubiquitous in data mining applications. Examples include market basket data in retail commerce, telephone call records in telecommunications, and Web logs of individual page-requests at Web sites. Profiling consists of using historical transaction data on individuals to construct a model of each individual's behavior. Simple profiling techniques such as histograms do not generalize well from sparse transaction data. In this paper we investigate the application of probabilistic mixture models to automatically generate profiles from large volumes of transaction data. In effect, the mixture model represents each individual's behavior as a linear combination of "basis transactions." We evaluate several variations of the model on a large retail transaction data set and show that the proposed model provides improved predictive power over simpler histogram-based techniques, as well as being relatively scalable, interpretable, and flexible. In addition we point to applications in outlier detection, customer ranking, interactive visualization, and so forth. The paper concludes by comparing and relating the proposed framework to other transaction-data modeling techniques such as association rules.
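As a minimal sketch of the "linear combination of basis transactions" idea: assume each basis transaction is a probability distribution over item categories, and each individual has nonnegative weights that sum to one. The function name, the three categories, and all numbers below are hypothetical illustrations, not values or code from the paper.

```python
import numpy as np

def individual_profile(weights, basis):
    """Combine K basis distributions (rows of `basis`) into one profile.

    weights: shape (K,), nonnegative, summing to 1 (assumed).
    basis:   shape (K, C), each row a distribution over C item categories.
    """
    weights = np.asarray(weights, dtype=float)
    basis = np.asarray(basis, dtype=float)
    return weights @ basis  # convex combination of the rows

# Hypothetical example: 2 basis transactions over 3 item categories.
basis = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.1, 0.8]])
weights = np.array([0.25, 0.75])  # illustrative weights for one individual

profile = individual_profile(weights, basis)

# A raw histogram built from a single observed purchase in category 0
# puts zero probability on the other categories; the mixture profile does not.
histogram = np.array([1.0, 0.0, 0.0])
```

Under these assumptions the profile stays a valid probability distribution and assigns nonzero probability to categories the individual has not yet bought from, which is the kind of generalization from sparse data that the abstract contrasts with plain histograms.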
Year
2001
DOI
10.1145/502512.502523
Venue
KDD
Keywords
basis transaction, probabilistic modeling, historical transaction data, sparse transaction data, large retail transaction data, probabilistic mixture model, mixture model, market basket data, transaction data, data mining application, probabilistic model, em algorithm, interactive visualization, association rule, outlier detection, mixture models
Field
Data mining, Visualization, Computer science, Profiling (computer programming), Association rule learning, Interactive visualization, Artificial intelligence, Probabilistic logic, Transaction data, Mixture model, Machine learning, Scalability
DocType
Conference
ISBN
1-58113-391-X
Citations
29
PageRank
2.03
References
5
Authors
3
Name | Order | Citations | PageRank
Igor V. Cadez | 1 | 263 | 29.24
Padhraic Smyth | 2 | 7148 | 1451.38
Heikki Mannila | 3 | 6595 | 1495.69