Abstract |
---|
Mining web traffic data has been addressed in the literature mostly using sequential pattern mining techniques. Recently, a more powerful pattern type, the partial order, was introduced in the hope of providing a more compact result set. A further approach towards this goal, valid for both sequential patterns and partial orders, consists of building a statistical significance test for frequent patterns. Our method is based on probabilistic generative models and provides a direct way to rank the extracted patterns. It leaves open the number of patterns of interest, which depends on the application, but provides an alternative criterion to frequency of occurrence: statistical significance. In this paper, we focus on the construction of an algorithm that calculates the probability of partial orders under a first-order Markov reference model, and we show how to use those probabilities to assess the statistical significance of a set of mined partial orders. |
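
As a rough illustration of the reference-model idea described in the abstract, the sketch below fits a first-order Markov chain to a few toy sessions and scores a partial order by summing the probabilities of its linear extensions. The session data, the function names, and the simplification that a pattern is scored as the first pages of a session are all assumptions made for this example; it is not the algorithm from the paper, which is not reproduced in this record.

```python
from collections import Counter, defaultdict
from itertools import permutations


def fit_markov(sessions):
    """Maximum-likelihood estimate of start and transition probabilities."""
    starts = Counter(s[0] for s in sessions if s)
    counts = defaultdict(Counter)
    for s in sessions:
        for a, b in zip(s, s[1:]):
            counts[a][b] += 1
    total = sum(starts.values())
    init = {a: c / total for a, c in starts.items()}
    trans = {
        a: {b: c / sum(row.values()) for b, c in row.items()}
        for a, row in counts.items()
    }
    return init, trans


def sequence_prob(seq, init, trans):
    """Probability that a session begins with exactly this page sequence."""
    p = init.get(seq[0], 0.0)
    for a, b in zip(seq, seq[1:]):
        p *= trans.get(a, {}).get(b, 0.0)
    return p


def linear_extensions(items, precedes):
    """Yield every ordering of items that respects the (before, after) pairs.

    Brute force over permutations, so only suitable for small patterns.
    """
    for perm in permutations(items):
        pos = {x: i for i, x in enumerate(perm)}
        if all(pos[a] < pos[b] for a, b in precedes):
            yield perm


def partial_order_prob(items, precedes, init, trans):
    """Sum over linear extensions; they are disjoint events for distinct items."""
    return sum(
        sequence_prob(ext, init, trans)
        for ext in linear_extensions(items, precedes)
    )


# Hypothetical toy sessions (not data from the paper).
sessions = [
    ["home", "search", "product"],
    ["home", "product", "search"],
    ["home", "search", "product"],
    ["search", "home", "product"],
]
init, trans = fit_markov(sessions)

# Partial order: "home" precedes both "search" and "product".
items = ["home", "search", "product"]
precedes = [("home", "search"), ("home", "product")]
score = partial_order_prob(items, precedes, init, trans)
print(f"{score:.3f}")  # 0.625 for this toy data
```

A significance test in the spirit of the abstract would compare a pattern's observed frequency against the frequency expected from such a probability; how that comparison is carried out is left open here.
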
Year | DOI | Venue |
---|---|---|
2011 | 10.1109/ICDM.2011.122 | ICDM |
Keywords | Field | DocType |
sequential pattern,markov reference model,ranking web-based partial orders,mining web traffic data,sequential pattern mining technique,compact result set,mined partial order,frequent pattern,partial order,powerful pattern,statistical significance test,statistical significance,markov,probability,data mining,poset,markov processes,statistical testing,ranking,sequential pattern mining,test,reference model,internet,first order,web,pattern | Data mining,Markov process,Reference model,Result set,Ranking,Computer science,Markov chain,Artificial intelligence,Probabilistic logic,Machine learning,Statistical hypothesis testing,Partially ordered set | Conference |
Citations | PageRank | References |
2 | 0.37 | 6 |
Authors |
---|
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Michel Speiser | 1 | 5 | 0.77 |
Gianluca Antonini | 2 | 192 | 13.67 |
A Labbi | 3 | 20 | 4.43 |