Abstract | ||
---|---|---|
This paper describes a propositionalization technique called wordification. Wordification is inspired by text mining and can be seen as a transformation of a relational database into a corpus of documents. Wordification aims at producing simple, easy-to-understand features, acting as words in the transformed bag-of-words representation. As in other propositionalization methods, after the wordification step any propositional data mining algorithm can be applied. The most notable advantage of the presented technique is greater scalability: the propositionalization step is done in time linear in the number of attributes times the number of examples. The paper presents the wordification methodology, implemented in the cloud-based web data mining platform ClowdFlows, and describes experiments on two real-life datasets together with a critical comparison to the RSD propositionalization approach. |
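
The transformation described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `wordify`, the `table_attribute_value` word format, the toy `customer` and `order` tables, and the simplification that every table carries the example key directly are all assumptions made for illustration. Each attribute value is visited once, which is in line with the linear-time claim.

```python
from collections import Counter, defaultdict


def wordify(tables, key):
    """Turn a toy relational database into one bag of words per example.

    tables: dict mapping a table name to a list of row dicts.
    key:    name of the column identifying the example each row belongs to
            (assumed here to be present in every table).
    """
    docs = defaultdict(Counter)
    for table_name, rows in tables.items():
        for row in rows:
            doc_id = row[key]
            for attr, value in row.items():
                if attr == key:
                    continue  # the key identifies the document, it is not a word
                # Hypothetical word format: table_attribute_value
                docs[doc_id][f"{table_name}_{attr}_{value}"] += 1
    return dict(docs)


# Hypothetical toy data: two customers and their orders.
tables = {
    "customer": [
        {"customer_id": "c1", "segment": "retail"},
        {"customer_id": "c2", "segment": "business"},
    ],
    "order": [
        {"customer_id": "c1", "item": "book", "channel": "web"},
        {"customer_id": "c1", "item": "pen", "channel": "web"},
        {"customer_id": "c2", "item": "desk", "channel": "phone"},
    ],
}

docs = wordify(tables, "customer_id")
```

Each resulting `Counter` plays the role of a document, so the output could be passed to any standard bag-of-words pipeline before a propositional learner is applied.
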
Year | DOI | Venue |
---|---|---|
2013 | 10.1007/978-3-642-40897-7_10 | DISCOVERY SCIENCE |
Keywords | Field | DocType |
relational data mining, propositionalization, text mining, association rules, classification | Web data mining, Data mining, Relational database, Computer science, Relational data mining, Association rule learning, Artificial intelligence, Data mining algorithm, Machine learning, Cloud computing, Scalability | Conference |
Volume | ISSN | Citations |
8140 | 0302-9743 | 3 |
PageRank | References | Authors |
0.38 | 12 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Matic Perovsek | 1 | 26 | 3.02 |
Anze Vavpetic | 2 | 52 | 6.49 |
Bojan Cestnik | 3 | 716 | 262.57 |
Nada Lavrac | 4 | 2004 | 635.45 |