Abstract | ||
---|---|---|
Hierarchical document classification assigns one or more suitable categories from a hierarchical category space to a document. This paper proposes a new hierarchical document classification method based on a backtracking algorithm. Using the relationships between categories in the category tree, a suitable threshold is found for every category to determine whether a document can be classified into that category. The backtracking algorithm in our hierarchical classification approach effectively solves the problem that a misclassification at a higher level directly leads to a misclassification at a lower level. Moreover, the feature set is selected by integrating information gain with hierarchy information, which accords with the characteristics of a category tree. Experiments show that the method performs well when enough training documents are given. |
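
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of threshold-based top-down classification with backtracking, not the authors' algorithm. The `Category` class, the `classify` function, the example tree, and all scores and thresholds are hypothetical. Per-category scores are assumed to come from some trained classifier (the keywords mention support vector machines) and are supplied here as a plain dictionary.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Category:
    """Hypothetical node of a category tree with its own acceptance threshold."""
    name: str
    threshold: float
    children: List["Category"] = field(default_factory=list)


def classify(node: Category, scores: Dict[str, float],
             path: Optional[List[str]] = None) -> Optional[List[str]]:
    """Return a root-to-leaf path whose categories all pass their thresholds.

    Children are tried in descending score order. If a subtree accepts the
    document at a high level but rejects it at every lower level, the search
    returns None and the caller backtracks to the next sibling.
    """
    path = (path or []) + [node.name]
    if not node.children:
        return path  # reached a leaf category
    ranked = sorted(node.children,
                    key=lambda c: scores.get(c.name, 0.0), reverse=True)
    for child in ranked:
        if scores.get(child.name, 0.0) < child.threshold:
            continue  # document does not fit this category
        result = classify(child, scores, path)
        if result is not None:
            return result
        # dead end below this child: backtrack and try the next sibling
    return None


# Assumed toy tree and scores, for illustration only.
root = Category("root", 0.0, [
    Category("science", 0.5, [Category("physics", 0.6), Category("biology", 0.6)]),
    Category("sports", 0.4, [Category("football", 0.5), Category("tennis", 0.5)]),
])
scores = {"science": 0.7, "sports": 0.6, "physics": 0.3,
          "biology": 0.2, "football": 0.8, "tennis": 0.1}

print(classify(root, scores))
```

In this toy input, a purely greedy top-down pass would commit to "science" because it has the highest top-level score, and then fail at the lower level. The backtracking search instead returns to the root and finds the path through "sports" to "football".
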
Year | DOI | Venue |
---|---|---|
2008 | 10.1109/FSKD.2008.346 | FSKD (2) |
Keywords | Field | DocType |
backtracking,suitable category,backtracking algorithm,hierarchical category space,tree searching,een category,hierarchical document classification,enough training document,hierarchy information,information gain,hierarchical classification approach,feature extraction,feature set selection,category tree,hierarchical document classification method,classification,new hierarchical document classification,document handling,support vector machines,pediatrics,classification algorithms | Data mining,Computer science,Feature set,Artificial intelligence,Hierarchy,Backtracking,Document classification,Pattern recognition,Support vector machine,Information gain,Feature extraction,Statistical classification,Machine learning | Conference |
Volume | ISBN | Citations |
2 | 978-0-7695-3305-6 | 1 |
PageRank | References | Authors |
0.36 | 7 | 5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Cuiling Zhu | 1 | 5 | 1.45 |
Jun Ma | 2 | 1280 | 127.50 |
Dongmei Zhang | 3 | 1439 | 132.94 |
Xiaohui Han | 4 | 17 | 5.41 |
Xiaofei Niu | 5 | 15 | 4.37 |