Abstract | ||
---|---|---|
Information retrieval systems currently in use ignore the structural information of documents and rely instead on indexes extracted from them. Structural information such as font face, font size, indentation, and tables conveys the author's intent and is a primary means of documentation. This paper pays special attention to tables because they are commonly used in documents to make meaning clear, and they are easy to recognize in web documents, which mark them with tags. On the Internet, tables are used both to structure knowledge and to lay out documents. This paper proposes a method that extracts meaningful tables using a decision tree and constructs a dictionary of table indexes, which is then applied to an information retrieval system to improve its accuracy. |
Year | DOI | Venue |
---|---|---|
2003 | 10.1007/3-540-45034-3_18 | IEA/AIE |
Keywords | Field | DocType |
font face,special attention,information retrieval system,additional information,meaningful table,structural information,decision tree,prime mean,font size,table index,indexation | Information system,Decision tree,Data mining,Decision table,Computer science,Artificial intelligence,Vector space model,The Internet,Point (typography),Information retrieval,Font,Documentation,Machine learning | Conference |
Volume | ISSN | ISBN |
2718 | 0302-9743 | 3-540-40455-4 |
Citations | PageRank | References |
1 | 0.39 | 7 |
Authors | ||
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Sungwon Jung | 1 | 320 | 59.65 |
Won-Hee Lee | 2 | 35 | 5.70 |
Sang Kyu Park | 3 | 28 | 12.49 |
Hyuk-Chul Kwon | 4 | 136 | 29.02 |
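
The abstract distinguishes tables that structure knowledge from tables used only for page design. The sketch below illustrates that distinction under stated assumptions. It is not the paper's method. The features (`rows`, `cells`, `th`, `nested`), the thresholds, and the names `TableFeatureParser`, `extract_table_features`, and `is_meaningful` are all hypothetical. The decision tree here is hand-written, whereas the paper presumably learns one from labeled examples.

```python
from html.parser import HTMLParser


class TableFeatureParser(HTMLParser):
    """Collects simple structural features for each top-level <table>."""

    def __init__(self):
        super().__init__()
        self.tables = []      # one feature dict per finished top-level table
        self._depth = 0       # current <table> nesting depth
        self._current = None  # features of the table being parsed

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self._depth += 1
            if self._depth == 1:
                self._current = {"rows": 0, "cells": 0, "th": 0, "nested": 0}
            else:
                self._current["nested"] += 1
        elif self._current is not None and self._depth == 1:
            # Count only rows and cells that belong to the top-level table.
            if tag == "tr":
                self._current["rows"] += 1
            elif tag in ("td", "th"):
                self._current["cells"] += 1
                if tag == "th":
                    self._current["th"] += 1

    def handle_endtag(self, tag):
        if tag == "table" and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                self.tables.append(self._current)
                self._current = None


def extract_table_features(html):
    """Return a list of feature dicts, one per top-level table in `html`."""
    parser = TableFeatureParser()
    parser.feed(html)
    parser.close()
    return parser.tables


def is_meaningful(features):
    """Hand-written decision tree; every threshold is an assumption."""
    if features["nested"] > 0:
        return False          # nested tables suggest page layout
    if features["rows"] < 2 or features["cells"] < 4:
        return False          # too small to carry tabular data
    if features["th"] > 0:
        return True           # explicit header cells suggest data
    return features["cells"] / features["rows"] >= 2
```

A possible usage: call `extract_table_features(page_html)` and keep only the tables for which `is_meaningful` returns `True`. The header cells of those tables would then be candidates for a table-index dictionary.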