Title
Extraction of meaningful tables from the internet using decision trees
Abstract
Information retrieval systems currently in use rely on indexes extracted from documents and fail to consider the documents' structural information. Structural information such as font face, font size, indentation, and tables conveys the author's intended meaning and is a primary means of documentation. This paper pays special attention to tables, because tables are commonly used in documents to make meaning clear and are easily recognized in web documents, which use tags to carry additional information. On the Internet, tables are used both to structure knowledge and to lay out documents. This paper proposes a method that uses a decision tree to extract meaningful tables and constructs a dictionary of table indexes, so that the results can be applied to an information retrieval system to improve its accuracy.
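As an illustration of the idea in the abstract, the following is a minimal sketch of separating meaningful (data) tables from layout tables in HTML. It is not the authors' method: the feature set (row, cell, header-cell, and nested-table counts), the function names `extract_table_features` and `is_meaningful`, and every threshold are assumptions chosen for the example, and the tree is written by hand where the paper would learn one from labeled tables.

```python
from html.parser import HTMLParser


class TableFeatureParser(HTMLParser):
    """Collect simple structural counts for every <table> in a page."""

    def __init__(self):
        super().__init__()
        self.tables = []  # finished tables, in the order their </table> is seen
        self._open = []   # stack of tables currently open (handles nesting)

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            if self._open:
                self._open[-1]["nested_tables"] += 1
            self._open.append(
                {"rows": 0, "cells": 0, "header_cells": 0, "nested_tables": 0}
            )
        elif self._open:
            current = self._open[-1]  # counts go to the innermost open table
            if tag == "tr":
                current["rows"] += 1
            elif tag in ("td", "th"):
                current["cells"] += 1
                if tag == "th":
                    current["header_cells"] += 1

    def handle_endtag(self, tag):
        if tag == "table" and self._open:
            self.tables.append(self._open.pop())


def extract_table_features(html):
    """Return one feature dict per table found in the HTML string."""
    parser = TableFeatureParser()
    parser.feed(html)
    parser.close()
    return parser.tables


def is_meaningful(features):
    """Hand-written decision tree; all thresholds are hypothetical."""
    if features["nested_tables"] > 0:
        return False  # a table that wraps other tables is treated as layout
    if features["rows"] < 2 or features["cells"] < 4:
        return False  # too small to carry tabular data
    if features["header_cells"] > 0:
        return True   # explicit header cells suggest a data table
    return features["cells"] / features["rows"] >= 2  # at least two columns on average


SAMPLE = """
<table><tr><td>
  <table>
    <tr><th>Year</th><th>Venue</th></tr>
    <tr><td>2003</td><td>IEA/AIE</td></tr>
  </table>
</td></tr></table>
"""

# Inner tables close first, so they appear first in the result list.
results = [is_meaningful(f) for f in extract_table_features(SAMPLE)]
```

In this sample the inner table, which has header cells and a regular grid, is kept, while the outer table that only wraps it is rejected as layout. A learned tree would replace the fixed rules in `is_meaningful` with splits chosen from training data.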
Year: 2003
DOI: 10.1007/3-540-45034-3_18
Venue: IEA/AIE
Keywords: font face, special attention, information retrieval system, additional information, meaningful table, structural information, decision tree, prime mean, font size, table index, indexation
Field: Information system, Decision tree, Data mining, Decision table, Computer science, Artificial intelligence, Vector space model, The Internet, Point (typography), Information retrieval, Font, Documentation, Machine learning
DocType: Conference
Volume: 2718
ISSN: 0302-9743
ISBN: 3-540-40455-4
Citations: 1
PageRank: 0.39
References: 7
Authors
4
Name
Order
Citations
PageRank
Sungwon Jung132059.65
Won-Hee Lee2355.70
Sang Kyu Park32812.49
Hyuk-Chul Kwon413629.02