Title
A Gray Code Based Ordering For Documents On Shelves - Classification For Browsing And Retrieval
Abstract
A document classifier places documents together in a linear arrangement for browsing or high-speed access by human or computerized information retrieval systems. Requirements for document classification and browsing systems are developed from similarity measures, distance measures, and the notion of subject aboutness. A requirement that documents be arranged in decreasing order of similarity as the distance from a given document increases often cannot be met. Based on these requirements, information-theoretic considerations, and the Gray code, a classification system is proposed that can classify documents without human intervention. It provides a theoretical justification for individual classification numbers going from broad to narrow topics when moving from left to right in the classification number. A general measure of classifier performance is developed and used to evaluate experimental results comparing the distance between subject headings assigned to documents given classifications from the proposed system and the Library of Congress Classification (LCC) system. Browsing in libraries, hypertext, and databases is usually considered to be the domain of subject searches. The proposed system can incorporate both classification by subject and by other forms of bibliographic information, allowing for the generalization of browsing to include all features of an information-carrying unit.
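The abstract names the Gray code but does not spell out the construction, so the following is only a minimal sketch of the standard binary reflected Gray code, not the paper's actual procedure. It assumes, purely for illustration, that each document is described by a fixed-width vector of binary subject features, and that a document's shelf position is the rank of that vector in the Gray code sequence. The names `gray_encode`, `gray_decode`, and `shelf_order` are hypothetical.

```python
def gray_encode(n: int) -> int:
    """Return the n-th codeword of the binary reflected Gray code."""
    return n ^ (n >> 1)


def gray_decode(g: int) -> int:
    """Return the rank (position in the Gray sequence) of codeword g."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def shelf_order(docs: dict[str, int]) -> list[str]:
    """Order documents by the Gray-sequence rank of their feature bits.

    Assumption: each value is a bit vector of subject features packed
    into an int; this representation is illustrative, not the paper's.
    """
    return sorted(docs, key=lambda name: gray_decode(docs[name]))


# Consecutive codewords differ in exactly one bit, so neighbours in the
# full sequence differ in exactly one feature.
codes = [gray_encode(i) for i in range(8)]
one_bit_steps = all(
    bin(a ^ b).count("1") == 1 for a, b in zip(codes, codes[1:])
)

# Hypothetical documents with 3-bit feature vectors.
docs = {"e": 0b110, "c": 0b011, "a": 0b000, "d": 0b010, "b": 0b001}
ordered = shelf_order(docs)
```

In this sketch the leftmost bit changes least often along the sequence and the rightmost bit changes most often, which is one way to read the abstract's remark that classification numbers run from broad to narrow topics when moving from left to right.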
Year
1992
DOI
10.1002/(SICI)1097-4571(199205)43:4<312::AID-ASI7>3.0.CO;2-Z
Venue
Journal of the American Society for Information Science
Keywords
classification system, classification, comparative analysis, gray code, information retrieval system, information retrieval, library of congress classification, subject headings
Field
Document classification, Library of Congress Classification, Library classification, Data mining, Information retrieval, Subject access, Computer science, Information science, Cataloging, Automatic indexing, Distance measures
DocType
Journal
Volume
43
Issue
4
ISSN
0002-8231
Citations
16
PageRank
1.10
References
8
Authors
1
Name
Robert M. Losee
Order
1
Citations
276
PageRank
36.01