Title
Text windows and phrases differing by discipline, location in document, and syntactic structure
Abstract
Knowledge of window style, content, location, and grammatical structure may be used to classify documents as originating within a particular discipline or may be used to place a document on a theory vs practice spectrum. This distinction is also studied here using the type-token ratio to differentiate between sublanguages. The statistical significance of windows is computed, based on the presence of terms in titles, abstracts, citations, and section headers, as well as binary-independent and inverse-document-frequency weightings. The characteristics of windows are studied by examining their within-window density and the S concentration, the concentration of terms from various document fields (e.g. title, abstract) in the fulltext. The rate of window occurrences from the beginning to the end of document fulltext differs between academic fields. Different syntactic structures in sublanguages are examined, and their use is considered for discriminating between specific academic disciplines and, more generally, between theory vs practice or knowledge vs applications-oriented documents.
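As an illustration only, the type-token ratio and inverse-document-frequency weighting named in the abstract can be sketched as follows. The function names, the tokenized-list input format, and the window score (mean IDF weight per token) are assumptions made for this sketch; the abstract does not define the paper's within-window density or significance computations, so this should not be read as the author's method.

```python
import math
from collections import Counter


def type_token_ratio(tokens):
    """Distinct word types divided by total tokens; 0.0 for empty input."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def idf_weights(documents):
    """Map each term to log(N / df), where df is the number of
    documents (token lists) that contain the term at least once."""
    n_docs = len(documents)
    doc_freq = Counter(term for doc in documents for term in set(doc))
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


def mean_window_weight(window, weights):
    """Hypothetical window score: mean IDF weight of the tokens in a
    window. Terms without a known weight contribute 0."""
    if not window:
        return 0.0
    return sum(weights.get(tok, 0.0) for tok in window) / len(window)
```

For example, `type_token_ratio("the cat sat on the mat".split())` divides 5 distinct types by 6 tokens. A lower ratio indicates more repetitive vocabulary, which is the kind of signal the abstract describes using to differentiate sublanguages.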
Year: 1996
DOI: 10.1016/S0306-4573(96)00017-9
Venue: Inf. Process. Manage.
Keywords: syntactic structure, text windows, information retrieval, inverse document frequency, spectrum, syntax, statistical significance, classification, mathematical formulas
DocType: Journal
Volume: 32
Issue: 6
ISSN: Information Processing and Management
Citations: 9
PageRank: 1.25
References: 28
Authors: 1
Name
Order
Citations
PageRank
Robert M. Losee 127636.01