Title
Automatic text representation, classification and labeling in European law
Abstract
The huge text archives and retrieval systems for legal information have not yet achieved a representation in the well-known subject-oriented structure of legal commentaries. Content-based classification and text analysis remain a high-priority research topic. In the joint KONTERM, SOM and LabelSOM projects, neural network learning techniques are used to achieve compression rates in classification and analysis similar to those of manual legal indexing. The resulting maps of legal text corpora cluster related documents into units that are described by automatically selected descriptors. Extensive tests with text corpora of European case law have shown the feasibility of this approach. Classification and labeling proved very helpful for legal research. The Growing Hierarchical Self-Organizing Map captures both the general and the specialized features of legal text corpora. Segmenting documents into parts substantially improved the quality of labeling. The next challenge is to move from the tf × idf vector representation to a modified vector representation that takes into account thesauri or ontologies and the learned properties of legal text corpora.
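The abstract refers to a tf × idf vector representation and to self-organizing maps that cluster related documents into units. As a rough illustration only, the following minimal sketch computes tf × idf vectors for tokenized documents and performs one training step on a one-dimensional map. The function names, the logarithmic idf variant, the Gaussian neighborhood and the toy documents are assumptions made for this example; they are not taken from the KONTERM, SOM or LabelSOM projects.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, vectors) with tf x idf weights for tokenized docs.

    Assumed weighting: raw term frequency times log(N / document frequency).
    """
    vocab = sorted({term for doc in docs for term in doc})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append([tf[t] * math.log(n_docs / df[t]) for t in vocab])
    return vocab, vectors


def best_matching_unit(units, vector):
    """Index of the map unit whose weight vector is closest to the input."""
    dists = [sum((w - x) ** 2 for w, x in zip(unit, vector)) for unit in units]
    return dists.index(min(dists))


def som_step(units, vector, learning_rate=0.5, radius=1.0):
    """One training step on a 1-D map: pull units toward the input vector.

    The winning unit moves most; its neighbors move less, scaled by a
    Gaussian of their distance on the map (an assumed neighborhood function).
    """
    bmu = best_matching_unit(units, vector)
    for i, unit in enumerate(units):
        h = math.exp(-((i - bmu) ** 2) / (2 * radius ** 2))
        units[i] = [w + learning_rate * h * (x - w) for w, x in zip(unit, vector)]
    return bmu


# Hypothetical toy corpus, for illustration only.
docs = [["court", "treaty", "member"], ["court", "directive"], ["treaty", "member", "state"]]
vocab, vectors = tfidf_vectors(docs)
units = [[0.0] * len(vocab) for _ in range(2)]
for vec in vectors:
    som_step(units, vec)
```

After training, documents that share a best-matching unit form one cluster. A term that occurs in every document receives weight zero under this idf variant, so it does not influence the mapping.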
Year: 2001
DOI: 10.1145/383535.383544
Venue: ICAIL
Keywords: text analysis, huge text archives, legal text corpus, text corpus, legal text corpora cluster, content-based classification, european law, manual legal indexing, legal research, automatic text representation, legal information, legal commentary, indexation, neural network
Field: Data mining, Computer science, Search engine indexing, Natural language processing, Artificial intelligence, Artificial neural network, Ontology (information science), Text mining, Information retrieval, Segmentation, Legal research, Text corpus, European Union law
DocType: Conference
ISBN: 1-58113-368-5
Citations: 20
PageRank: 0.95
References: 14
Authors: 3
Name, Order, Citations, PageRank
Erich Schweighofer, 1, 250, 32.37
Andreas Rauber, 2, 1925, 216.21
Michael Dittenbach, 3, 297, 26.48