Automatic text representation, classification and labeling in European law - Citegraph

Paper Info

Title
Automatic text representation, classification and labeling in European law

Abstract
The huge text archives and retrieval systems for legal information have not yet achieved the well-known subject-oriented structure of legal commentaries. Content-based classification and text analysis remain high-priority research topics. In the joint KONTERM, SOM and LabelSOM projects, neural network learning techniques are used to achieve compression rates in classification and analysis similar to those of manual legal indexing. The resulting maps of legal text corpora cluster related documents into units that are described with automatically selected descriptors. Extensive tests with text corpora from European case law have shown the feasibility of this approach. Classification and labeling proved very helpful for legal research. The Growing Hierarchical Self-Organizing Map captures both the general and the specific features of legal text corpora. Segmenting documents into parts greatly improved the quality of labeling. The next challenge is to move from the tf × idf vector representation to a modified vector representation that takes thesauri or ontologies into account and reflects learned properties of legal text corpora.
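
For readers unfamiliar with the tf × idf representation named in the abstract, the following is a minimal sketch of how documents can be turned into such vectors. The abstract does not state the exact weighting scheme, so the formula used here (raw term frequency times log(N / df)), the whitespace tokenizer, the function name `tfidf_vectors`, and the three sample sentences are all illustrative assumptions, not the authors' method.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return (vocabulary, vectors) with weight = tf * log(N / df).

    Assumption: this is one common tf-idf variant, chosen for illustration.
    """
    # Naive lowercase whitespace tokenization (an assumption, not from the paper).
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)  # missing terms count as 0
        vectors.append([tf[term] * math.log(n_docs / df[term]) for term in vocab])
    return vocab, vectors

# Hypothetical sample documents, invented for this sketch.
docs = [
    "court ruling on free movement of goods",
    "court ruling on competition law",
    "directive on free movement of workers",
]
vocab, vectors = tfidf_vectors(docs)
```

Under this weighting, a term that occurs in every document (here "on") receives weight zero, while a term confined to one document receives the largest weight. Vectors of this kind are what a self-organizing map would take as input when clustering a corpus.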

Year	DOI	Venue
2001	10.1145/383535.383544	ICAIL
Keywords	Field	DocType
text analysis,huge text archives,legal text corpus,text corpus,legal text corpora cluster,content-based classification,european law,manual legal indexing,legal research,automatic text representation,legal information,legal commentary,indexation,neural network	Data mining,Computer science,Search engine indexing,Natural language processing,Artificial intelligence,Artificial neural network,Ontology (information science),Text mining,Information retrieval,Segmentation,Legal research,Text corpus,European Union law	Conference
ISBN	Citations	PageRank
1-58113-368-5	20	0.95
References	Authors
14	3

Authors (3 rows)

Name	Order	Citations	PageRank
Erich Schweighofer	1	250	32.37
Andreas Rauber	2	1925	216.21
Michael Dittenbach	3	297	26.48
