Title
Text documents streams with improved incremental similarity
Abstract
The research community has devoted significant effort to methods for organizing document collections with the help of Information Retrieval techniques. In this paper, we present several experiments that apply stream analysis methods to streams of text documents. We also present possible architectures for Text Document Stream Organization that rely on incremental algorithms such as Incremental Sparse TF-IDF and Incremental Similarity. Our results show that this architecture achieves significant improvements in the efficiency of grouping similar documents. These improvements matter because the analysis of large amounts of text is widely recognized as a high-dimensional and complex problem in data analysis.
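The abstract names incremental sparse TF-IDF and incremental similarity without describing them. The sketch below is only an illustration of the general idea, not the authors' algorithm: it keeps sparse term counts and document frequencies that are updated as each document arrives, then compares documents by cosine similarity. The class `IncrementalTfidf`, the function `cosine`, the whitespace tokenizer, and the smoothed IDF formula are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


class IncrementalTfidf:
    """Hypothetical helper (not from the paper): sparse TF-IDF statistics
    that are updated one document at a time."""

    def __init__(self):
        self.n_docs = 0
        self.doc_freq = defaultdict(int)  # term -> number of documents containing it
        self.term_counts = {}             # doc_id -> Counter of raw term counts

    def add(self, doc_id, text):
        """Register a newly arrived document; only its own terms are touched."""
        counts = Counter(text.lower().split())  # assumed: naive whitespace tokenizer
        self.term_counts[doc_id] = counts
        self.n_docs += 1
        for term in counts:
            self.doc_freq[term] += 1

    def vector(self, doc_id):
        """Sparse TF-IDF vector (a dict) computed from the current statistics."""
        counts = self.term_counts[doc_id]
        total = sum(counts.values())
        return {
            term: (tf / total)
            * (math.log((1 + self.n_docs) / (1 + self.doc_freq[term])) + 1.0)
            for term, tf in counts.items()
        }


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    if len(u) > len(v):
        u, v = v, u  # iterate over the shorter vector
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Documents arrive one by one, as in a stream.
model = IncrementalTfidf()
model.add("d1", "data streams of text documents")
model.add("d2", "text documents streams and similarity")
model.add("d3", "football match results")

sim_12 = cosine(model.vector("d1"), model.vector("d2"))  # share several terms
sim_13 = cosine(model.vector("d1"), model.vector("d3"))  # share no terms
```

In this simplified version the vectors are recomputed on demand from the running counts. A truly incremental similarity scheme would presumably update only the document pairs affected by each new arrival, which this sketch does not attempt.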
Year: 2021
DOI: 10.1007/s13278-021-00826-z
Venue: SOCIAL NETWORK ANALYSIS AND MINING
Keywords: Incremental sparse TF-IDF, Data streams, Text streams, Incremental similarity, Text documents networks
DocType: Journal
Volume: 11
Issue: 1
ISSN: 1869-5450
Citations: 0
PageRank: 0.34
References: 0
Authors: 5
Name                       Order  Citations  PageRank
Rui Portocarrero Sarmento  1      0          0.34
Douglas de O. Cardoso      2      35         5.19
Kemmily Dearo              3      0          0.34
Pavel Brazdil              4      1214       143.56
João Gama                  5      3785       271.37