Abstract | ||
---|---|---|
Text clustering is the problem of grouping texts so that all texts within a group are similar under some measure (similar author, similar genre, etc.). This task has become very important over the last two decades because of the growing number of text documents in digital form that need to be organized and processed. We investigate a new metric based on the Feature Relation Graph (FRG) for clustering Arabic handwritten texts. This metric has proved effective for text-independent Persian writer identification. We use it to solve the more general problem of text clustering. Pattern-based features are extracted from handwritten texts using Gabor and XGabor filters. The extracted features are represented for each cluster by an FRG. We apply several clustering algorithms in the space of FRGs. Numerical experiments are provided to demonstrate the effectiveness of the proposed metric and to compare the effectiveness of the different algorithms. |
Year | DOI | Venue |
---|---|---|
2015 | 10.1109/ICDAR.2015.7333900 | International Conference on Document Analysis and Recognition |
Field | DocType | ISSN |
Graph,Arabic,Pattern recognition,Document clustering,Persian,Computer science,Natural language processing,Artificial intelligence,Cluster analysis | Conference | 1520-5363 |
Citations | PageRank | References |
0 | 0.34 | 9 |
Authors | ||
2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Vladislav A. Pavlov | 1 | 0 | 0.34 |
Dmitry S. Shalymov | 2 | 6 | 2.90 |