Title
A framework for crime data analysis using relationship among named entities
Abstract
Many crime reports are available online in blogs and newswire sources. Manually annotating this massive volume of reports for crime data analysis is tedious, yet the reports together give an overall picture of crime across the world. This motivates us to propose a framework for crime data analysis based on online reports. The method first extracts the crime reports and identifies the named entities in them. The sequence of context words between each consecutive pair of named entities is termed a crime vector, and it describes the relationship between the two entities. Feature vectors for each entity pair are generated from these crime vectors using the Word2Vec model. The paper considers three types of named entity pairs to support the major crime data analysis tasks, and for each type, the similarity between entity pairs is measured using their respective feature vectors. For each type, a separate weighted graph is generated with entity pairs as vertices and the similarity score between them as the weight of the corresponding edge. Infomap, a graph-based clustering algorithm, is then applied to obtain an optimal set of clusters of entity pairs and a representative entity pair for each cluster. Each cluster is labelled with the relationship, represented by the crime vector, of its representative entity pair. In practice, not all entity pairs in a cluster are contextually similar to their representative entity pair, so the clusters are further partitioned into subclusters using a WordNet-based path similarity measure, which makes the entity pairs in each subcluster more contextually similar than in their original cluster. These subclusters provide various statistical crime information over the time period. The method is evaluated only on crime reports related to crime against women in India. The experimental results demonstrate the effectiveness and superiority of the method compared with other methods for analysing the crime data.
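The first steps described in the abstract (crime vectors, feature vectors, weighted graph) can be sketched in plain Python. This is only an illustrative sketch, not the authors' implementation: the function names, the toy sentence, and the two-dimensional `toy_embeddings` are hypothetical, averaging word vectors is an assumed way of turning a crime vector into a feature vector, and cosine similarity is an assumed similarity measure. A real pipeline would use a trained Word2Vec model and pass the resulting graph to Infomap, which is not shown here.

```python
from itertools import combinations
from math import sqrt


def crime_vectors(tokens, entity_spans):
    """Return (entity_a, entity_b, context_words) for each consecutive entity pair.

    entity_spans holds sorted (start, end) token indices, end exclusive.
    """
    result = []
    for (s1, e1), (s2, e2) in zip(entity_spans, entity_spans[1:]):
        entity_a = " ".join(tokens[s1:e1])
        entity_b = " ".join(tokens[s2:e2])
        result.append((entity_a, entity_b, tokens[e1:s2]))
    return result


def mean_vector(words, embeddings):
    """Average the embeddings of known words (assumed aggregation step)."""
    vecs = [embeddings[w] for w in words if w in embeddings]
    if not vecs:
        return None
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_graph(pair_features):
    """Build weighted edges: vertices are entity pairs, weights are similarities."""
    return {
        (p, q): cosine(pair_features[p], pair_features[q])
        for p, q in combinations(pair_features, 2)
    }


# Hypothetical toy input: a tokenised sentence with three tagged entities.
tokens = "Ravi was arrested in Delhi on Monday".split()
spans = [(0, 1), (4, 5), (6, 7)]
pairs = crime_vectors(tokens, spans)

# Hypothetical stand-in for Word2Vec vectors.
toy_embeddings = {
    "was": [0.1, 0.0],
    "arrested": [0.9, 0.2],
    "in": [0.0, 0.1],
    "on": [0.0, 0.3],
}

features = {}
for entity_a, entity_b, context in pairs:
    vec = mean_vector(context, toy_embeddings)
    if vec is not None:  # skip pairs with no known context words
        features[(entity_a, entity_b)] = vec

graph = similarity_graph(features)
```

Here `pairs` holds the context words between each consecutive entity pair, and `graph` maps each pair of vertices to an edge weight, which is the form of input a graph clustering step would consume.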
Year
2020
DOI
10.1007/s00521-019-04150-8
Venue
Neural Computing and Applications
Keywords
Crime analysis, Online news, Entity recognition, Relation extraction, Paraphrase extraction, Graph-based clustering
DocType
Journal
Volume
32
Issue
12
ISSN
0941-0643
Citations
0
PageRank
0.34
References
0
Authors
4
Name             Order  Citations  PageRank
Priyanka Das     1      0          1.01
Asit Kumar Das   2      73         16.06
Janmenjoy Nayak  3      60         10.40
Danilo Pelusi    4      81         15.27