Title
Gender Classification Models and Feature Impact for Social Media Author Profiling
Abstract
Automatic profiling models infer demographic characteristics of social network users from the content they generate or from their interactions. Because of its business uses (targeted advertising, market studies, and so on), automatic user profiling from social networks has become a popular task. Users' demographic data is also crucial for more socially concerning tasks, such as the automatic early detection of mental disorders. For this type of user analysis, it has been demonstrated that the way users employ language is an essential indicator that contributes to the effectiveness of the models. For this reason, we believe that considering language use in terms of both psycho-linguistic and semantic characteristics is useful for detecting variables such as gender, age, and user origin. A proper selection of features will be critical for the performance of retrieval, classification, and decision-making software systems. In this work, we discuss gender classification as part of the automated profiling task. We present an experimental analysis of the performance of existing gender classification models for automated profiling, based on an external corpus and baselines. We also investigate the role of linguistic characteristics in the models' classification accuracy and their impact on each gender. Following that analysis, we have developed a feature set for gender classification models in social networks that outperforms existing benchmarks in terms of accuracy.
Year
2021
DOI
10.1007/978-3-030-96648-5_12
Venue
EVALUATION OF NOVEL APPROACHES TO SOFTWARE ENGINEERING (ENASE 2021)
Keywords
Gender classification, Author profiling, Feature relevance, Social media
DocType
Conference
Volume
1556
ISSN
1865-0929
Citations
0
PageRank
0.34
References
0
Authors
3
Name, Order, Citations, PageRank
Paloma Piot-Perez-Abadin, 1, 0, 0.68
Patricia Martín-Rodilla, 2, 0, 0.68
Javier Parapar, 3, 2, 1.37