Title
Age and Gender Identification in Unbalanced Social Media
Abstract
People now share a great deal of information on social media in the form of videos, news, photos, posts, likes, etc. This large volume of generated information reflects the opinions, emotions and preferences of the users. For example, Pinterest is a popular social network where users show their interests in the form of pins, which are information units formed by a short text comment and an image. In this research, we study the problem of building a model that characterizes Pinterest users by two demographic variables, age and gender, using the text they post on the network. To do that, we introduce a dataset formed by the English texts of 548,761 pins corresponding to 264 users. This dataset is imbalanced and reflects the actual distribution of gender and age in the social network, with a dominant presence of women over men, and of middle-aged people over young people. With this dataset, we conducted experiments with a range of machine learning models and feature types, evaluated with a set of performance metrics. Our results offer insights into the problem.
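The abstract notes that the dataset is imbalanced and that several performance metrics were considered. The following is a minimal sketch, not the authors' method, of why a single metric can mislead on such data: it scores a majority-class baseline with both accuracy and macro-averaged F1. The toy labels, the 9-to-1 class ratio, and the helper names `accuracy` and `macro_f1` are assumptions made purely for illustration.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so each class counts equally."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels (not from the paper): 9 users of one class, 1 of the other.
y_true = ["female"] * 9 + ["male"]

# Baseline that always predicts the most frequent class.
majority = Counter(y_true).most_common(1)[0][0]
y_pred = [majority] * len(y_true)

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"accuracy={acc:.2f} macro_f1={f1:.2f}")
```

On this toy data the baseline reaches high accuracy while its macro-averaged F1 stays below 0.5, because the minority class is never predicted. This is one reason to report class-sensitive metrics alongside accuracy on imbalanced data.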
Year
2019
DOI
10.1109/CONIELECOMP.2019.8673125
Venue
2019 International Conference on Electronics, Communications and Computers (CONIELECOMP)
Keywords
Pins, Feature extraction, Task analysis, Machine learning, Twitter, Writing
Field
World Wide Web, Social media, Social network, Task analysis, Textual information, Computer science, Control engineering, Feature extraction, Middle age
DocType
Conference
ISSN
2474-9036
ISBN
978-1-7281-1145-2
Citations
0
PageRank
0.34
References
0
Authors
4