Abstract |
---|
In this paper we study the problem of identifying systems that automatically inject non-personal messages into micro-blogging message streams, thus potentially biasing the results of certain information extraction procedures, such as opinion mining and trend analysis. We also study several classes of features, namely features based on the time of posting, the client used to post, the presence of links, user interaction, and writing style. This last class of features, which we introduce here for the first time, proves to be a top performer, achieving accuracy near 90%, on par with the best features previously used for this task. |
Year | DOI | Venue |
---|---|---|
2011 | 10.1007/978-3-642-24769-9_46 | EPIA 2011 |
Keywords | Field | DocType |
non-personal message,best feature,trend analysis,writing style,micro-blogging message stream,certain information extraction procedure,biasing result,user interaction,top performer,last class,ugc,microblogging,spam,user generated content | User-generated content,Data mining,World Wide Web,Social media,Information retrieval,Computer science,Noisy text,Microblogging,Writing style,Information extraction | Conference |
Volume | ISSN | Citations |
7026 | 0302-9743 | 5 |
PageRank | References | Authors |
0.46 | 10 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Gustavo Laboreiro | 1 | 58 | 4.51 |
Luís Sarmento | 2 | 377 | 31.16 |
Eugénio Oliveira | 3 | 974 | 111.00 |