TweetMT: A Parallel Microblog Corpus. - Citegraph

Paper Info

Title
TweetMT: A Parallel Microblog Corpus.

Abstract
We introduce TweetMT, a parallel corpus of tweets in four language pairs that combine five languages (Spanish from/to Basque, Catalan, Galician and Portuguese), all of which have an official status in the Iberian Peninsula. The corpus has been created by combining automatic collection and crowdsourcing approaches, and it is publicly available. It is intended for the development and testing of microtext machine translation systems. In this paper we describe the methodology followed to build the corpus, and present the results of the shared task in which it was tested.

Year	Venue	Field
2016	LREC	Catalan, Computer science, Crowdsourcing, Machine translation, Artificial intelligence, Natural language processing, Corpus linguistics, Social media, Portuguese, Microblogging, Text corpus, Speech recognition, Linguistics
DocType	Citations	PageRank
Conference	1	0.43
References	Authors
9	9

Authors (9 rows)

Name	Order	Citations	PageRank
Iñaki San Vicente	1	31	5.80
Iñaki Alegria	2	231	32.35
Cristina España-Bonet	3	46	14.35
Pablo Gamallo	4	139	29.27
Hugo Gonçalo Oliveira	5	127	27.72
Eva Martínez Garcia	6	8	3.66
Antonio Toral	7	116	10.00
Arkaitz Zubiaga	8	564	42.96
Nora Aranberri	9	20	6.94
