Towards Lexical Encoding of Multi-Word Expressions in Spanish Dialects. - Citegraph

Paper Info

Title
Towards Lexical Encoding of Multi-Word Expressions in Spanish Dialects.

Abstract
This paper describes a pilot study in lexical encoding of multi-word expressions (MWEs) in four Latin American dialects of Spanish: Costa Rican, Colombian, Mexican and Peruvian. We describe the variability of MWE usage across dialects. We adapt an existing data model to a dialect-aware encoding, so as to represent dialect-related specificities while avoiding redundancy in the data common to all dialects. A dozen linguistic properties of MWEs can be expressed in this model, both at the level of a whole MWE and of its individual components. We describe the resulting lexical resource, which contains several dozen MWEs in four dialects, and we propose a method for constructing a web corpus to support crowdsourcing examples of MWE occurrences. The resource is available under an open license and paves the way towards the construction of large-scale dialect-aware language resources, which should prove useful in both traditional and novel NLP applications.

Year	Venue	Keywords
2016	LREC 2016 - TENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION	multi-word expressions,lexical encoding,Spanish dialects
Field	DocType	Citations
Expression (mathematics),Crowdsourcing,Computer science,Redundancy (engineering),Natural language processing,Artificial intelligence,Data model,License,Encoding (memory)	Conference	0
PageRank	References	Authors
0.34	0	5

Authors (5 rows)

Name	Order	Citations	PageRank
Diana Bogantes	1	0	0.34
Eric Rodríguez	2	0	0.34
Alejandro Arauco	3	0	0.34
Alejandro Rodríguez	4	0	0.34
Agata Savary	5	92	19.55
