Title
Technical Infrastructure at Linguistic Data Consortium: Software and Hardware Resources for Linguistic Data Creation
Abstract
The Linguistic Data Consortium (LDC) at the University of Pennsylvania has participated as a data provider in a variety of government-sponsored programs that support the development of Human Language Technologies. As the number of projects has grown, the quantity and variety of the data LDC produces have increased dramatically in recent years. In this paper, we describe the technical infrastructure, both hardware and software, that LDC has built to support these complex, large-scale linguistic data creation efforts. Because it is not possible to cover all aspects of LDC's technical infrastructure in one paper, this paper focuses on recent developments. We also report on our plans for making our custom-built software resources available to the community as open source software, and introduce an initiative to collaborate with software developers outside LDC. We hope that our approaches and software resources will be useful to community members who take on similar challenges.
Year: 2010
Venue: LREC 2010 - SEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION
Keywords: human language technology, software development
Field: Linguistic Data Consortium, Software deployment, Package development process, Computer science, Software peer review, Software construction, Computer hardware, Software quality, Linguistics, Software development, Social software engineering
DocType: Conference
Citations: 1
PageRank: 1.04
References: 7
Authors: 7
Name               Order  Citations  PageRank
Kazuaki Maeda      1      138        34.69
Haejoong Lee       2      105        23.68
Stephen Grimes     3      27         4.56
Jonathan Wright    4      17         5.95
Robert Parker      5      13         2.97
David Lee          6      195        21.40
Andrea Mazzucchi   7      1          1.71