Title
Multilingual plagiarism detection corpus
Abstract
This paper presents a system for generating multilingual corpora that can be used to evaluate the performance of plagiarism detection systems. The implemented method uses parallel language corpora and, because it is scalable, can be applied to any language. The authors collected data from five parallel corpora and enabled corpus generation for Croatian, French, German, Spanish and Italian.
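The abstract does not describe the generation procedure in detail. As a rough illustration only, the sketch below shows one way sentence-aligned parallel data could be turned into a cross-language test case with known ground truth. The function name `generate_case`, its parameters, and the sample sentences are all hypothetical and are not taken from the paper.

```python
import random

def generate_case(parallel_pairs, filler, n_plagiarized, seed=0):
    """Build one synthetic cross-language plagiarism case (illustrative only).

    parallel_pairs: list of (source_sentence, target_sentence) tuples,
                    assumed to come from a sentence-aligned parallel corpus.
    filler:         list of original target-language sentences.
    Returns (source_doc, suspicious_doc, annotations), where annotations maps
    an index in suspicious_doc to the index of its origin in source_doc.
    """
    rng = random.Random(seed)  # seeded so the generated case is reproducible
    picked = rng.sample(parallel_pairs, n_plagiarized)
    source_doc = [src for src, _ in picked]

    # Start from original text, then insert translated sentences at random spots.
    suspicious = [(sentence, None) for sentence in filler]
    for src_idx, (_, tgt) in enumerate(picked):
        pos = rng.randint(0, len(suspicious))
        suspicious.insert(pos, (tgt, src_idx))

    suspicious_doc = [sentence for sentence, _ in suspicious]
    annotations = {
        i: src_idx
        for i, (_, src_idx) in enumerate(suspicious)
        if src_idx is not None
    }
    return source_doc, suspicious_doc, annotations


# Hypothetical English-Croatian sample data, for demonstration only.
pairs = [
    ("The cat sleeps.", "Mačka spava."),
    ("It is raining.", "Pada kiša."),
    ("I read a book.", "Čitam knjigu."),
]
filler = ["Danas je ponedjeljak.", "Volim kavu."]
source_doc, suspicious_doc, annotations = generate_case(pairs, filler, 2, seed=1)
```

Because the annotations record which sentences were inserted and where they came from, a detector's output can be scored against them. Supporting another language under this sketch would only require a different set of aligned pairs.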
Year
2012
Venue
Opatija
Keywords
copy protection, natural language processing, parallel languages, Croatian language, French language, German language, Italian language, Spanish language, corpus generation, multilingual corpora generation, multilingual plagiarism detection corpus, parallel language corpora
Field
Parallel language, Plagiarism detection, Computer science, Parallel corpora, Language identification, Universal Networking Language, Corpus linguistics, Natural language processing, Artificial intelligence, Scalability, German
DocType
Conference
ISBN
978-1-4673-2577-6
Citations
1
PageRank
0.40
References
1
Authors
3
Name              Order  Citations  PageRank
Vedran Juricic    1      1          0.40
Vanja Stefanec    2      1          1.41
Sinisa Bosanac    3      1          0.40