Abstract | ||
---|---|---|
The paper presents a system for generating multilingual corpora that can be used to evaluate the performance of plagiarism detection systems. The implemented method uses parallel language corpora and, because of its scalability, can be applied to any language. The authors collected data from five parallel corpora and enabled corpus generation for Croatian, French, German, Spanish, and Italian. |
Year | Venue | Keywords |
---|---|---|
2012 | Opatija | copy protection, natural language processing, parallel languages, Croatian language, French language, German language, Italian language, Spanish language, corpus generation, multilingual corpora generation, multilingual plagiarism detection corpus, parallel language corpora |
Field | DocType | ISBN |
Parallel language, Plagiarism detection, Computer science, Parallel corpora, Language identification, Universal Networking Language, Corpus linguistics, Natural language processing, Artificial intelligence, Scalability, German | Conference | 978-1-4673-2577-6 |
Citations | PageRank | References |
1 | 0.40 | 1 |
Authors | ||
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Vedran Juricic | 1 | 1 | 0.40 |
Vanja Stefanec | 2 | 1 | 1.41 |
Sinisa Bosanac | 3 | 1 | 0.40 |