Title
Machine Learning Support for EU Funding Project Categorization
Abstract
European Union reallocates its money to their member states using different kinds of funding. EU member states categorize EU funding projects using their own categorization system. While EU prepared an integrated European categorization system, many EU members do not use it in their reports. This hinders a straightforward fiscal analysis. The article aims at an automatic support for categorization of EU funding projects by Machine Learning. The experiments showed that Support Vector Machines (SVM) is the top performance Machine Learning algorithm for this task. We experimented with the SVM classifier and the results disclosed that by employing this approach we can classify EU funding projects using a lexical description better than a baseline (i.e. the classification to a major class). Further, we experienced that the approach using the natural language translator outperforms the approach using the word sense disambiguation. Finally, we investigated the influence of the length of project description on the performance of the classifier. The results showed that while there was a positive correlation between the length of project description and the classifier performance for project descriptions in English, in the case of project description in Non-English languages the classifier performed better for shorter project descriptions. In future, we plan to build a new online application which would use the classifier on the back-end and a user would get a category recommendation on the front-end using a visualization of the EU categorization system.
Year
DOI
Venue
2019
10.1093/comjnl/bxz021
COMPUTER JOURNAL
Keywords
Field
DocType
EU funding projects,open fiscal data,categorization,RDF code list,machine learning
Data science,Categorization,Computer science,Theoretical computer science
Journal
Volume
Issue
ISSN
62
11
0010-4620
Citations 
PageRank 
References 
0
0.34
0
Authors
1
Name
Order
Citations
PageRank
Ondřej Zamazal15014.82