Combining domain-specific heuristics for author name disambiguation - Citegraph

Paper Info

Title
Combining domain-specific heuristics for author name disambiguation

Abstract
Author name disambiguation has been one of the hardest problems faced by digital libraries since their early days. Historically, supervised solutions have empirically outperformed those based on heuristics, but with the burden of having to rely on manually labelled training sets for the learning process. Moreover, most supervised solutions just apply some type of generic machine learning solution and do not exploit specific knowledge about the problem. In this paper, we follow a similar reasoning, but in the opposite direction. Instead of extending an existing supervised solution, we propose a set of carefully designed heuristics and similarity functions and apply supervision only to optimize such parameters for each particular dataset. As our experiments show, the result is a very effective, efficient and practical author name disambiguation method that can be used in many different scenarios.

Year	2014
DOI	10.1109/JCDL.2014.6970165
Venue	Digital Libraries
Keywords	data analysis, digital libraries, learning (artificial intelligence), author name disambiguation, dataset, domain-specific heuristics, generic machine learning solution, heuristics, similarity functions, supervised solutions, Name Disambiguation, Supervised Methods
Field	Training set, Information retrieval, Author name, Computer science, Exploit, Heuristics, Digital library, Name disambiguation
DocType	Conference
ISSN	2575-7865
ISBN	978-1-4799-5569-5
Citations	4
PageRank	0.45
References	15
Authors	4

Authors (4 rows)

Name	Order	Citations	PageRank
Alan Filipe Santana	1	20	1.44
Marcos André Gonçalves	2	20	1.15
Alberto H. F. Laender	3	45	2.32
Anderson Ferreira	4	4	0.45
