Title
Identifying High Value Opportunities for Human in the Loop Lexicon Expansion
Abstract
Many real-world analytics problems examine multiple entities or classes that may appear in a corpus. For example, in a customer satisfaction survey analysis there are over 60 categories of (somewhat overlapping) concerns. Each of these is backed by a lexicon of terminology associated with the concern (e.g., “Easy, user friendly process” or “Process confusing, too many handoffs”). These categories need to be expanded by a subject matter expert, as the terminology is not always straightforward (e.g., “handoffs” may also include “ping-pong” and “hot potato” as relevant terms). But given that subject matter expert time is costly, which of the 60+ lexicons should we expand first? We propose a metric for evaluating an existing set of lexicons and providing guidance on which are likely to benefit most from human-in-the-loop expansion. Using our ranking results, we achieved ≈4× improvement in impact when expanding the first few lexicons from our suggested list, as compared to a random selection.
Year
2019
DOI
10.1145/3308560.3317305
Venue
Companion Proceedings of The 2019 World Wide Web Conference
Field
World Wide Web, Customer satisfaction, Ranking, Terminology, Differential privacy, Information retrieval, Subject-matter expert, Computer science, Lexicon, User Friendly, Analytics
DocType
Conference
ISBN
978-1-4503-6675-5
Citations
0
PageRank
0.34
References
0
Authors
8
Name               Order  Citations  PageRank
Alfredo Alba       1      77         9.87
Chad DeLuca        2      0          0.68
Anna Lisa Gentile  3      200        26.00
Daniel Gruhl       4      2282       434.45
Linda Kato         5      35         5.31
Chris Kau          6      10         2.40
Petar Ristoski     7      256        21.36
Steve Welch        8      5          3.34