Title
Providing consumers with a representative subset from online reviews.
Abstract
Purpose - The purpose of this paper is to find, for consumers, a representative subset of large-scale online reviews. The subset is much smaller than the original set, yet it covers most of the information in the original reviews and contains little redundant information.
Design/methodology/approach - A heuristic approach named RewSel is proposed, which successively selects representatives until their number meets the requirement. To reveal the advantages of the approach, extensive data experiments and a user study are conducted on real data.
Findings - The proposed approach outperforms the benchmarks in terms of coverage and redundancy. Users prefer the representative subsets provided by RewSel. The proposed approach also scales well and is better suited to big data applications.
Research limitations/implications - The paper contributes to the literature on review selection by proposing a heuristic approach that achieves both high coverage and low redundancy. This study can serve as the basis for further analysis of large-scale online reviews.
Practical implications - The proposed approach offers a novel way to select a representative subset of online reviews to facilitate consumer decision making. It can also enhance existing information retrieval systems so that they provide representative information to users rather than a large number of results.
Originality/value - The proposed approach finds the representative subset by adopting the concept of relative entropy together with sentiment analysis methods. Compared with state-of-the-art approaches, it offers a more effective and efficient way for users to handle a large amount of online information.
Year
2017
DOI
10.1108/OIR-05-2016-0125
Venue
ONLINE INFORMATION REVIEW
Keywords
Online reviews, Redundancy, Coverage, Heuristic approach, Representative subset
Field
Data mining, World Wide Web, Heuristic, Information retrieval, Computer science, Sentiment analysis, Originality, Redundancy (engineering), Big data, Kullback–Leibler divergence, Scalability
DocType
Journal
Volume
41
Issue
6
ISSN
1468-4527
Citations
2
PageRank
0.38
References
30
Authors
4
Name            Order   Citations   PageRank
Jin Zhang       1       353         47.68
Ming Ren        2       4           0.76
Xian Xiao       3       108         7.50
Jilong Zhang    4       2           0.38