Title
Clustering for closely similar recipes to extract spam recipes in user-generated recipe sites.
Abstract
Many user-generated recipe sites are now accessible on the internet. These sites, however, contain various spam recipe pages that describe closely similar recipes requiring special cooking equipment, with no preparation explanations. Such spam recipes are not useful to users; in fact, they impede users' recipe searches. In this paper, we target closely similar recipes as a first step toward extracting spam recipes. If search results could be classified to identify closely similar recipes, users' recipe searches would be easier and more productive. Many kinds of clustering tools have been proposed, but it is difficult to cluster closely similar recipes using only existing tools, because recipe sites have a unique page structure comprising a title, ingredients, directions (preparation instructions), and comments, and the importance of words differs from part to part. We propose a clustering method for user-generated recipe sites based on page structure and important words. We then conducted an experiment to measure the benefits of the proposed method. The experimental results show the benefits of the proposed method for classifying closely similar recipes.
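The idea of weighting words by the page part they come from can be illustrated with a minimal sketch. This is not the authors' method, which the abstract does not specify. It assumes hypothetical per-part weights (`FIELD_WEIGHTS`), a plain bag-of-words cosine similarity, a hypothetical similarity threshold, and simple single-link grouping.

```python
import math
import re
from collections import Counter

# Hypothetical weights: the abstract only states that word importance
# differs by page part, not what the actual weights are.
FIELD_WEIGHTS = {"title": 3.0, "ingredients": 2.0, "directions": 1.0, "comments": 0.5}


def weighted_vector(recipe):
    """Build a bag-of-words vector in which each word counts by its part's weight."""
    vec = Counter()
    for field, weight in FIELD_WEIGHTS.items():
        for word in re.findall(r"[a-z]+", recipe.get(field, "").lower()):
            vec[word] += weight
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse word vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(recipes, threshold=0.8):
    """Group recipes whose weighted similarity meets the (assumed) threshold.

    Uses single-link grouping via union-find; returns sorted lists of indices.
    """
    vectors = [weighted_vector(r) for r in recipes]
    parent = list(range(len(recipes)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(recipes)):
        for j in range(i + 1, len(recipes)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(recipes)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

In this sketch, two recipes that share a title and most ingredients end up in one group even if their comments differ, because title and ingredient words carry more weight than comment words.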
Year
2015
DOI
10.1145/2837185.2837269
Venue
iiWAS
Field
Data mining, World Wide Web, Computer science, Recipe, Cluster analysis, The Internet
DocType
Conference
Citations
9
PageRank
0.68
References
8
Authors
3
Name              Order  Citations  PageRank
Shunsuke Hanai    1      9          0.68
Hidetsugu Nanba   2      200        31.95
Akiyo Nadamoto    3      189        34.24