Title
Plant identification with noisy web data
Abstract
A major obstacle in image-based plant identification has been the lack of high-quality training images. This paper proposes several methods that address the problem by building high-quality plant image sets from crowd-sourced Web image collections such as Flickr. The methods automatically identify correct and informative training images among Web images, whose metadata (for example, Flickr user tags) is typically very noisy, and use them to augment an existing manually labeled training set. First, for each plant, a set of images is collected by querying Flickr with the plant name. The images are then grouped into visually consistent clusters, so that ideally most images in a given cluster are either all relevant or all irrelevant to the plant. Next, a managed plant image data set from ImageCLEF serves as a reference for automatically selecting the highest-quality cluster for each plant. The quality of the selected clusters is further improved by two algorithms: an iterative method and image-similarity-based ranking. We show that the larger training set selected automatically by this method significantly increases the accuracy of image-based plant identification. In addition, the approach is generic: it applies to almost any image recognition problem for which additional (noisy) training data can be obtained automatically from the Internet.
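The cluster-then-select pipeline described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the abstract does not say which features, clustering algorithm, or similarity measure are used, so the sketch substitutes plain k-means over precomputed feature vectors and cosine similarity to the mean reference feature. The function names (`kmeans`, `select_and_rank`, and the helpers) are hypothetical, and the paper's iterative refinement step is not modeled.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vec(vectors):
    """Component-wise mean of a non-empty list of feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def kmeans(features, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means (an assumed stand-in); returns one label per image."""
    rng = random.Random(seed)
    centroids = [list(f) for f in rng.sample(features, k)]
    labels = [0] * len(features)
    for _ in range(n_iter):
        # Assign every image to its nearest centroid.
        labels = [min(range(k), key=lambda j: sq_dist(f, centroids[j]))
                  for f in features]
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [f for f, lab in zip(features, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = mean_vec(members)
    return labels


def select_and_rank(web_features, reference_features, k=2):
    """Pick the cluster most similar to the reference set, then rank its images.

    Returns indices into web_features, most reference-like first.
    """
    labels = kmeans(web_features, k)
    ref = mean_vec(reference_features)
    sims = [cosine(f, ref) for f in web_features]

    def cluster_score(j):
        # Mean similarity of a cluster's members to the reference set.
        member_sims = [s for s, lab in zip(sims, labels) if lab == j]
        return sum(member_sims) / len(member_sims) if member_sims else float("-inf")

    best = max(range(k), key=cluster_score)
    chosen = [i for i, lab in enumerate(labels) if lab == best]
    return sorted(chosen, key=lambda i: sims[i], reverse=True)
```

In this sketch, `web_features` would hold one feature vector per noisy Flickr image for a given plant and `reference_features` the vectors of the manually labeled reference images for the same plant; the returned indices could then be truncated to keep only the top-ranked images as extra training data.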
Year
2014
DOI
10.1109/ICME.2014.6890180
Venue
Multimedia and Expo
Keywords
Internet, agricultural engineering, image retrieval, iterative methods, object detection, object recognition, pattern clustering, social networking (online), Flickr, ImageCLEF, Internet, Web image collection, automatic informative training image identification, image based plant identification, image clustering, image quality, image recognition problem, image similarity based ranking, iterative method, labeled training set, noisy Web data, query processing, visually consistent cluster, Image classification, crowd sourced big data, machine learning, plant identification
Field
Computer science, Image retrieval, Image quality, Artificial intelligence, Cluster analysis, The Internet, Computer vision, Metadata, Automatic image annotation, Information retrieval, Ranking, Pattern recognition, Plant identification
DocType
Conference
ISSN
1945-7871
Citations
0
PageRank
0.34
References
5
Authors
2
Name               Order  Citations  PageRank
William Y. Zhang   1      0          0.34
Xian-Sheng Hua     2      6566       328.17