Abstract | ||
---|---|---|
The objective of this work is to automatically generate a large number of images for a specified object class (for example, penguin). A multi-modal approach employing text, metadata, and visual features is used to gather many high-quality images from the web. Candidate images are obtained by a text-based web search querying on the object identifier (the word penguin). The web pages and the images they contain are downloaded. The task is then to remove irrelevant images and re-rank the remainder. First, the images are re-ranked using a Bayes posterior estimator trained on the text surrounding the image and metadata features (such as the image alternative tag, image title tag, and image filename). No visual information is used at this stage. Second, the top-ranked images are used as (noisy) training data and an SVM visual classifier is learnt to improve the ranking further. The principal novelty is in combining text/metadata and visual features in order to achieve a completely automatic ranking of the images. Examples are given for a selection of animals (e.g. camels, sharks, penguins), vehicles (cars, airplanes, bikes) and other classes (guitar, wristwatch), totalling 18 classes. The results are assessed by precision/recall curves on ground truth annotated data and by comparison to previous approaches including those of Berg et al. (5) (on an additional six classes) and Fergus et al. (9). |
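
The two-stage re-ranking described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy data, the per-feature probabilities `p_in` and `p_out`, the caller-supplied background negatives, and the plain subgradient-descent linear SVM are all assumptions made for illustration. The abstract only states that a Bayes posterior estimator ranks images on text and metadata features, and that an SVM visual classifier is then learnt from the top-ranked images.

```python
from math import prod

def text_posterior(features, p_in, p_out, prior=0.5):
    """Naive-Bayes posterior P(in-class | binary text/metadata features).
    p_in / p_out are assumed per-feature probabilities for in-class and
    out-of-class images (hypothetical values, not taken from the paper)."""
    like_in = prod(p if f else 1 - p for f, p in zip(features, p_in))
    like_out = prod(p if f else 1 - p for f, p in zip(features, p_out))
    return like_in * prior / (like_in * prior + like_out * (1 - prior))

def train_linear_svm(xs, ys, lam=0.01, epochs=200, lr=0.1):
    """Linear SVM fitted by subgradient descent on the regularised hinge loss
    (a stand-in for whichever SVM solver and kernel the authors used)."""
    w, b = [0.0] * len(xs[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            violated = margin < 1
            for i in range(len(w)):
                grad = lam * w[i] - (y * x[i] if violated else 0.0)
                w[i] -= lr * grad
            if violated:
                b += lr * y
    return w, b

def svm_score(w, b, x):
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def rank_by_text(images, p_in, p_out):
    """Stage 1: rank on text/metadata only; no visual information is used."""
    return sorted(images, key=lambda im: text_posterior(im["text"], p_in, p_out),
                  reverse=True)

def rerank_visually(text_ranked, n_pos, negatives):
    """Stage 2: treat the top n_pos text-ranked images as noisy positives,
    train a visual classifier, and re-rank every image by its score."""
    pos = [im["visual"] for im in text_ranked[:n_pos]]
    w, b = train_linear_svm(pos + negatives, [1] * len(pos) + [-1] * len(negatives))
    return sorted(text_ranked, key=lambda im: svm_score(w, b, im["visual"]),
                  reverse=True)

# Toy data (hypothetical): "text" = [in alt tag, in title tag, in filename],
# "visual" = a made-up 2-D visual descriptor.
images = [
    {"id": "a", "text": [1, 1, 1], "visual": [1.0, 0.9]},
    {"id": "b", "text": [1, 1, 0], "visual": [0.9, 1.0]},
    {"id": "c", "text": [1, 0, 0], "visual": [-1.0, -0.8]},  # text match, wrong content
    {"id": "d", "text": [0, 0, 1], "visual": [0.8, 0.8]},    # weak text, right content
    {"id": "e", "text": [0, 0, 0], "visual": [-0.9, -1.0]},
]
p_in, p_out = [0.8, 0.7, 0.6], [0.3, 0.2, 0.2]
background = [[-1.0, -1.0], [-0.8, -0.9]]  # assumed out-of-class visual examples

text_ranked = rank_by_text(images, p_in, p_out)
final = rerank_visually(text_ranked, n_pos=2, negatives=background)
print([im["id"] for im in text_ranked])
print([im["id"] for im in final])
```

In this toy setup the text stage ranks image `c` above image `d` because of its metadata match, while the visual stage, trained only on the two top-ranked images and the background negatives, moves `d` ahead of `c`. This mirrors the motivation for combining the two sources of evidence, not the reported experimental behaviour.
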
Year | DOI | Venue |
---|---|---|
2011 | 10.1109/TPAMI.2010.133 | IEEE Transactions on Pattern Analysis and Machine Intelligence |
Keywords | Field | DocType |
Internet,meta data,query processing,search engines,support vector machines,visual databases,SVM visual classifier,Webpages,automatic ranking,cross-validation procedure,ground-truth annotated data,harvesting image databases,high-quality images,metadata features,multimodal approach,noisy training data,object identifier,precision-recall curves,text-based Web search querying,top-ranked images,visual features,word penguin,Weakly supervised,computer vision,image retrieval,object recognition | Object identifier,Computer science,Image retrieval,Artificial intelligence,Contextual image classification,Classifier (linguistics),Metadata,Computer vision,World Wide Web,Information retrieval,Pattern recognition,Ranking,Support vector machine,Ground truth | Journal |
Volume | Issue | ISSN |
33 | 4 | 0162-8828 |
Citations | PageRank | References |
147 | 9.32 | 17 |
Authors | ||
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Florian Schroff | 1 | 757 | 32.72 |
Antonio Criminisi | 2 | 6801 | 394.29 |
Andrew Zisserman | 3 | 45998 | 3200.71 |