Abstract |
---|
As machines have become more intelligent, there has been renewed interest in methods for measuring their intelligence. A common approach is to propose tasks at which humans excel but that machines find difficult. However, an ideal task should also be easy to evaluate and not easily gameable. We begin with a case study exploring the recently popular task of image captioning and its limitations as a task for measuring machine intelligence. An alternative and more promising task is visual question answering, which tests a machine's ability to reason about language and vision. We describe a data set, unprecedented in size and created for the task, that contains more than 760,000 human-generated questions about images. Using around 10 million human-generated answers, researchers can easily evaluate the machines. |
Year | DOI | Venue |
---|---|---|
2016 | 10.1609/aimag.v37i1.2647 | AI MAGAZINE |
DocType | Volume | Issue |
Journal | 37 | 1 |
ISSN | Citations | PageRank |
0738-4602 | 6 | 0.54 |
References | Authors |
23 | 6 |
Name | Order | Citations | PageRank |
---|---|---|---|
C. Lawrence Zitnick | 1 | 7321 | 332.72 |
Aishwarya Agrawal | 2 | 360 | 10.62 |
Stanislaw Antol | 3 | 356 | 10.61 |
Margaret Mitchell | 4 | 1450 | 65.37 |
Dhruv Batra | 5 | 2142 | 104.81 |
Devi Parikh | 6 | 2929 | 132.01 |