Title
Practical computer vision: example techniques and challenges
Abstract
Humans, as well as many living organisms, are gifted with the power of “seeing” and “understanding” the environment around them using their eyes. The ease with which humans process and understand the visual world is very deceiving and often prompts us to underestimate the effort and methods needed to build practical, effective, and inexpensive computer vision systems. In essence, humans have a 500-million-year head start due to evolution; it is extremely difficult at this point to build a computer vision system that has the abilities of a three-year-old child. However, by confining ourselves to particular domains, we can often find shortcuts to solve particular problems. This paper illustrates a number of such solutions in various areas developed by our group at IBM. These include object finding for video surveillance, person identification via biometrics, inspection of manufactured items along railways, and scene understanding for driver assistance, as well as object recognition and motion interpretation for retail stores. We discuss the real-world constraints for each system and describe how we overcame the irksome variability inherent in each task. By further analyzing such successful systems and comparing them to each other, we can come to understand the common underlying problems and thus start to extend our initially limited areas of competence into a more general-purpose vision toolkit. This paper concludes with a set of challenging unresolved problems that if solved could spur great progress in practical computer vision.
Year
2011
DOI
10.1147/JRD.2011.2165676
Venue
IBM Journal of Research and Development
Keywords
500-million-year head, object recognition, inexpensive computer vision system, humans process, particular domain, practical computer vision, general-purpose vision toolkit, computer vision system, example technique, successful system, particular problem
Field
Computer vision, IBM, Computer science, Artificial intelligence, Biometrics, Cognitive neuroscience of visual object recognition, Vision science
DocType
Journal
Volume
55
Issue
5
ISSN
0018-8646
Citations
5
PageRank
0.79
References
17
Authors (10)
Name; Order; Citations; PageRank
Sharath Pankanti; 1; 3542; 292.65
Brown, L.; 2; 240; 13.87
J. Connell; 3; 5; 0.79
A. Datta; 4; 5; 0.79
Quanfu Fan; 5; 504; 32.69
Rogério Feris; 6; 1529; 89.95
N. Haas; 7; 7; 1.19
Yingshu Li; 8; 373; 23.07
N. Ratha; 9; 5; 1.12
Hoang Trinh; 10; 49; 7.09