Applied Computer Vision

Prof. David Vernon
Carnegie Mellon University Africa in Rwanda
vernoncmu.edu


Course Description

This course provides students with a solid foundation in the key elements of computer vision, emphasizing the practical application of the underlying theory. It focusses mainly on the techniques required to build robot vision applications, but the algorithms can also be applied in other domains such as industrial inspection and video surveillance. A key focus of the course is the effective implementation of solutions to practical computer vision problems in a variety of environments, using both bespoke software authored by the students and standard computer vision libraries.

The course covers optics, sensors, image formation, image acquisition, and image representation before proceeding to the essentials of image processing and image filtering. This provides the basis for a treatment of image segmentation, including edge detection, region growing, boundary detection, the Hough transform, and colour-based segmentation.
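To give a flavour of the filtering step that underpins edge detection, the following is a minimal illustrative sketch of 2D convolution in plain Python. It is not taken from the course materials: the function name `convolve2d`, the choice of a Sobel kernel, and the tiny test image are all assumptions made for illustration, and in practice a library routine would normally be used instead.

```python
def convolve2d(image, kernel):
    """Convolve a 2D list-of-lists image with a kernel ('valid' region only).

    Hypothetical helper for illustration; no border padding is applied, so the
    output is smaller than the input.
    """
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    # True convolution flips the kernel in both directions before sliding it.
    flipped = [row[::-1] for row in kernel[::-1]]
    output = []
    for r in range(ih - kh + 1):
        out_row = []
        for c in range(iw - kw + 1):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * flipped[i][j]
            out_row.append(acc)
        output.append(out_row)
    return output


# Horizontal-gradient Sobel kernel (responds to vertical edges).
sobel_x = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]

# Assumed toy image: a dark-to-bright vertical step edge.
step_image = [[0, 0, 10, 10],
              [0, 0, 10, 10],
              [0, 0, 10, 10]]

edges = convolve2d(step_image, sobel_x)
print(edges)
```

Because the kernel is flipped, the sign of the response is the opposite of what plain correlation would give; the magnitude of the response is what marks the edge.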

Building on this, the course then proceeds to deal with object detection and recognition in 2D, addressing template matching, interest point operators, gradient orientation histograms, the SIFT descriptor, and colour histogram intersection and back-projection.
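As an illustration of one of these techniques, the sketch below computes a colour histogram intersection score in plain Python. It is a minimal example under stated assumptions rather than the course's own implementation: the function name `histogram_intersection`, the normalisation by the model histogram, and the four-bin histograms are all hypothetical choices made for clarity.

```python
def histogram_intersection(model, image):
    """Return the intersection of two histograms, normalised by the model.

    Both arguments are sequences of bin counts with the same number of bins.
    A score of 1.0 means every model bin is fully matched in the image.
    """
    if len(model) != len(image):
        raise ValueError("histograms must have the same number of bins")
    total = sum(model)
    if total == 0:
        raise ValueError("model histogram must not be empty")
    # Each bin contributes the count that both histograms have in common.
    return sum(min(m, i) for m, i in zip(model, image)) / total


# Assumed toy histograms with four colour bins each.
model_hist = [4, 0, 2, 2]
image_hist = [2, 1, 2, 5]

score = histogram_intersection(model_hist, image_hist)
print(score)
```

Normalising by the model histogram makes the score insensitive to extra image pixels that fall in bins the model does not use, which is one common reason for choosing this form.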

The problem of recovery of 3D information is then addressed, introducing homogeneous coordinates and transformations, the perspective transformation, camera model, inverse perspective transformation, stereo vision, and epipolar geometry.
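The role of homogeneous coordinates in the perspective transformation can be sketched as follows. This is a minimal illustration, assuming an ideal pinhole camera with its optical centre at the origin, the optical axis along Z, and no lens distortion; the function name `project_point` and the numeric values are hypothetical and not drawn from the course materials.

```python
def project_point(point_3d, focal_length):
    """Project a 3D point onto the image plane of an ideal pinhole camera."""
    x, y, z = point_3d
    homogeneous = (x, y, z, 1.0)  # append 1 to form homogeneous coordinates
    f = focal_length
    # 3x4 perspective projection matrix for the assumed camera model.
    projection = [[f, 0, 0, 0],
                  [0, f, 0, 0],
                  [0, 0, 1, 0]]
    u, v, w = (sum(p * h for p, h in zip(row, homogeneous))
               for row in projection)
    if w == 0:
        raise ValueError("point lies in the plane of the optical centre")
    # Dividing by w converts back from homogeneous to image coordinates.
    return (u / w, v / w)


image_point = project_point((2.0, 4.0, 10.0), focal_length=5.0)
print(image_point)
```

Writing the projection as a matrix product followed by a single division is what allows it to be chained with rotations and translations, which are also matrices in homogeneous form.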

The interpretation of visual information in unstructured environments poses many problems. To deal with these, the course then addresses visual attention, clustering, grouping, and segmentation, building on Gestalt principles, before proceeding to deal with object detection, object recognition, and object categorization in both 2D and 3D.


Learning Objectives

After completing this course, students should be able to:

  1. Apply their knowledge of image acquisition, image processing, and image analysis to extract useful information from visual images.
  2. Design, implement, and document appropriate, effective, and efficient software solutions for a variety of real-world computer vision problems.
  3. Exploit standard computer vision software libraries in the development of these solutions.


Course Content

Overview of human and computer vision.

OpenCV and software development tools for course work.

Optics, sensors, and image formation.

Image acquisition and image representation.

Image processing

  • Point & neighbourhood operations
  • Image filtering
  • Convolution
  • Fourier transform
  • Morphological operations
  • Geometric operations

Segmentation

  • Region-based approaches
  • Binary thresholding
  • Connected component analysis
  • Edge detection
  • Colour-based approaches and k-means clustering

Image features

  • Harris interest point operator
  • Difference of Gaussian interest point operators
  • SIFT feature descriptor

Object recognition

  • Template matching
  • Normalized cross-correlation
  • Chamfer matching
  • 2D shape features
  • Statistical pattern recognition
  • Hough transform for parametric curves: lines, circles, and ellipses
  • Generalized Hough transform and extension to codeword features
  • Colour histogram matching and back-projection
  • Haar features and boosted classifiers
  • Histogram of Oriented Gradients (HOG) feature descriptor

Video image processing

  • Background subtraction
  • Object tracking

3D vision

  • Homogeneous coordinates and transformations
  • Perspective transformation
  • Camera model and inverse perspective transformation
  • Stereo vision and epipolar geometry

Optical flow

Visual attention

  • Human attention
  • Saliency-based attention
  • Selective tuning
  • Top-down attention

Computer vision and machine learning


Lecture Notes

The lecture notes are listed below; each set of notes will be added in the week before the lecture is delivered.

Lecture 1: Overview of human and computer vision  
Lecture 2: OpenCV and software development tools for course work 
Lecture 3: Optics, sensors, and image formation
Lecture 4: Image acquisition and image representation 
Lecture 5: Image processing: point & neighbourhood operations, image filtering, convolution, Fourier transform 
Lecture 6: Image processing: morphological operations 
Lecture 7: Image processing: geometric operations 
Lecture 8: Segmentation: simple region-based approaches, binary thresholding, connected component analysis 
Lecture 9: Segmentation: boundary-based approaches; edge detection; boundary detection; snakes
Lecture 10: Segmentation: region-based approaches; colour-based approaches; k-means clustering
Lecture 11: Segmentation: region-based approaches; graph cuts
Lecture 12: Image features: Harris and Difference of Gaussian interest point operators 
Lecture 13: Image features: SIFT feature descriptor 
Lecture 14: Object recognition: template matching; normalized cross-correlation; chamfer matching 
Lecture 15: Object recognition: 2D shape features; statistical pattern recognition 
Lecture 16: Object recognition: Hough transform for parametric curves: lines, circles, and ellipses 
Lecture 17: Object recognition: generalized Hough transform; extension to codeword features 


Course Textbook

Szeliski, R. Computer Vision: Algorithms and Applications, Springer, 2010.


Recommended Reading

Borji, A. and Itti, L. (2013). "State-of-the-Art in Visual Attention Modeling", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 35, No. 1, pp. 185-207.

Dawson-Howe, K. (2014). A Practical Introduction to Computer Vision with OpenCV, Wiley.

Hanbury, A. (2002). "The Taming of the Hue, Saturation, and Brightness Colour Space", Proc. Computer Vision Winter Workshop (CVWW), Austria.

Kragic, D. and Vincze, M. (2010). "Vision for Robotics", Foundations and Trends in Robotics, Vol. 1, No. 1, pp. 1-78.

Vernon, D. (1991). Machine Vision: Automated Visual Inspection and Robot Vision, Prentice-Hall.


Software Development Environment

A step-by-step guide is available for downloading, installing, and using the software required to run the examples and complete the assignments.


Acknowledgments

The syllabus for this course drew inspiration from several sources, including the following.

  • Course VO 4.0 376.054 Machine Vision and Cognitive Robotics given by Markus Vincze, Michael Zillich, and Daniel Wolf at Technische Universität Wien.
  • Course 4BA10 Computer Vision given by David Vernon at Trinity College Dublin.
  • Course 4BA10 Computer Vision given by Kenneth Dawson-Howe at Trinity College Dublin.
  • Course on Computer Vision at VVV2017 by Francesca Odone, University of Genova.
