5LSH0 - Computer Vision and 3D Image Processing (2018)

Goal:
Learn the main components of video content analysis, such as feature detection and description, object segmentation, object detection and tracking. Learn advanced object classification techniques based on Deep Learning. Learn the basics of 3D multi-view geometry, 3D sensing principles and 3D model reconstruction. Learn the practical aspects of implementing the above-described methods by programming (C++, Python, TensorFlow) a surveillance application, using a UAV drone as the data capturing device.

Content:
The course content is broadly divided into two areas.

  1. Techniques for object detection and recognition, and for feature extraction and analysis, such as SIFT and HOG. Semantic-level processing for understanding events and scenes, including human behavior. Techniques for understanding objects and events, using clustering and classification algorithms such as K-means and SVM (support vector machine), leading into the basics of learning with neural networks. This part gradually builds up to the fundamentals and practical applications of Deep Learning.
  2. 3D processing based on the pinhole camera model, multi-view processing and calibration. Also registration of 3D datasets, 3D reconstruction with TSDF (truncated signed distance function) models, an introduction to SLAM, RGB-Depth processing, and specific tools and algorithms such as g2o and bundle adjustment. Finally, the 3D processing modules end with plane/object segmentation in 3D.
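
As an illustration of the pinhole camera model mentioned in area 2, the sketch below projects a 3D world point to pixel coordinates. It is a minimal example and not taken from the course material: the function name project_point, the single focal length in pixels, and the sample values (800 px focal length, principal point at (320, 240)) are assumptions chosen for illustration, and lens distortion is ignored.

```python
def project_point(point_world, focal_px, principal_point, rotation, translation):
    """Project a 3D world point to pixel coordinates with an ideal pinhole camera.

    All names and parameters here are illustrative assumptions:
    - rotation is a 3x3 matrix (list of rows), translation a 3-vector;
      together they map world coordinates to camera coordinates.
    - focal_px is one focal length in pixels (square pixels, no skew).
    - principal_point is (cx, cy) in pixels.
    """
    # Rigid transform: X_cam = R * X_world + t
    x_cam, y_cam, z_cam = (
        sum(r * p for r, p in zip(row, point_world)) + t
        for row, t in zip(rotation, translation)
    )
    if z_cam <= 0:
        raise ValueError("point is behind the camera or on the image plane")

    # Perspective division followed by the intrinsic mapping to pixels
    cx, cy = principal_point
    u = focal_px * x_cam / z_cam + cx
    v = focal_px * y_cam / z_cam + cy
    return (u, v)


# Camera placed at the world origin, looking along the +Z axis
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

if __name__ == "__main__":
    pixel = project_point((0.5, -0.25, 2.0), 800.0, (320.0, 240.0),
                          IDENTITY, (0.0, 0.0, 0.0))
    print(pixel)  # (520.0, 140.0)
```

A point on the optical axis projects to the principal point, which is a quick sanity check for any implementation of this model.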

The programming assignments apply the knowledge and algorithms (or parts of them) to give students a framework for experimenting with video content understanding and 3D image-based modeling for surveillance applications.
The assignments are based on C++ / Python / TensorFlow programming.

Prior knowledge:
Advised: 5LSE0 Multimedia video coding and architectures

Schedule and location: 
All lectures take place in a single week at the end of August; about 40% of the time is dedicated to computer exercises.
Full schedule: 
Date Time Room
27 aug 09.30-17.30 Flux 1.07
28 aug 09.30-17.30 Flux 1.07
29 aug 09.30-17.30 Flux 1.07
30 aug 09.30-17.30 Flux 1.07
31 aug 09.30-17.30 Flux 1.07
Slides: 
Module 01: Introduction to camera projection matrix and different sensor data modalities
Module 02: 3D reconstruction, data fusion and SLAM techniques
Module 03: Visual feature extraction
Module 04: Motion analysis and estimation
Module 05: Object-level content analysis - segmentation
Module 06: Object-level content analysis - tracking
Module 07: Data clustering and Classification
Module 08: Introduction to Convolutional Neural Networks
Module 09: Applying Deep Learning in practice
Module 10: Analysis applications (surveillance via UAV drone)