5LSM0 - Convolutional neural networks for computer vision

Fueled by the exponential increase in computational power and the rapidly growing amount of available data, deep learning has emerged as a powerful alternative to traditional machine learning methods. Especially in the field of computer vision, strong progress has been made on the automatic recognition of image content using deep Convolutional Neural Networks (CNNs). Tasks that were unimaginable only a decade ago are now relatively easy to implement using such CNNs. This course introduces this end-to-end machine learning approach for the automatic interpretation of visual content: image classification, semantic segmentation, object detection and more. The course program addresses both the theoretical underpinnings and the practical implementation of convolutional neural networks, thereby offering an essential toolset for computer vision scientists in a wide variety of application domains, ranging from medical imaging to surveillance.

General course information
The course starts with the data-driven approach to image classification and discusses several loss functions for quantifying and optimizing the classification result. Next, neural networks are introduced, addressing their key techniques such as backpropagation, multi-layer perceptrons and activation functions. In addition, the relation between convolutional filtering and convolutional networks is explained. As the training process of neural networks is crucial to their success, all important aspects of training a network are elaborated on, including initialization, dropout and batch normalization, update rules, data augmentation and transfer learning. Recurrent Neural Networks (RNNs) are introduced for temporal modeling, e.g. for image captioning and language modeling. With this theoretical background, several application domains are explored, such as detection, segmentation and visualization. Familiarity with popular software tools such as PyTorch and with different network architectures is acquired by means of a number of practical exercises. The course ends with a brief outlook on promising emerging techniques for deep learning, such as generative models (variational auto-encoders and Generative Adversarial Networks (GANs)) and deep reinforcement learning.
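The relation between convolutional filtering and convolutional networks mentioned above can be illustrated with a minimal sketch. It is not course material: the function name conv2d_valid, the toy image and the choice of a Sobel kernel are all assumptions made for illustration. The sketch applies a hand-designed filter by sliding it over an image; a convolutional layer performs the same sliding weighted sum, except that the kernel weights are learned from data rather than designed by hand.

```python
# Minimal sketch of "valid" 2D filtering (cross-correlation, no padding),
# the operation at the core of a convolutional layer.
# All names below (conv2d_valid, SOBEL_X, image) are illustrative assumptions.

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (both lists of lists) without padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Weighted sum of the image patch currently under the kernel.
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        output.append(row)
    return output

# Hand-designed horizontal-gradient filter (Sobel).
# In a CNN these nine weights would be learned during training.
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]

# Toy 4x5 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1],
         [0, 0, 1, 1, 1],
         [0, 0, 1, 1, 1],
         [0, 0, 1, 1, 1]]

response = conv2d_valid(image, SOBEL_X)
print(response)  # [[4, 4, 0], [4, 4, 0]]
```

The response is large where the kernel overlaps the edge and zero in the flat region, which is the behavior a learned edge-detecting filter in an early CNN layer would also show.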


  • The student is able to explain the differences between conventional machine learning and deep (end-to-end) learning and list the most important aspects of both.
  • The student understands the mechanics and objectives of the basic building blocks of a convolutional neural network.
  • The student can summarize several popular network architectures and describe their specific advantages and disadvantages.
  • The student is able to implement CNNs for simple visual recognition tasks (e.g. handwritten digit classification).
  • The student is able to estimate the network performance with appropriate metrics and validation.
  • The student can monitor the training behavior of a CNN and is able to explain typical phenomena such as overfitting, convergence and the influence of the learning rate.
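As a rough illustration of the last two objectives, the sketch below computes a simple performance metric and flags overfitting from a validation-loss curve. It is only a sketch under stated assumptions: the helper names accuracy and early_stopping_epoch, the patience parameter and all numbers are hypothetical toy values, not results from the course.

```python
# Minimal sketch: a performance metric and a simple overfitting check.
# Function names, the `patience` parameter and all numbers are hypothetical.

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the ground-truth labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

def early_stopping_epoch(val_losses, patience=2):
    """Return the epoch with the best validation loss once that loss has
    failed to improve for `patience` consecutive epochs, else None."""
    best_epoch = 0
    for epoch, loss in enumerate(val_losses):
        if loss < val_losses[best_epoch]:
            best_epoch = epoch
        elif epoch - best_epoch >= patience:
            return best_epoch
    return None

# Toy digit labels and predictions (made up for illustration).
labels = [3, 1, 4, 1]
predictions = [3, 1, 4, 2]
print(accuracy(labels, predictions))  # 0.75

# Toy validation losses per epoch: improving at first, then rising again,
# which is the typical signature of overfitting.
val_losses = [0.9, 0.7, 0.6, 0.65, 0.7, 0.8]
print(early_stopping_epoch(val_losses))  # 2
```

Evaluating on held-out validation data rather than on the training set is what makes the rising loss visible; the training loss alone would typically keep decreasing.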
Module 01: Introduction to computer vision
Module 02: Data-driven image classification
Module 03: Loss functions and optimization
Module 04: Neural networks and backpropagation
Module 05: Convolutional neural networks
Module 06: CNN architectures
Module 08: Supervised learning
Module 09: Unsupervised learning
Module 10: Semi-supervised learning
Module 11: Sequence modeling and reinforcement learning
Module 12: Visualization and understanding
Module 13: Efficient deep learning