Convolutional neural networks for computer vision
5LSM0 (2021)
Introduction
Fueled by the exponential increase in computational power and the rapidly growing amount of available data, deep learning has emerged as a powerful alternative to traditional machine-learning methods. In computer vision especially, deep Convolutional Neural Networks (CNNs) have driven strong progress in the automatic recognition of image content. Tasks that were unimaginable only a decade ago are now relatively easy to implement with such CNNs. This course introduces this end-to-end machine-learning approach to the automatic interpretation of visual content, covering image classification, semantic segmentation, object detection, and more. The course program addresses both the theoretical underpinnings and the practical implementation of CNNs, thereby offering an essential toolset for computer vision scientists in a wide variety of application domains, ranging from medical imaging to surveillance.
Course details
The course starts with the data-driven approach to image classification and discusses several loss functions for quantifying and optimizing the classification result. Next, neural networks are introduced, together with their key concepts such as backpropagation, multi-layer perceptrons, and activation functions. In addition, the relation between convolutional filtering and convolutional networks is explained. Because the training process is crucial to the success of neural networks, all important aspects of training are covered, including initialization, dropout and batch normalization, update rules, data augmentation, and transfer learning. Recurrent Neural Networks (RNNs) are introduced for temporal modeling in tasks such as image captioning and language modeling. With this theoretical background, several application domains are explored, such as detection, segmentation, and visualization. Students become familiar with popular software tools such as PyTorch and with different network architectures through a number of practical exercises. The course ends with a brief outlook on promising emerging techniques in deep learning, such as generative models (variational auto-encoders and Generative Adversarial Networks) and deep reinforcement learning.
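As an illustration of the relation between convolutional filtering and convolutional networks, the sketch below applies a hand-designed 3x3 Sobel kernel to a tiny image using plain Python. The function name `conv2d_valid`, the kernel choice, and the toy image are assumptions made for this example and are not taken from the course material. A convolutional layer performs the same sliding-window operation, except that the kernel weights are learned from data rather than designed by hand.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1).

    Like the convolution layers in most deep learning frameworks, this
    computes a cross-correlation: the kernel is not flipped.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0.0
            # Weighted sum of the image patch under the kernel.
            for u in range(kh):
                for v in range(kw):
                    acc += image[i + u][j + v] * kernel[u][v]
            row.append(acc)
        out.append(row)
    return out


# Hand-designed Sobel kernel that responds to vertical edges (illustrative choice).
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]

# Toy 4x4 image: dark left half, bright right half (made-up data).
edge_image = [[0, 0, 1, 1] for _ in range(4)]
flat_image = [[1, 1, 1, 1] for _ in range(4)]

edge_response = conv2d_valid(edge_image, SOBEL_X)
flat_response = conv2d_valid(flat_image, SOBEL_X)
```

The kernel responds strongly wherever the window covers the vertical edge and gives zero response on the uniform image, which is the kind of feature detector a CNN can learn in its early layers.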
Aims:
- The student is able to explain the differences between conventional machine learning and deep (end-to-end) learning and list the most important aspects of both
- The student understands the mechanics and objectives of the basic building blocks of a CNN
- The student can summarize several popular network architectures and describe their specific advantages and disadvantages
- The student is able to implement CNNs for simple visual recognition tasks (e.g. handwritten digit classification)
- The student is able to estimate network performance using appropriate metrics and validation procedures
- The student can monitor the training behavior of a CNN and is able to explain typical phenomena such as overfitting and convergence, as well as the effect of the learning rate
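The last two aims, measuring performance and monitoring training behavior, can be sketched in a few lines of plain Python. The helper names `accuracy` and `overfitting_onset`, the `patience` parameter, and all numbers below are hypothetical and serve only as illustration; they are not part of the course material. The second helper flags the epoch after which the validation loss stops improving, a common sign of overfitting.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the ground-truth labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def overfitting_onset(val_losses, patience=2):
    """Return the epoch with the best validation loss once the loss has
    failed to improve for `patience` epochs after it; otherwise None."""
    best_epoch = 0
    for epoch, loss in enumerate(val_losses):
        if loss < val_losses[best_epoch]:
            best_epoch = epoch
        elif epoch - best_epoch >= patience:
            return best_epoch
    return None


# Made-up values for illustration only, not measured results.
toy_predictions = [1, 0, 1, 1]
toy_labels = [1, 0, 0, 1]
toy_val_losses = [0.9, 0.6, 0.5, 0.55, 0.6, 0.7]

toy_accuracy = accuracy(toy_predictions, toy_labels)
toy_onset = overfitting_onset(toy_val_losses)
```

In practice, the validation loss would be computed on data held out from training, and the training loss would be tracked alongside it: a training loss that keeps falling while the validation loss rises suggests overfitting.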
Lecturer:
- Dr. Fons van der Sommen
Schedule and locations:
Slides
- Module 01 - Introduction to computer vision
- Module 02 - Data-driven image classification
- Module 03 - Loss functions and optimization
- Module 04 - Neural networks and backpropagation
- Module 05 - Convolutional neural networks
- Module 06 - CNN architectures
- Module 07 - Training neural networks (part 1)
- Module 07 - Training neural networks (part 2)
- Module 08 - Supervised learning
- Module 09 - Unsupervised learning
- Module 10 - Beyond supervised learning
- Module 11 - Sequence modeling and reinforcement learning
- Module 12 - Visualization and understanding
- Module 13 - Efficient deep learning
- Summary