Postgraduate course
2020/2021
Discriminative Methods in Machine Learning
Type:
Elective course
Area of studies:
Computer and Information Sciences
Delivered by:
School of Data Analysis and Artificial Intelligence
Where:
Faculty of Computer Science
When:
2 year, 1 semester
Mode of studies:
offline
Instructors:
Attila Kertesz-Farkas
Language:
English
ECTS credits:
5
Contact hours:
36
Course Syllabus
Abstract
This course introduces the most popular discriminative and differentiable machine learning methods used in supervised learning. After completing the course, the PhD student should know modern discriminative methods such as deep convolutional learning techniques and kernel machines; understand the limitations of learning methods and standard concepts such as overfitting and regularization; be aware of ongoing developments in machine learning; have hands-on experience with large-scale machine learning problems; know how to design and develop machine learning programs in Python; and be able to think critically about real data.
Learning Objectives
- The learning objective of the course “Discriminative Methods in Machine Learning” is to provide students with advanced techniques and deeper theoretical and practical knowledge of modern discriminative learning methods, such as: logistic regression, support vector machines, regularization, neural networks, deep neural networks, limits on learning, deep learning techniques, Neural Turing Machines, performance evaluation techniques, and optimization algorithms.
Expected Learning Outcomes
- Students know the basics of classification and decision making, performance evaluation, and machine bias.
- Students know standard discriminative methods such as linear and logistic regression, neural networks, collaborative filtering, word embeddings, decision trees, and similarity-based inference.
- Students are introduced to similarity metrics and their underlying concepts.
- Students know techniques related to deep neural networks, such as convolutional layers and rectified linear units, and understand problems associated with deep neural networks, such as vanishing gradients.
- Students know discriminative methods for sequential data such as text.
- Students know differentiable systems that learn to operate memory access.
- Students know advanced methods for training and regularizing neural networks.
- Students are introduced to the theory of Machine Learning.
Course Contents
- Introduction to machine learning and evaluation techniques: Basic definitions of machine learning, principles and types of machine learning, performance metrics, errors and types of errors, ROC characteristics, and machine bias (a minimal ROC evaluation sketch follows this list).
- Basic methods: Regression, logistic regression, support vector machines, neural networks, collaborative filtering, k-nearest neighbors, decision trees, and random forests (a minimal logistic-regression sketch follows this list).
- Kernels and distance functions: Kernel functions for real-valued vectors and for discrete models. Distance functions, edit distance, and information distance. Curse of dimensionality (a minimal RBF-kernel sketch follows this list).
- Deep neural networks: Autoencoders, deep neural networks, stacked autoencoders, convolutional layers and max-pooling. Deep data vs. wide data, universal approximators, word embeddings.
- Methods for sequential data: Sequential data, recurrent neural networks, and long short-term memory models.
- Neural Turing Machines: Neural Turing Machines and their applications.
- Optimization and regularization: Error surfaces. Optimization methods: stochastic gradient descent, momentum methods, Polyak averaging, coordinate descent, adaptive learning rates, line search, AdaGrad, RMSProp. Second-order methods: Levenberg–Marquardt, Newton, conjugate gradients, Broyden–Fletcher–Goldfarb–Shanno. Regularization: parameter norm penalties, early stopping, data augmentation, sparse coding, mini-batch size vs. sharp minima, batch normalization (a minimal SGD-with-momentum sketch follows this list).
- Algorithm-independent machine learning and no-free-lunch theorems: Regularization, overfitting and underfitting, the bias–variance decomposition in model selection, model capacity, minimum description length, parameters and hyperparameters, and other problems such as missing values and class imbalance. Bootstrap and jackknife estimation. No-free-lunch theorems. Interpretability. Bias.
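To make the evaluation topic concrete, here is a minimal sketch of an ROC analysis on toy data; the labels and classifier scores are invented for illustration, and scikit-learn is assumed to be available (the syllabus does not mandate any particular library).

```python
# Minimal ROC/AUC illustration on toy data (labels and scores are invented).
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

y_true = np.array([0, 0, 1, 1, 0, 1, 1, 0])                     # ground-truth labels
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.5])   # classifier scores

fpr, tpr, thresholds = roc_curve(y_true, y_score)   # the ROC characteristic
auc = roc_auc_score(y_true, y_score)                # area under the curve

for f, t, th in zip(fpr, tpr, thresholds):
    print(f"threshold={th:.2f}  FPR={f:.2f}  TPR={t:.2f}")
print(f"AUC = {auc:.3f}")
```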
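The basic-methods topic can likewise be illustrated with logistic regression trained by batch gradient descent; this NumPy-only sketch uses a synthetic two-feature dataset, and the learning rate and iteration count are illustrative choices, not values prescribed by the course.

```python
# Logistic regression trained with batch gradient descent (NumPy only).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))                    # synthetic features
y = (X[:, 0] + 2 * X[:, 1] > 0).astype(float)    # linearly separable labels

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w, b = np.zeros(2), 0.0
lr = 0.1                                         # learning rate (illustrative)

for _ in range(500):
    p = sigmoid(X @ w + b)                       # predicted probabilities
    grad_w = X.T @ (p - y) / len(y)              # gradient of the log loss
    grad_b = np.mean(p - y)
    w -= lr * grad_w
    b -= lr * grad_b

accuracy = np.mean((sigmoid(X @ w + b) > 0.5) == y)
print(f"training accuracy: {accuracy:.2f}")
```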
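For the kernels topic, the sketch below computes the Gaussian (RBF) kernel matrix between real-valued vectors; the bandwidth parameter gamma and the sample points are arbitrary choices for the demonstration.

```python
# Gaussian (RBF) kernel matrix between real-valued vectors.
import numpy as np

def rbf_kernel(X, Y, gamma=0.5):
    # k(x, y) = exp(-gamma * ||x - y||^2), computed for all pairs at once
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2 * X @ Y.T
    )
    return np.exp(-gamma * sq_dists)

X = np.array([[0.0, 0.0], [1.0, 1.0], [3.0, 0.0]])
K = rbf_kernel(X, X)
print(np.round(K, 3))   # symmetric, with ones on the diagonal
```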
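Finally, for the optimization topic, this sketch runs gradient descent with momentum on a simple ill-conditioned quadratic error surface; Gaussian noise added to the gradient stands in for mini-batch sampling noise, and the step size and momentum coefficient are illustrative defaults.

```python
# Gradient descent with momentum on a quadratic error surface;
# noise on the gradient simulates stochastic (mini-batch) estimates.
import numpy as np

rng = np.random.default_rng(0)
A = np.diag([1.0, 25.0])   # ill-conditioned quadratic: f(x) = 0.5 * x^T A x

def noisy_grad(theta):
    return A @ theta + rng.normal(scale=0.5, size=theta.shape)

theta = np.array([5.0, 5.0])       # starting point
velocity = np.zeros_like(theta)
lr, momentum = 0.02, 0.9           # illustrative hyperparameters

for step in range(300):
    velocity = momentum * velocity - lr * noisy_grad(theta)
    theta = theta + velocity

print("theta after 300 steps:", theta)   # should be close to the minimum at 0
```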
Assessment Elements
- Presence
- Exam: Written exam. Preparation time – 180 min.
The final exam will consist of equally weighted problems. No materials are allowed during the exam. Each question will focus on a particular topic presented in the lectures.
The questions consist of exercises on any topic covered in the lectures. To prepare for the final exam, PhD students must be able to answer questions on the topics covered in the lectures.
Bibliography
Recommended Core Bibliography
- James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An Introduction to Statistical Learning: with Applications in R. New York: Springer. 426 pp.
Recommended Additional Bibliography
- Wainwright, M. J., & Jordan, M. I. (2008). Graphical Models, Exponential Families, and Variational Inference. Boston: Now Publishers. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=nlebk&AN=352768