Statistical Learning Theory

Bachelor 2023/2024

Category 'Best Course for Broadening Horizons and Diversity of Knowledge and Skills'

Category 'Best Course for New Knowledge and Skills'

Type: Elective course (Applied Mathematics and Information Science)

Area of studies: Applied Mathematics and Information Science

Delivered by: Big Data and Information Retrieval School

Where: Faculty of Computer Science

When: 3 year, 1, 2 module

Mode of studies: offline

Open to: students of all HSE University campuses

Instructors: Bruno Frederik Bauwens, Artur Goldman, Maxim Kaledin

Language: English

ECTS credits: 5

Contact hours: 56

Full Syllabus Ask Question

Abstract

This course studies mathematical explanations for the learning ability of important machine learning algorithms such as support vector machines, AdaBoost, and over parameterized neural nets. The course has 3 parts. The first is about "online learning". Important ideas and techniques are introduced in a simpler probability free setting, such as margins and bias-complexity trade-off. In the lectures about multi-armed bandits, probability theory is reviewed, (also this is important in reinforcement learning). The 2nd part, present the main definitions (VC-dimension and Rademacher complexity), and prove the main risk bounds, (using various measure concentration results). In the 3rd part, the theory is used to derive margin risk bounds. This is applied to SVM-classification, AdaBoost, and implicit regularization in over parameterized neural nets. Neural tangent kernels are introduced and the connection with kernel SVM is explained. See HTTP://wiki.cs.hse.ru/Statistical_learning_theory_2023 for the full lecture notes.

Learning Objectives

Understand theoretical explanations for the success of drop-out regularization in neural nets.
Know various measures of model capacity, such as VC-dimension and Demarche complexity. Use them to derive risk bounds and bounds on sample complexity. Both in the realizable and the agnostic setting.
Apply important results from measure concentration, such as the Chernozem bound, Holding's bound, and the bounded differences inequality.
Prove upper and lower bounds on the number of mistakes made by basic algorithms in online learning, such as the weighted majority algorithm, perception algorithms. Similar for learning with expert advice and multi-armed bandits.

Expected Learning Outcomes

Understand theoretical explanations of the success of (kernel)- SVM and AdaBoost using margin risk bounds. Be able to prove positive definiteness of kernels. Understand which kernels allow for good risk bounds.
Optimize hyper parameters using risk bounds (instead of using a validation set). We do this for a few boosting algorithms
Know what is the double-descent phenomena when using stochastic gradient descent in neural nets. Understand why overparameterization leads to implicit regularization

Course Contents

Philosophy
The online mistake model
The agnostic online mistake model.
Perception algorithm
Prediction with expert advice
Multi-armed bandits
Sample complexity
VC-dimension
Risk bounds using VC-dimension
Measure concentration
Risk bounds using Rademacher complexity
Margin risk bounds for SVM
Kernels
AdaBoost
Implicit regularization of stochastic gradient descent in wide neural nets
Double descent phenomena phenomena
Evolution of neural tangent kernels during training

Assessment Elements

Colloquium
In December. The student receives 2 questions from a list of questions, given 2 weeks in advance. Answers are prepared in writing and the teacher checks understanding of the answers by asking for details or to work out simple examples. Students may not consult anything
Homework
Every seminar list contains 2 homework questions similar as the questions in the seminar. Every 14 days, 4 questions need to be submitted. Sometimes the HW contains bonus marks, they are added to the score of the exam
Examination
During session in December. A written examination is held past Module 1. Students may use the major textbooks and the lecture notes given in the wiki. The exam happens in a computer room. There are 4 questions similar to the one of the seminars and the homework.
Quizzes

Interim Assessment

2023/2024 2nd module
0.35 * Colloquium + 0.3 * Examination + 0.35 * Homework + 0 * Quizzes

Bibliography

Recommended Core Bibliography

Foundations of machine learning, Mohri, M., 2012

Recommended Additional Bibliography

Introduction to machine learning, Alpaydin, E., 2020

Authors

LUKIANENKO NIKITA SERGEEVICH
BAUVENS Bruno Frederik L
PODOLSKIY VLADIMIR VLADIMIROVICH

Course Syllabus