
Master's Programme "Data Analytics and Applied Statistics / Data Analytics and Social Statistics"

Machine Learning

Academic year: 2024/2025
Language of instruction: English (ENG)
Credits: 3
Best course by the criterion "Usefulness of the course for your future career"
Best course by the criterion "Usefulness of the course for broadening your horizons and all-round development"
Status: Elective course
Delivered: 1st year, module 3

Instructor

Course Syllabus

Abstract

Machine learning rests on statistical learning theory, which draws on statistics and functional analysis. The goal of the course is to study the properties of learning algorithms within a statistical framework. This study serves a two-fold purpose: on the one hand, it provides strong guarantees for existing algorithms; on the other, it suggests new algorithmic approaches that are potentially more powerful. The course covers the theory and methods of statistical learning in detail, in particular complexity regularization (i.e., how to choose the complexity of a model that has to be learned from data). This issue is at the heart of the most successful and popular machine learning algorithms today and is critical to their success. The course is an elective and is taught using both R and Python.
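The question of choosing model complexity from data can be illustrated with a small sketch. It is not taken from the course materials: the synthetic sine data, the k-nearest-neighbour regressor, and the helper names (`make_data`, `knn_predict`, `mse`) are all assumptions made for illustration. The number of neighbours k plays the role of the complexity parameter, and it is chosen by error on a held-out validation set rather than on the training set.

```python
import math
import random


def make_data(n, rng):
    """Draw n noisy samples of y = sin(x) on [0, 2*pi] (synthetic, illustrative data)."""
    data = []
    for _ in range(n):
        x = rng.uniform(0.0, 2.0 * math.pi)
        y = math.sin(x) + rng.gauss(0.0, 0.3)
        data.append((x, y))
    return data


def knn_predict(train, x, k):
    """Predict y at x as the mean target of the k training points closest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / k


def mse(train, points, k):
    """Mean squared error of the k-NN regressor fitted on `train`, evaluated on `points`."""
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in points) / len(points)


rng = random.Random(0)
train = make_data(80, rng)
valid = make_data(80, rng)

# Small k gives a flexible (complex) model; k = len(train) predicts the training mean everywhere.
candidates = [1, 3, 5, 9, 15, 25, 40, 80]
train_err = {k: mse(train, train, k) for k in candidates}
valid_err = {k: mse(train, valid, k) for k in candidates}

# Select the complexity by validation error, not by training error.
best_k = min(candidates, key=lambda k: valid_err[k])

for k in candidates:
    print(f"k={k:2d}  train MSE={train_err[k]:.3f}  validation MSE={valid_err[k]:.3f}")
print("selected k:", best_k)
```

With k = 1 every training point is its own nearest neighbour, so the training error is zero, yet this most flexible model is typically not the one with the lowest validation error. Selecting k on held-out data is one simple way to trade fit against complexity.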
Learning Objectives

  • The course gives students a foundation for developing and conducting their own research and for evaluating the research of others.
Expected Learning Outcomes

  • Be able to apply the basic concepts from machine learning theory
  • Be able to correctly identify the type of machine learning problem at hand, e.g. classification, regression, clustering
  • Be able to differentiate between supervised and unsupervised learning methods and understand their benefits and limitations
  • Be able to demonstrate a theoretical understanding of key supervised learning methods and apply decision trees, linear regression, logistic regression, quantile regression, and variations of regression for non-Gaussian distributions of the target variable
  • Be able to differentiate and correctly apply the most common approaches to ensemble learning (random forests, gradient boosting, stacking, blending, etc.) as well as to explain their benefits and limitations
  • Be able to identify and tackle issues related to overfitting and model instability
  • Be able to apply basic tools and approaches to automated text processing as well as to incorporate text data into machine learning solutions
  • Be able to systematize and prioritize best practices in experiment tracking and sustainable ML development
Course Contents

  • Section 1. Introduction to classification and regression problems in machine learning.
  • Section 2. Model evaluation, key metrics for classification and regression.
  • Section 3. Text mining.
  • Section 4. Ensemble learning.
  • Section 5. Unsupervised learning.
  • Section 6. Association rules: theory and applications.
  • Section 7. Advanced regression analysis.
  • Section 8. Model explainability.
  • Section 9. Basics of machine learning development.
Assessment Elements

  • non-blocking Graded quizzes
  • non-blocking Mid-term homework
  • non-blocking Final project
Interim Assessment

  • 2024/2025 3rd module
    0.5 * Final project + 0.3 * Graded quizzes + 0.2 * Mid-term homework
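As a worked example of the formula above, the following minimal sketch computes the weighted interim grade. The weights come from the syllabus; the function name `interim_grade`, the dictionary keys, and the sample scores are hypothetical and chosen only for illustration. Any rounding rule applied to the final grade is not stated in the syllabus and is not modelled here.

```python
# Weights taken from the interim assessment formula in the syllabus.
WEIGHTS = {
    "final_project": 0.5,
    "graded_quizzes": 0.3,
    "midterm_homework": 0.2,
}


def interim_grade(scores):
    """Return the weighted sum of the three assessment elements.

    `scores` maps each key in WEIGHTS to a numeric score (hypothetical inputs).
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example = {"final_project": 8, "graded_quizzes": 6, "midterm_homework": 10}
print(round(interim_grade(example), 2))  # 0.5*8 + 0.3*6 + 0.2*10 = 7.8
```

Because the final project carries half of the weight, a one-point change in the project score moves the interim grade by 0.5, compared with 0.3 for the quizzes and 0.2 for the mid-term homework.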
Bibliography

Recommended Core Bibliography

  • Harman, G., & Kulkarni, S. (2007). Reliable Reasoning : Induction and Statistical Learning Theory. Cambridge, Mass: A Bradford Book. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=189264
  • Haroon, D. (2017). Python Machine Learning Case Studies : Five Case Studies for the Data Scientist. [Berkeley, CA]: Apress. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1623520
  • Kulkarni, S., & Harman, G. (2011). An Elementary Introduction to Statistical Learning Theory. Hoboken, N.J.: Wiley. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=391376

Recommended Additional Bibliography

  • Lantz, B. (2019). Machine Learning with R : Expert Techniques for Predictive Modeling (3rd ed.). Birmingham, UK: Packt Publishing. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=2106304
  • Murphy, K. P. (2012). Machine Learning : A Probabilistic Perspective. Cambridge, Mass: The MIT Press. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=480968
  • Ramasubramanian, K., & Singh, A. (2017). Machine Learning Using R. [Place of publication not identified]: Apress. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1402990
  • Sarkar, D., Bali, R., & Sharma, T. (2018). Practical Machine Learning with Python : A Problem-Solver’s Guide to Building Real-World Intelligent Systems. [United States]: Apress. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1667293

Authors

  • Pavlova Irina Anatolevna
  • Vashchenko Vasilisa Andreevna