Best by the criterion "Usefulness of the course for your future career"

Best by the criterion "Usefulness of the course for broadening your horizons and all-round development"

Taught by: International Laboratory for Applied Network Research

Status: Elective course

When taught: 1st year, 3rd module

Instructor

Мальцева Дарья Васильевна

Abstract

Machine learning is grounded in statistical learning theory, which draws on statistics and functional analysis. The goal of the course is to study the properties of learning algorithms within a statistical framework. This study serves a twofold purpose: it provides strong guarantees for existing algorithms, and it suggests new algorithmic approaches that are potentially more powerful. The course covers the theory and methods of statistical learning in detail, in particular complexity regularization (i.e., how to choose the complexity of a model that must be learned from data). This issue is at the heart of the most successful and popular machine learning algorithms today and is critical to their success. This is an elective course, and it is taught using both R and Python.

Learning Objectives

The course gives students an important foundation for developing and conducting their own research and for evaluating the research of others.

Expected Learning Outcomes

Be able to apply the basic concepts from machine learning theory
Be able to correctly identify the type of machine learning problem at hand, e.g. classification, regression, or clustering
Be able to differentiate between supervised and unsupervised learning methods and understand their benefits and limitations
Be able to demonstrate a theoretical understanding of key supervised learning methods and to apply decision trees, linear regression, logistic regression, quantile regression, and regression variants for non-Gaussian distributions of the target variable
Be able to differentiate between and correctly apply the most common approaches to ensemble learning (random forests, gradient boosting, stacking, blending, etc.) and to explain their benefits and limitations
Be able to identify and tackle issues related to overfitting and model instability
Be able to apply basic tools and approaches for automated text processing and to incorporate text data into machine learning solutions
Be able to systematize and prioritize best practices in experiment tracking and sustainable ML development

Course Contents

Section 1. Introduction to classification and regression problems in machine learning.
Section 2. Model evaluation, key metrics for classification and regression.
Section 3. Text mining.
Section 4. Ensemble learning.
Section 5. Unsupervised learning.
Section 6. Association rules: theory and applications.
Section 7. Advanced regression analysis.
Section 8. Model explainability.
Section 9. Basics of machine learning development.

Assessment Elements

Graded quizzes
Mid-term homework
Final project

Interim Assessment

2024/2025 3rd module
0.5 * Final project + 0.3 * Graded quizzes + 0.2 * Mid-term homework
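
As a worked illustration of the weighting formula above, the following minimal Python sketch computes the interim grade from three component scores. The function name, the dictionary keys, and the example scores are hypothetical and chosen only for illustration; the syllabus does not specify the grading scale or any rounding rule, so none is applied here.

```python
# Weights come from the interim assessment formula; the key names are
# illustrative labels, not official identifiers.
WEIGHTS = {
    "final_project": 0.5,
    "graded_quizzes": 0.3,
    "midterm_homework": 0.2,
}


def interim_grade(scores):
    """Return the unrounded weighted sum of the three component scores."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, assumed to be on a common scale.
example = {"final_project": 8, "graded_quizzes": 6, "midterm_homework": 10}
print(f"{interim_grade(example):.2f}")  # 0.5*8 + 0.3*6 + 0.2*10 -> 7.80
```

Because the final project carries half of the total weight, a one-point change in the project score moves the interim grade by 0.5, compared with 0.3 for quizzes and 0.2 for the mid-term homework.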

Bibliography

Recommended Core Bibliography

Harman, G., & Kulkarni, S. (2007). Reliable Reasoning : Induction and Statistical Learning Theory. Cambridge, Mass: A Bradford Book. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=189264
Haroon, D. (2017). Python Machine Learning Case Studies : Five Case Studies for the Data Scientist. [Berkeley, CA]: Apress. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1623520
Kulkarni, S., & Harman, G. (2011). An Elementary Introduction to Statistical Learning Theory. Hoboken, N.J.: Wiley. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=391376

Recommended Additional Bibliography

Lantz, B. (2019). Machine Learning with R : Expert Techniques for Predictive Modeling (3rd ed.). Birmingham, UK: Packt Publishing. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=2106304
Murphy, K. P. (2012). Machine Learning : A Probabilistic Perspective. Cambridge, Mass: The MIT Press. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=480968
Ramasubramanian, K., & Singh, A. (2017). Machine Learning Using R. [Place of publication not identified]: Apress. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1402990
Sarkar, D., Bali, R., & Sharma, T. (2018). Practical Machine Learning with Python : A Problem-Solver’s Guide to Building Real-World Intelligent Systems. [United States]: Apress. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1667293

Authors

Pavlova Irina Anatolevna
Ващенко Василиса Андреевна

Master's programme "Data Analytics and Social Statistics"

Machine Learning