Bachelor 2022/2023

Machine Learning

Area of studies: Economics
When: 4th year, modules 1-4
Mode of studies: distance learning
Online hours: 64
Open to: students of one campus
Instructors: Margarita Vladimirovna Lazareva, Alexey Boldyrev, Maksim Karpov
Language: English
ECTS credits: 10
Contact hours: 112

Course Syllabus

Abstract

This course introduces students to the elements of machine learning, including supervised and unsupervised methods such as linear and logistic regression, splines, decision trees, support vector machines, bootstrapping, random forests, boosting, and regularized methods, as well as several topics in deep learning, such as artificial neural networks, recurrent neural networks, convolutional neural networks, transformers and attention mechanisms, and auto-encoders. In the first two modules (Sep-Dec), DSBA and ICEF students use the Python programming language and popular packages, such as pandas, scikit-learn and TensorFlow, to investigate and visualize datasets and to develop machine learning models that solve theoretical and data-driven problems. In the next two modules (Jan-Jun), DSBA and ICEF students dive deeper into mathematical, statistical, and algorithmic concepts and study deep neural networks. Throughout the course, students participate in Kaggle competitions in groups, a widespread format in data science, which aims to develop soft skills such as collaborative work, solving a research task, and meeting deadlines. Prerequisites: at least one semester each of calculus on the real line, vector calculus, linear algebra, probability and statistics, and computer programming in a high-level language such as Python.
Learning Objectives

  • The course aims to help students understand the process of learning from data, to familiarize them with a wide variety of algorithmic and model-based methods for extracting information from data, and to teach them to apply suitable methods to various datasets and to evaluate those methods through model selection and predictive performance evaluation.
Expected Learning Outcomes

  • Build and interpret data visualizations in the Python and R programming languages
  • Build features suitable for the selected machine learning models
  • Construct machine learning models on the proposed data sets in R
  • Evaluate performance of the models
  • Tune models to improve their prediction and classification performance
Course Contents

  • Math Essentials. Intro to Python in Google Colab
  • Intro to Statistical learning
  • Linear Regression (SLR) and K-Nearest Neighbors (KNN)
  • Classification with Logistic Regression, LDA, QDA, KNN
  • Resampling methods. CV, Bootstrap
  • Linear model selection & regularization
  • Non-linear regression
  • Decision Trees, Bagging, Random Forest, Boosting
  • Support Vector Machines/Classifiers
  • Clustering methods. PCA, k-Means, Hierarchical Clustering, DBSCAN
  • Artificial Neural Networks (ANN)
  • Convolutional Neural Networks (CNN)
  • Recurrent Neural Networks (RNN) and Long Short-Term Memory (LSTM) Networks
  • Transformer and Attention Layers
  • Reinforcement Learning
Assessment Elements

  • non-blocking Home assignments
    Home assignments. The grade for the current category is calculated as cumulative from the beginning of the course.
  • non-blocking Tests
    These are individualized, timed, (possibly) proctored and otherwise constrained tests designed to prevent cheating. In general, expect 60 questions in 60 minutes, some of which you may have seen in quizzes. Each test is assessed according to the marking scheme that comes with the test assignment. Each problem and its subparts are worth a certain number of points; these points sum to 100, which is the maximum grade for the test on the 100-point scale. The student is awarded the assigned number of points for a correct answer to each part of a question, and partial credit may also be awarded. The grade for the current category is calculated as cumulative from the beginning of the course.
  • non-blocking Quizzes
    The grade for the current category is calculated as cumulative from the beginning of the course.
  • non-blocking Participation
    The grade for the current category is calculated as cumulative from the beginning of the course.
  • non-blocking Exams
    These are individualized, timed, (possibly) proctored and otherwise constrained tests designed to prevent cheating. In general, expect 60 questions in 60 minutes, some of which you may have seen in quizzes. Each exam is assessed according to the marking scheme that comes with the exam assignment. Each problem and its subparts are worth a certain number of points; these points sum to 100, which is the maximum grade for the exam on the 100-point scale. The student is awarded the assigned number of points for a correct answer to each part of a question, and partial credit may also be awarded. The grade for the current category is calculated as cumulative from the beginning of the course.
Interim Assessment

  • 2022/2023 2nd module
    0.1 * Participation + 0.3 * Home assignments + 0.2 * Exams + 0.2 * Quizzes + 0.2 * Tests
  • 2022/2023 4th module
    0.1 * Participation + 0.2 * Quizzes + 0.2 * Exams + 0.2 * Tests + 0.3 * Home assignments
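As an illustration of how the weighted formula above combines the category grades, here is a minimal Python sketch. The weights are copied from the formula (they are the same in both modules and sum to 1). The function name `interim_grade`, the sample scores, and the assumption that every category is graded on the same scale are hypothetical and not part of the official grading procedure.

```python
# Weights taken from the interim assessment formula (same for modules 2 and 4).
WEIGHTS = {
    "Participation": 0.1,
    "Home assignments": 0.3,
    "Exams": 0.2,
    "Quizzes": 0.2,
    "Tests": 0.2,
}


def interim_grade(scores):
    """Return the weighted sum of category grades.

    `scores` maps each category name to a grade. This sketch assumes all
    categories use the same scale (an assumption, not stated in the syllabus).
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing categories: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical grades on a 100-point scale, for illustration only.
example_scores = {
    "Participation": 100,
    "Home assignments": 80,
    "Exams": 70,
    "Quizzes": 90,
    "Tests": 60,
}

print(f"Interim grade: {interim_grade(example_scores):.1f}")
```

Because home assignments carry the largest weight (0.3), a change in that category moves the interim grade more than the same change in any other single category.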
Bibliography

Recommended Core Bibliography

  • James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). Data for An Introduction to Statistical Learning with Applications in R (R package, version 1.0; maintainer: Trevor Hastie). Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsbas&AN=edsbas.28D80286

Recommended Additional Bibliography

  • Hastie, T., Tibshirani, R., & Friedman, J. H. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction (2nd ed., corrected 7th printing). New York: Springer. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=277008

Authors

  • BOLDYREV ALEKSEY SERGEEVICH
  • KARPOV MAKSIM EVGENEVICH