Master 2023/2024

Machine Learning and Data Mining

Type: Compulsory course (Data Science)
Area of studies: Applied Mathematics and Informatics
When: 2nd year, modules 1 and 2
Mode of studies: offline
Open to: students of all HSE University campuses
Master’s programme: Data Science
Language: English
ECTS credits: 6
Contact hours: 54

Course Syllabus

Abstract

The course "Machine Learning and Data Mining" introduces students to a new and actively evolving interdisciplinary field of modern data analysis. Having started as a branch of Artificial Intelligence, it attracted the attention of physicists, computer scientists, economists, computational biologists, linguists and others, and has become a truly interdisciplinary field of study. In spite of the variety of data sources that can be analyzed, the objects and attributes that form a particular dataset possess common statistical and structural properties. The interplay between known and unknown data gives rise to complex pattern structures and machine learning methods that are the focus of the study. The course covers methods of Machine Learning and Data Mining. Special attention is given to hands-on practical analysis of real-world datasets using available software tools and modern programming languages and libraries.

Learning Objectives

  • To familiarize students with the new, rapidly evolving field of machine learning and data mining, and to provide practical knowledge and experience in the analysis of real-world data.

Expected Learning Outcomes

  • Students derive the bias-variance decomposition for MSE and “0-1” losses, and show how regularization affects the tradeoff.
  • Students explain and apply black-box optimization techniques.
  • Students explain the main approaches to probabilistic graphical models and how such models are trained.
  • Students explain the relation between linear models and deep neural networks, describe how neural networks are trained, and understand the role of a data scientist in designing a deep learning solution to a machine learning problem.
  • Students are familiar with meta-learning approaches.

Course Contents

  • Introduction to Machine Learning and Data Mining, No-Free-Lunch theorems
  • Bias-variance decomposition, regularization techniques
  • Introduction to meta-algorithms, bootstrap, boosting
  • Introduction and overview of deep learning methods
  • Deep generative models: Generative Adversarial Networks (GANs)
  • Optimization techniques: black-box methods, first order methods
  • Miscellaneous topics: imbalanced datasets, importance sampling, one-class classification methods
  • Deep generative models: energy-based models, Boltzmann machines and deep belief networks
  • Deep generative models: Variational AutoEncoders
  • Meta-learning: concept learning, learning how to learn

Assessment Elements

  • non-blocking Homeworks
  • non-blocking Exam

Interim Assessment

  • 2023/2024 2nd module
    0.5 * Exam + 0.5 * Homeworks

Bibliography

Recommended Core Bibliography

  • Hall, M., Witten, Ian H., Frank, E. Data Mining: practical machine learning tools and techniques. – 2011. – 664 pp.
  • Han, J., Kamber, M., Pei, J. Data Mining: Concepts and Techniques, Third Edition. – Morgan Kaufmann Publishers, 2011. – 740 pp.
  • Hastie, T., Tibshirani, R., Friedman, J. The elements of statistical learning: Data Mining, Inference, and Prediction. – Springer, 2009. – 745 pp.

Recommended Additional Bibliography

  • Mirkin, B. Core concepts in data analysis: summarization, correlation and visualization. – Springer Science & Business Media, 2011. – 388 pp.

Authors

  • ROGACHEV ALEKSANDR IGOREVICH
  • AL-MAEENI ABDALAZIZ RASHID KHALID
  • ANTROPOVA LARISA IVANOVNA