Master
2022/2023
Large Scale Machine Learning 2
Type:
Elective course
Area of studies:
Applied Mathematics and Informatics
Delivered by:
Big Data and Information Retrieval School
Where:
Faculty of Computer Science
When:
2nd year, 2nd module
Mode of studies:
distance learning
Online hours:
82
Open to:
students of one campus
Instructors:
Anatoly Bardukov
Master’s programme:
Master of Data Science (extramural)
Language:
English
ECTS credits:
4
Contact hours:
8
Course Syllabus
Abstract
This course focuses on the future of ML engineering. It starts with big data problems and the application of classic models, introduces approaches for text (NLP) and other data types (images, etc.), and ends with the field of ML operations. The final project requires you to demonstrate the full ML workflow cycle, including data collection, training, and deployment.
To complete the course, students are expected to have skills in classical algorithms and data structures, the main concepts of machine learning, and Python programming.
Learning Objectives
After taking this course, students should be able to:
- work with large and high-dimensional datasets,
- work with text data,
- use strategies for parallelizing neural network training,
- use different approaches to model optimization,
- plan model deployment under different scenarios.
Expected Learning Outcomes
- Understand how to prepare big data for training classic models
- Prepare big text data; understand word2vec models
- Distributed training of neural networks, transfer learning
- Understand knowledge distillation, neural network pruning, and quantization
- Dockerization of models
- Get familiar with MLflow
Course Contents
- Big Data Problems and Classic Models
- Text Models for Big Data
- Neural Network Models
- Model Optimization
- Machine Learning Models Deployment
- End-to-end Production pipeline for Machine Learning Models
Bibliography
Recommended Core Bibliography
- Hastie, T., Tibshirani, R., & Friedman, J. H. (2009). The Elements of Statistical Learning : Data Mining, Inference, and Prediction (Vol. Second edition, corrected 7th printing). New York: Springer. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=277008
Recommended Additional Bibliography
- Murphy, K. P. (2012). Machine Learning : A Probabilistic Perspective. Cambridge, Mass: The MIT Press. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=480968