2020/2021
Machine Learning in Python
Status: Elective
When offered: Modules 1 and 2
Audience: all HSE University campuses
Language: English
Credits: 4
Contact hours: 56
Course Syllabus
Abstract
This course introduces students to the elements of machine learning, including supervised and unsupervised methods such as linear and logistic regression, splines, decision trees, support vector machines, bootstrapping, random forests, boosting, regularized methods, and more. Over two modules (September–December 2020), the course uses the Python programming language and popular packages to investigate and visualize datasets and to develop machine learning models.
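To give a flavour of this workflow, here is a minimal, unofficial sketch of a supervised-learning pipeline in Python. The scikit-learn library and its built-in diabetes dataset are illustrative assumptions and are not prescribed by the syllabus.

```python
# Minimal sketch (illustrative only): load a dataset, fit a linear regression,
# and evaluate predictive performance on held-out data.
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

model = LinearRegression().fit(X_train, y_train)
print("Test MSE:", mean_squared_error(y_test, model.predict(X_test)))
```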
Learning Objectives
- Develop an understanding of the process of learning from data
- Familiarize students with a wide variety of algorithmic and model-based methods for extracting information from data
- Teach students to apply suitable methods to various datasets and to evaluate them through model selection and predictive performance assessment
Course Contents
- Linear Regression
- Intro to Statistical learning
- k-Nearest Neighbors
- Classification: logistic regression
- Classification: LDA, QDA, KNN
- Resampling methods. CV, Bootstrap
- Linear model selection & regularization
- Non-linear regression
- Non-linear regression, part 2
- Decision Trees
- Bagging, Random Forest, Boosting
- Support Vector Machines/Classifiers
- Clustering methods. PCA, k-Means, HC
- Special Topics: tSNE, UMAP, Neural Networks
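Most of the topics above correspond to standard estimators in popular Python packages. The following unofficial sketch (the breast-cancer dataset and the particular hyperparameters are assumptions) compares a few of the listed classifiers with 5-fold cross-validation, foreshadowing the resampling and model-selection material:

```python
# Illustrative sketch: compare several classifiers from the course outline
# using 5-fold cross-validated accuracy (not part of the official materials).
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
models = {
    "logistic regression": LogisticRegression(max_iter=5000),
    "k-nearest neighbors": KNeighborsClassifier(n_neighbors=5),
    "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "support vector machine": SVC(kernel="rbf"),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)  # 5-fold CV accuracy
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```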
Assessment Elements
- Homework (HW): Weekly graded homework assignments, which include analysis of datasets, analytical and conceptual problems, and programming exercises. These are to be completed individually.
- In-Canvas Quizzes (Q): Quizzes are based on lectures, slides, and textbooks. Answers can be submitted only once and cannot be viewed afterwards, so check them carefully before submitting. Questions are shuffled and sampled for each student, so different students will likely see different questions.
- Participation (P): This includes active participation in the course, answering your peers' questions on the Piazza forum, and attendance at seminars and lectures. Redundant or uninformative posts (made just to generate traffic) may lower the participation grade, so please leave meaningful questions and comments. Participation is tracked via Zoom and Piazza.
- Exams (E1 + E2): There will be an exam at the end of each of the two modules. Examination locations are TBD. An in-class exam is closed book: no notes, calculators, or phones. A take-home exam is open book and open internet, but collaboration is not allowed. Exam questions differ from homework questions: homework deepens your understanding, while the exams measure it. Each exam is cumulative. Do not book travel that conflicts with the exam dates. Automatic grading policy for Exam 2: if your cumulative grade as of the date of Exam 2 (G2E1) is ≥ 95% and your Exam 1 grade is ≥ 95%, then G2E1 is used as your Exam 2 grade.
Interim Assessment
- Interim assessment (Module 2): 0.5 * Exams (E1 + E2) + 0.35 * Homework (HW) + 0.1 * In-Canvas Quizzes (Q) + 0.05 * Participation (P)
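For concreteness, the grading formula and the Exam 2 auto-grade policy can be expressed as a short Python helper. This is an unofficial sketch; in particular, it assumes that E1 and E2 are averaged with equal weight inside the exam component, which the syllabus does not state explicitly.

```python
def interim_grade(e1, e2, hw, q, p, g2e1=None):
    """Interim assessment on a 0-100 scale (unofficial sketch).

    e1, e2 -- exam grades; hw -- homework; q -- quizzes; p -- participation;
    g2e1   -- cumulative grade as of the date of Exam 2 (optional).
    """
    # Automatic grading policy for Exam 2: if both G2E1 and the Exam 1 grade
    # are >= 95%, G2E1 is used as the Exam 2 grade.
    if g2e1 is not None and g2e1 >= 95 and e1 >= 95:
        e2 = g2e1
    # Assumption: E1 and E2 contribute equally within the exam component.
    exams = (e1 + e2) / 2
    return 0.5 * exams + 0.35 * hw + 0.1 * q + 0.05 * p

print(interim_grade(e1=96, e2=0, hw=90, q=85, p=100, g2e1=97))  # -> 93.25
```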
Bibliography
Recommended Core Bibliography
- Gareth James, Daniela Witten, Trevor Hastie, & Robert Tibshirani. (2013). An Introduction to Statistical Learning: With Applications in R. Springer.
- Hastie, T., Tibshirani, R., & Friedman, J. H. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction (2nd ed., corrected 7th printing). New York: Springer. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=277008
Recommended Additional Bibliography
- Mehryar Mohri, Afshin Rostamizadeh, & Ameet Talwalkar. (2018). Foundations of Machine Learning, Second Edition. The MIT Press.