Master's programme
2024/2025
Machine Learning
Status:
Compulsory course (Modern Social Analysis)
Field of study:
39.04.01. Sociology
Delivered by:
Department of Sociology
Where:
St. Petersburg School of Social Sciences
When:
Year 2, Module 2
Mode of study:
without an online course
Open to:
all HSE University campuses
Instructors:
Alexander Vladimirovich Sirotkin
Programme:
Modern Social Analysis
Language:
English
Credits:
3
Course Syllabus
Abstract
The rapid growth of social networking sites, online media, and other sources of internet-generated data is making machine learning an essential analytical tool for social scientists and industry analysts of social data. Social researchers should now be able not only to work with different types of data, such as textual or relational data, but also to interpret results obtained with complex mathematical algorithms. In this course, students will first get to know basic machine learning algorithms and their main advantages and limitations for social science goals. Second, they will gain skills in working with machine learning software and code. Third, by the end of the course each student will produce a small-scale research project that may be used in their Master's thesis. The course focuses on applying machine learning algorithms in Python with Jupyter Notebook.
Learning Objectives
- Learn basic machine learning algorithms and their main advantages and limitations for social science goals
- Gain skills in working with machine learning software and code
- Be able to work with different types of data, such as textual or relational data
Expected Learning Outcomes
- Analyze data with machine learning tools
- Analyze textual and numerical data
- Do textual preprocessing (lemmatization and tokenization)
- Present the resulting project in terms of machine learning
- Visualize results of the analysis
Course Contents
- Topic 1. Introduction to machine learning.
- Topic 2. Overview of the mathematical formalism needed to understand machine learning.
- Topic 3. Data preprocessing.
- Topic 4. Regression (overview of models).
- Topic 5. Feature selection.
- Topic 6. Cluster analysis (k-means, c-means, hierarchical clustering).
- Topic 7. Linear models for classification and regression.
- Topic 8. KNN and SVM classification.
- Topic 9. Naïve Bayes classifier.
- Topic 10. Topic modeling.
- Topic 11. Decision trees.
Assessment Elements
- Homework
- Presentation project: A project is a written self-study on a topic proposed by the teacher, or proposed by the student and approved by the teacher. The project develops skills in critical thinking and written argumentation. It should include a clear statement of the research problem and an analysis of that problem using the concepts and analytical tools of the subject, in a way that summarizes the author's point of view.
Bibliography
Recommended Core Bibliography
- James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An Introduction to Statistical Learning: With Applications in R. Springer.
- Murphy, K. P. (2012). Machine Learning: A Probabilistic Perspective. Cambridge, Mass: The MIT Press. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=480968
Recommended Additional Bibliography
- A Tutorial on Machine Learning and Data Science Tools with Python. (2017). Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsbas&AN=edsbas.E5F82B62