
Master's Programme 'Data Analytics and Applied Statistics / Data Analytics and Social Statistics'

Summary of Degree Programme

Field of Studies

01.04.02 Applied Mathematics and Informatics

Approved by
The academic council meeting protocol № 2.6-06.28/250724-9 dated 25.07.2024
HSE University Educational Standard
Last Update
25.07.2024
Network Programme

No

Length of Studies, Mode of Studies, Credit Load

2 years

Full-time, 120

Language of instruction

ENG

Instruction in English

Qualification upon graduation

Master

Double-degree Programme

No

Use of online learning

Applied Statistics with Network Analysis Methods (2023):

Data Analytics and Applied Statistics (2024): Online programme

Tracks

2024/2025 Academic year

Computational Social and Network Sciences

Type: General
Language of instruction: Russian and English
Use of online learning: Online programme
Qualification upon graduation: Master
Key learning outcomes:

KER-1 Capable of applying modern methodology, methods, and tools for data analysis, including machine learning

KER-2 Knows the basics, terminology, principles, and methodology of data analytics for various fields of activity

KER-3 Masters basic and advanced methods of applied statistics, data analysis and machine learning, can work with data of various nature

KER-4 Knows basic algorithms and data structures, knows how to apply them to solve practical problems and evaluate the effectiveness of proposed solutions

Description of the professional field:

PC-1 Able to apply advanced methods of applied statistics and high-tech technologies to model complex systems (networks)

PC-2 Capable of exploring big data and systems (including complex networks and information systems) using methods of system analysis, mathematical modeling and forecasting, as well as computational experiments using high-performance computing tools, etc.

PC-3 Capable of constructing, analyzing, justifying, validating and optimizing models in order to ensure a given level of quality in the interests of developing design solutions for academic and business problems

PC-4 Able to create, analyze and maintain data and knowledge bases 

PC-5 Able to apply and optimize network analysis methods for research of socio-economic processes

Description of educational modules:
With the spread of new digital technologies and the penetration of the Internet and social networks into the lives of a large number of people, researchers are receiving more and more information about actions and interactions in various social groups, teams and society as a whole. Such 'digital traces' turn into 'big data', whose storage and processing have become possible thanks to growth in computing power and the creation of fast, advanced algorithms and data analysis tools. As a result, human society, the traditional object of study of the social sciences, can be viewed from a new, previously inaccessible perspective. The interdisciplinary scientific field that uses rigorous computational methods to analyze and model social processes and phenomena is called computational social science. Unlike 'traditional' social sciences, which rely on sampling units of analysis, computational social sciences work with large volumes of available data (complex, rapidly changing and not clearly structured), studying the entire set of objects of interest. Advanced analytical tools such as deep learning and natural language processing make it possible to identify hidden patterns in human behavior, and computer modeling makes it possible to test various hypothetical situations that might occur in social systems. Together, these allow a fresh look at what society is and how it works.

'Digital traces' also provide important information about interactions between members of a social system, which makes it possible to study social networks of relations between subjects at different levels of analysis: people, organizations, countries, etc. Attention to relational connectivity and dependence between units of analysis is fundamental in another rapidly developing interdisciplinary field, network science. Although the study of social relations in terms of nodes, connections and networks is not new in the social sciences, digital technologies make it possible to reach new levels of analysis of social systems, to study complex networks in their dynamics and to model them using advanced algorithms.

The 'Computational Social and Network Sciences' track of the online master’s programme 'Data Analytics and Social Statistics' allows students to gain deeper knowledge of modern trends and of theoretical and methodological developments at the intersection of the exact and social sciences, based on the collection and analysis of large volumes of data with unprecedented breadth, depth and scale. In the courses, students are introduced to the mathematical apparatus and methods of statistical analysis, to the methods and tools of computer science for collecting, processing and modeling data, and to the theoretical and methodological apparatus of the social sciences for formulating a research design and for interpreting and presenting results ('Programming in R and Python', 'Data Mining', 'Applied Linear Models', 'Multivariate Data Analysis', 'Unstructured Data Analysis', 'Structural Equation Modelling', etc.). The track also offers several courses on network analysis, which allow students to enter this research area from scratch ('Introduction to SNA', 'Advanced SNA in Pajek', 'Statistical methods in Network analysis', 'Social network analysis with R', Research seminar 'Working with network data'). Students will become familiar with implementing network analysis in the programming languages R and Python, as well as in the programs Pajek, RSiena, Gephi, Orange, etc. The acquired knowledge and skills are brought together in the dedicated Research seminar 'Computational Social and Network Sciences'.

The track will be of interest both to students with a background in the social sciences and humanities, who will gain knowledge of working with big data and its advanced analysis, and to students with a background in the exact sciences, who will gain the skills necessary for research in sociology, psychology, political science, economics, linguistics and other social sciences. The track is suitable for students of any educational background who want to develop in the field of network analysis, a new disciplinary area for Russian practice. Graduates of this track will be able to work in the research industry in applied social research, applying advanced methods to study various social phenomena and processes. Since the track pays special attention to research design and the specifics of conducting social research, its graduates will also be able to continue working in an academic environment, if desired, by enrolling in PhD programmes or graduate school.

 

Applied Statistics and Data Science

Type: Applied
Language of instruction: Russian and English
Use of online learning: Online programme
Qualification upon graduation: Master
Key learning outcomes:

KER-1 Capable of applying modern methodology, methods, and tools for data analysis, including machine learning

KER-2 Knows the basics, terminology, principles, and methodology of data analytics for various fields of activity

KER-3 Masters basic and advanced methods of applied statistics, data analysis and machine learning, can work with data of various nature

KER-4 Knows basic algorithms and data structures, knows how to apply them to solve practical problems and evaluate the effectiveness of proposed solutions

 

Description of the professional field:

PC-1 Able to apply advanced methods of applied statistics and high-tech technologies to model complex systems (networks)

PC-2 Capable of exploring big data and systems (including complex networks and information systems) using methods of system analysis, mathematical modeling and forecasting, as well as computational experiments using high-performance computing tools, etc.

PC-3 Capable of constructing, analyzing, justifying, validating and optimizing models in order to ensure a given level of quality in the interests of developing design solutions for academic and business problems

PC-4 Able to create, analyze and maintain data and knowledge bases

PC-5 Able to apply and optimize network analysis methods for research of socio-economic processes

Description of educational modules:
There are many data science jobs on the job market right now. With the exponential growth of information, data scientists are becoming indispensable to companies in all industries. In the coming years, the field of data science will develop dynamically, the search for interesting projects and work will become more competitive, and employers will become more demanding of applicants' competencies. The basis of modern data analysis is applied statistics, which applies advanced methods of mathematical statistics and computer-based processing of statistical data to analyze various areas of society. The technical knowledge and skills that strengthen this direction are provided by data science, a field that combines the branches of computer science related to data: collection, processing, analysis and effective decision-making. Combining the two directions makes it possible to analyze large volumes of unstructured data, the complex information about modern society that becomes available to the researcher thanks to new information technologies. That’s why our track is called 'Applied Statistics and Data Science'.

Data scientists are constantly searching for new solutions and hypotheses for business. Data analysts need not only the ability to present data systematically, but also a creative approach to displaying information visually, and a desire and willingness to look deep into a problem, find the questions underlying it, and formulate them as a testable set of hypotheses. It is important to remember that data analysis is primarily about research. Data can be explored at different levels: automating the analysis process, formulating many hypotheses and testing them using various methods. Data analysts in modern business help analyze key metrics, solve operational problems and achieve strategic goals. Companies and organizations of all sizes in different industries need specialists who can manage data flows and find valuable information in them. Specialists in this field must have a good knowledge of statistics and of the relevant subject area. Knowledge of statistical methods is enhanced by computer science skills.

The demand for specialists in data analysis is relevant not only for companies; it is also part of the national strategic agenda for technological development (national projects 'Data Economy' and 'Artificial Intelligence'). In particular, the national project 'Data Economy' confirms and renews the demand for specialists in statistics and data analysis. It is important to be able to plan the economic development of individual industries, regions and cities, as well as to structure the work of any organization effectively and proactively to achieve results as quickly as possible.

The main components of the Applied Statistics and Data Science track are training in working with data and learning advanced statistical methods. It is important to understand how to extract information from different sources and how to work with it further to obtain statistically accurate and reliable results that can be used in making business decisions. The study of a variety of advanced statistical methods of data analysis (Contemporary Methods of Data Analysis, Bayesian Statistics, Stochastic Models, Time Series, Network Analysis and many others) and computer methods of data processing and analysis (Data Mining, Machine Learning, Programming in R and Python, Unstructured Data Analysis and others) culminates in the implementation of applied projects. Having mastered modern methods of data analysis, students can solve problems efficiently, processing data in software products that require different levels of user involvement. Students of the programme will become familiar with programs, packages and databases for the full cycle of working with data: R, Python, SAS, Stata, Orange, Pajek, Gephi, and others. Students will learn to work with algorithms that can receive and process data, perform calculations and make decisions. Knowledge of mathematical statistics and skills in testing hypotheses and estimating unknown parameters are complemented by a deep understanding of how current research is conducted in business, including with artificial intelligence technologies.

The 'Applied Statistics and Data Science' track is suitable for students who want to develop in the field of data analysis, which is in demand in every subject area. Any prior education is suitable: the track will be of interest to students with a background in either the social sciences or the exact sciences. On the one hand, students on the track will be able to systematize and deepen their knowledge of the social sciences; on the other, they will master data research skills in order to perform data analytics tasks and manage teams of data analysts and data scientists more effectively.
Studying on this track will allow students to acquire the necessary knowledge and skills for employment in various companies and corporations. Graduates of the 'Applied Statistics and Data Science' track can work as data analysts or product analysts in various fields, solving both research problems and applied problems in managing products and processes within their organizations. If desired, track graduates will be able to continue working in an academic environment by enrolling in PhD programmes or graduate school.

 

2023/2024 Academic year

Network Analysis

Type: General
Track Supervisor: Klimov, Ivan A.
Language of instruction: English
Use of online learning: With online tools
Qualification upon graduation: Master
Competitive Advantages

There is a shortage of specialists in applied statistics, especially in the area of social network analysis. At the same time, training in the field of statistics is carried out in different ways: the majority of educational programmes in this area belong to the field of economics and focus mainly on mathematical methods; in the field of sociology, the study of statistics is limited to the study of probability theory and introductory courses.

This programme is unique because it is the first programme in Russia to offer a comprehensive approach to data analysis in different areas. As part of the programme, students from different disciplines can come together to solve practical analytical problems. Those who are mathematically inclined gain an understanding of sociology and the object of research, while those with a background in the humanities will be able to build their skillset and gain a deeper understanding of the statistical processes that make up the data analysis we teach. In addition, a special focus of the programme will be the analysis of social networks, a direction of data science that is becoming increasingly popular in foreign and Russian research practice.

Another important characteristic of the programme is its applied nature - students do not learn from abstract theoretical constructs, but rather from dealing with specific applied research questions. Students will be able to apply their knowledge by solving practical problems, working at the International Laboratory for Applied Network Analysis, Russian analytical centers and commercial companies.

Professional Activities and Competencies of Programme Graduates

The knowledge and skillset obtained by graduates of the programme will render them skilled practitioners, able to apply advanced complex data analysis techniques working in a range of organizations - both in commercial companies operating in various industries (banking, insurance, consulting, IT, medicine, pharmacy), and in research organizations (sociology, marketing). The main competencies of the graduates of the program are:

General professional competencies:

  • Is able to apply a systematic approach in setting objectives and choosing approaches to the solution, as well as to take into account conflicting goals and needs and demands.
  • Is able to correctly use existing and introduce new concepts in the field of mathematics and informatics, integrate known facts, concepts, principles and theories related to applied mathematics and informatics.
  • Is able to reasonably select and apply modern computer technologies to solve professional tasks, including operating systems, network technologies, programming languages, languages of data manipulation, digital libraries, application packages.
  • Is able to communicate with specialists in the field of mathematical models and information technologies, as well as with experts from applied fields using various formal languages and notations.
  • Is able to build mathematical models and use them in solving applied problems in accordance with the direction of training and specialization.

Professional competencies:

  • Is able to organize research activities.
  • Is able to create computer programs using models and algorithms of applied mathematics.
  • Is able to assess the correctness and reproducibility of applied mathematics and informatics methods.
  • Is able to maintain collective scientific communication, organize scientific events.
  • Is able to organize the training of specialists in the field of applied mathematics in new methods and tools in accordance with the direction of training and specialization.
  • Is able to analyze and reproduce the meaning of interdisciplinary texts using the language and apparatus of applied mathematics and informatics.
  • Is able to create interdisciplinary texts using the language and apparatus of applied mathematics and informatics.
  • Is able to formalize and present publicly the results of professional activity using information technologies.
  • Is able to carry out a targeted multi-criteria search for information on the latest scientific and technological advances on the Internet and in other sources.
  • Is able to create, describe and responsibly control the implementation of technological requirements and regulations in professional activities.
  • Is able to collect, clean, analyze and visualize large datasets.

Universal competencies:

  • Is able to reflect (evaluate and process) the learned scientific methods and ways of activity.
  • Is able to develop new theories, invent new ways and tools of professional activity.
  • Is able to independently master new research methods, change the scientific and production profile of their activities.
  • Is able to improve and develop their intellectual and cultural level, build a track of professional development and career.
  • Is able to make management decisions and is ready to take responsibility for them.
  • Is able to analyze, verify, evaluate the completeness of information in the course of professional activities, if necessary, to fill in and synthesize missing information.
  • Is able to organize and manage multilateral communication.
  • Is able to conduct research activities in the international environment.

Programme Modules

Required Core Courses

This M.S. is based on newly created graduate courses in statistical theory and methods taught by the faculty of the laboratory and by renowned Russian and international faculty. All candidates for this degree must take “Contemporary Data Analysis: Methodology of Interdisciplinary Research” and “Contemporary Decision Sciences Methods: an Integrated Perspective.” These two courses lay the foundation for the systems thinking that this program aims to develop. “Contemporary Data Analysis: Methodology of Interdisciplinary Research” is designed as a "gateway" to graduate work in statistics, where mathematical concepts are bridged with applied concepts and research design, depending on the discipline. “Contemporary Decision Sciences Methods: an Integrated Perspective” provides a unified perspective aimed at improving the decision-making process, which requires understanding how decisions are made in practice and how behavior differs from the guidelines implied by normative theories of choice.
“Applied Linear Models” serves as a foundational course for all mathematical thinking in applied statistics and for the subsequent courses taken in the program. “Statistical Consulting”, an equivalent to which does not appear to be offered by competing programs, is designed to establish firm foundations for working with “someone else’s” data, extracting relevant information from it, and preparing easy-to-understand reports for accurate use by clients with no statistical knowledge. Foundational and advanced courses in network analysis focus on developing the critical analytical skills necessary for working with network data, the emphasis of this program. Required courses will be offered every year, to be taken by incoming new students only.

Additional Requirements

Given the laboratory's focus on network statistical analysis, the programme offers a specialization in network methods. Students interested in obtaining this specialization must take additional courses in networks.
Students not wishing to pursue the network component are welcome to choose from the remaining course offerings. All students must select a total of required elective courses (offered by the programme and from the MagoLego university pool of courses) 

  1. Elective courses will be offered according to the study plan, and may be taken by both Year 1 and Year 2 students.
  2. As needed, courses may be taught by invited professors from internationally recognized research universities.
  3. Some courses may not be offered in any given two-year period if there is not enough student interest or demand to open the course. The exact number of students required to open a course will be determined by the program director, depending on the total number of students enrolled in the program. A course may not be offered even if there are enough interested students, if those students do not meet the prerequisite requirements for the course.
  4. Regardless of the student enrollment numbers, a choice of courses will be offered to provide students with selection between more and less advanced courses from the standpoint of mathematics. 

Programme Courses:
1.    Contemporary Data Analysis: Methodology of Interdisciplinary Research  
a.     Prerequisites: One statistics course at the undergraduate level.
b.     Required course
This foundational course is designed to provide a unified research program for people from diverse disciplines. Its main purpose is to give students a firm foundation in research methodology, including topics in research design, theory building and testing with hypothesis generation, and advanced academic writing. This course is about conducting research, both in academia and in practice. Specifically, students will focus on the basic steps of scientific inquiry, starting with topic selection and progressing through literature review, hypothesis generation, choice of analysis method, and methods of communicating research results to wide audiences (written and oral presentations). Whether they plan to work in the corporate world or develop a career in academia, they will need to generate knowledge and disseminate it to others, so they will use the skills acquired in this course.
 
2.     Contemporary Decision Sciences Methods: an Integrated Perspective
a.     Prerequisites: intro to stats or consent of the instructor
b.     Required Course
This course is designed as an overview of a range of problems and applications in managerial decision making using scientific and analytical methodology. Topics include concepts and applications of decision support systems (DSS), including types of decisions, types of decision makers, modeling decisions, decisions within organizations, rule-based expert systems, and simulation as a DSS application. This course also covers practical issues in DSS, such as using integer and linear programming to model and solve the choices and uncertainties of real-world decision problems. Topics covered also include sensitivity analysis and an introduction to decision analysis. Problem recognition, model building, model analysis and managerial implications are the primary objectives, with special emphasis on understanding the concepts and on computer implementation and interpretation.

3.     Probability Theory
a.     Prerequisites: intro to stats or consent of the instructor
b.     Required Course
This course covers standard introductory probability theory topics such as probability spaces, discrete and continuous random variables, transformations, expectations, generating functions, conditional distributions, the law of large numbers and central limit theorems, as well as advanced topics that are likely to be the most useful to someone planning to use research from the modern theory of stochastic processes in their daily work. The course has an applied component, with real-life examples of applications of probability theory.
 
4.     Nonparametric Theory and Data Analysis  
a.     Prerequisites: Two statistics courses at the graduate level, or consent of instructor.
b.     Optional course
The course is an introduction to statistics outside of the "classical" techniques. Over and above the material itself, the course is useful for reinforcing and elaborating on concepts of testing and estimation seen in classical courses, and serves as a bridge to modern, computationally intensive branches of statistics like machine learning. Topics covered include statistical functionals, bootstrapping, empirical likelihood, nonparametric density and curve estimation, and rank and permutation tests.
 
5.     Bayesian Theory and Data Analysis  
a.     Prerequisites: Two statistics courses at the graduate level, or consent of instructor.
b.     Optional course
The course covers an introduction to the theory and practice of Bayesian inference. Topics covered include: Prior and posterior distributions, Bayes theorem, model formulation, Bayesian computation, model checking and sensitivity analysis. This is a general class on Bayesian methods. Some basic knowledge of probability distributions, calculus and linear algebra is assumed. We will examine Bayesian inference and prediction for simple parametric models, regression models, hierarchical models and mixture models that span a wide variety of applied data settings.  In each of these areas, we will compare and contrast the Bayesian and classical viewpoints for data analysis. We will develop a wide range of methods for model implementation, including optimization algorithms and Markov chain Monte Carlo simulation techniques. We will also examine strategies for model evaluation and validation.
Course participants should have an interest in applied data analysis as well as basic knowledge of the principles of statistical inference and prediction. Participants should also have experience with basic probability topics, such as probability density functions, marginal and conditional probabilities, and the transformation and simulation of random variables. We will implement our models using the statistical software package R, though prior experience with R is not required for the course.

6.     Applied Linear Models  
a.     Prerequisites: intro to statistics and linear algebra (or equivalent courses), or consent of instructor.
b.     Required Course
An advanced course in applied statistics. Linear models will be used to treat a wide range of regression and analysis of variance methods. Topics include: matrix review; multiple, curvilinear, nonlinear, and stepwise regression; correlation; residual analysis; model building; use of regression software packages; use of indicator variables for analysis of variance and covariance models. The first part of the course will emphasize linear regression and the analysis of variance, including topics from the design of experiments and culminating in the general linear model.
 
7.     Categorical Data Analysis  
a.     Prerequisites: Two statistics courses at the graduate level, or consent of instructor.
b.     Optional Course
The analysis of cross-classified categorical data. Loglinear models; regression models in which the response variable is binary, ordinal, nominal, or discrete. Logit, probit, multinomial logit models; logistic and Poisson regression.
This class focuses on the basic regression models for categorical dependent variables. While advances in software have made it simple to estimate these models, post-estimation interpretation is difficult due to the nonlinearities of the models. The class begins by considering the general objectives for interpreting the results of any regression type model and then considers why achieving these objectives is more difficult with nonlinear models. Basic concepts and notation are introduced through a review of the linear regression model. Within this familiar context, the method of maximum likelihood estimation is presented. These ideas are used to develop the logit and probit models for binary outcomes. A variety of practical methods for interpreting nonlinear models are presented. The models and methods of interpretation for binary outcomes are extended to ordinal outcomes using the ordinal logit and probit models. The multinomial logit model for nominal outcomes is then discussed. Finally, a series of models for count data, including Poisson regression, negative binomial regression, and zero modified models are presented.
 
8.     Multilevel Models  
a. Prerequisites: Two statistics courses at the graduate level, or consent of instructor.
b. Optional Course
This course is designed to provide students with a training experience in the concept and application of multilevel statistical modeling.  You will be motivated to think about correlated and dependent data structures that arise due to sampling design and/or are inherent in the population (such as pupils nested within schools; patients nested within clinics; individuals nested within neighborhoods and so on).  The substantive purpose of this course is to enable quantitative assessments on the role of contexts (e.g., schools, clinics, neighborhoods) in predicting individual outcomes.  This will be accomplished by developing a range of multilevel models along with a detailed discussion of the statistical properties and the interpretation of each model. 

9.  Structural Equation Modeling  
a.    Prerequisites: Two statistics courses at the graduate level, or consent of instructor.
b.    Required Course
Path analysis. Introduction to multivariate multiple regression, confirmatory factor analysis, and latent variables. Structural equation models with and without latent variables. Mean-structure and multi-group analysis.
This course is designed for students and faculty who would like to acquire a significant familiarity with statistical techniques known collectively as "structural equation modeling," "causal modeling," or "analysis of covariance structures."  As learning in this course demands basic understanding of statistical principles and techniques such as regression and factor analysis, the course will start with an overview of basic applied statistics and linear algebra, and will progress to more complex models in a sequential manner. The goals of the course are:  To ensure that students understand topics and principles of applied statistical techniques; to provide students with an understanding of the basic principles of latent variable structural equation modeling and lay the foundation for future learning in the area; to explore the advantages and disadvantages of latent variable structural equation modeling, and how it relates to other methods of analysis; to develop student familiarity, through hands-on experience, with the major structural equation modeling programs, so that they can use them and interpret their output; to develop and/or foster critical reviewing skills of published empirical research using structural equation modeling.
 
10.  Time Series Analysis  
a.     Prerequisites: Two statistics courses at the graduate level, or consent of instructor.
b.     Optional Course
Techniques for analyzing data collected at different points in time. Probability models, forecasting methods, analysis in both time and frequency domains, linear systems, state-space models, intervention analysis, transfer function models, and the Kalman filter. Topics also include stationary processes, autocorrelations, partial autocorrelations, autoregressive, moving average, and ARMA processes, spectral density of stationary processes, periodograms, and estimation of spectral density.
Students are assumed to understand the basics of statistical inference, regression analysis, and scalar and matrix algebra. Some topics that will be covered include ARIMA models, intervention analysis, regression analysis of time series, cointegration, error correction models, vector autoregression, pooled time series, and time-varying parameter models.
 
11.  Data Mining 
a.     Prerequisites: Two statistics courses at the graduate level, or consent of instructor. 
b.     Required Course
Covers topics in data mining, including visualization techniques, elements of machine learning theory, classification and regression trees, generalized linear models, spline methods, and other related topics.
 
12.  Exploratory Data Analysis  
a.     Prerequisites: Two statistics courses at the graduate level, or consent of instructor. 
b.     Optional Course
Numerical and graphical techniques for summarizing and displaying data. Exploration versus confirmation. Connections with conventional statistical analysis and data mining. Applications to large data sets.
 
13. Introduction to Machine Learning
a.     Prerequisites: Two statistics courses at the graduate level, or consent of instructor.
b.     Optional Course

This course will take a modern, data-analytic approach to the multiple regression model. Our coverage of the material will emphasize the ways that graphical tools can augment traditional methods for describing how the conditional distribution of a dependent variable changes along with the values of one or more independent variables. The course will examine the basic nature and assumptions of the linear regression model, diagnostic tools for detecting violations of the regression assumptions, and strategies for dealing with situations in which the basic assumptions are violated.
 
14.  Methods of Statistical Consulting  
a.     Prerequisites: Consent of instructor.
b.     Required Course
Development of effective consulting skills, including the conduct of consulting sessions, collaborative problem-solving, using professional resources, and preparing verbal and written reports. Real-life clients may be drawn from companies in Moscow, for whom the service will be provided free of charge.
 
15.  Introduction to Network Analysis
a.     Prerequisites: none
b.     Required Course
An introduction to various concepts, methods, and applications of social network analysis drawn from the social and behavioral sciences. The primary focus of these methods is the analysis of relational data measured on groups of social actors. Topics to be discussed include a basic introduction to network analysis, graphs and matrices, basic network measures and visualization, reciprocity and transitivity, dyadic and triadic analysis, centrality, egocentric networks, two-mode networks (affiliations, bibliographic/scientometric analysis), cohesive subgroups, equivalences and blockmodeling, hubs and authorities, cores and peripheries, clustering and graph partitioning, large-scale structure of networks, statistical modeling of networks (ergm/p*/RSiena), network dynamics, and change in networks.
 
16.  Advanced Topics in Network Analysis with Pajek
a.     Prerequisites: introduction to network analysis or consent of the instructor
b.     Optional Course
The conventional categorization of data analytic methods into descriptive and inferential statistics can be fruitfully applied to network analysis. Descriptive methods of network analysis are important for illuminating structural features of a given network, but they cannot be used to build and/or test theories about the generation of networks. Inferential methods of network analysis can be used to test hypotheses about the generation and evolution of a network, derive measures of uncertainty for network indices, and find probabilistic models that accurately describe the overall features of a network.
 
17.  Network Analysis: Statistical Approaches and Modeling
a.     Prerequisites: introduction to network analysis or consent of the instructor
b.     Optional Course
Advanced statistical methods for analyzing social network data, focusing on testing hypotheses about network structure (e.g. reciprocity, transitivity, and closure), the formation of ties based on attributes (e.g. homophily), and network effects on individual attributes (social influence or contagion models). Statistical models (blockmodeling, diffusion, etc.) are also covered.
 
18. Network Analysis: Application in R
a.    Prerequisites: none
b.   Optional Course
The course focuses on how to develop questions about social networks and test them appropriately using the R statistical programming language. Because it is critically important for researchers to be able to analyze their data, and standardized packages rarely offer the full set of analytic methods required, researchers often have to write their own code for the analysis of specific datasets. Minimal programming skills are desirable, though not required.

19. Introduction to Statistics
a.    Prerequisites: introduction to network analysis or consent of the instructor
b.    Required Course
This is an introductory course in network analysis, designed to familiarize graduate students with the general concepts and basic techniques of network analysis in sociological research, to give them general knowledge of the major theoretical concepts and methodological techniques used in social network analysis, and to provide some hands-on experience of collecting, analyzing, and mapping network data with SNA software. In addition, this course will provide ample opportunities to include network concepts in students’ master's thesis work.

20. Programming in R and Python
a.    Prerequisites: introduction to network analysis or consent of the instructor
b.    Required Course
Students who have never programmed often fear that it is difficult. This course is designed to introduce them to the basics of programming languages such as R and Python. It discusses the differences between these languages and the strengths of each. Students will learn the basics of programming and of working with these languages.

21. Multidimensional Data Analysis
a.    Prerequisites: introduction to programming with R, introduction to statistics or consent of the instructor
b.   Optional Course
The course focuses on multivariate methods, in which the variables are studied simultaneously. It covers both the underlying theory required to understand multivariate methods and their applications in data analysis. The methods and models covered include principal component analysis, factor analysis, discriminant analysis, canonical correlations, and cluster analysis. The course includes computer labs where multivariate data analysis is performed using statistical software.
 

Options for Students with Disabilities

This degree programme of HSE University is adapted for students with special educational needs (SEN) and disabilities. Special assistive technology and teaching aids are used for collective and individual learning of students with SEN and disabilities. The specific adaptive features of the programme are listed in each subject's full syllabus and are available to students through the online Learning Management System.

Programme Documentation

All documents of the degree programme are stored electronically on this website. Curricula, calendar plans, and syllabi are developed and approved electronically in corporate information systems. Their current versions are automatically published on the website of the degree programme. Up-to-date teaching and learning guides, assessment tools, and other relevant documents are stored on the website of the degree programme in accordance with the local regulatory acts of HSE University.

I hereby confirm that the degree programme documents posted on this website are fully up-to-date.

Vice Rector Sergey Yu. Roshchin

Summary of Degree Programme 'Data Analytics and Social Statistics'
