Bachelor
2020/2021
Introduction to Data Analysis
Type:
Compulsory course (HSE University and University of London Parallel Degree Programme in International Relations)
Area of studies:
International Relations
Delivered by:
Faculty of World Economy and International Affairs
When:
Year 1, Module 2
Mode of studies:
distance learning
Open to:
students of one campus
Language:
English
ECTS credits:
3
Contact hours:
28
Course Syllabus
Abstract
This course introduces modern data science methods that are useful in both research and industry careers. Its main focus is teaching students to find data on the Internet, process it, and perform a simple data analysis. Students are trained to think critically and to apply the scientific approach to problem solving. The course starts from the basics of working with data: students learn to sort and filter data, calculate various characteristics of a distribution, and create graphs and charts that follow accepted design standards. Part of the course covers the main methods of storing and using data. Students also learn the main methods used to obtain scientific results in the humanities, such as time series analysis and linear regression. All of these techniques are applied in Google Sheets.
Learning Objectives
- To provide an introduction to modern data science techniques
- To introduce the main concepts of scientific data analysis
- To show the best practices of working with data
- To develop basic skills in Google Sheets
Expected Learning Outcomes
- Demonstrate knowledge of basic concepts of data science
- Perform exploratory data analysis in Google Sheets
- Formulate and solve simple scientific problems
- Understand the notions of a continuous random variable and a probability distribution. Know how to apply the central limit theorem
- Know the methods of interval estimation and t-statistics. Be able to work with different kinds of data
- Understand the notions of correlation and simple linear regression.
Course Contents
- Introduction to Data Analysis: (1) Applied data science in international relations. Examples of applications, and of misuse and mistakes. (3) Practice: Data upload. Filtering. Missing data handling. Line plot. Minimum. Maximum. (4) Theory: Continuous random variables and probability distributions. Notion of a continuous random variable. Notion of a probability distribution.
- Normal distribution: (1) Practice: Multiple indicators. Variance, standard deviation, quartiles. Histograms. (2) Central Limit Theorem
- The simplest text analysis: (1) Practice: IF(), IFS(), COUNTIF(). Categorization. Pie chart. (2) Normal distribution (continued)
- Sampling and confidence intervals: (1) Practice: Box-and-whisker plot. Joining tables. VLOOKUP. (2) Theory: Sampling and confidence interval estimation
- Simple linear regression: (1) and (2) Theory and practice: Correlation. Conditional formatting. Simple linear regression. Trendline. Scatterplot. R^2
- Confidence intervals: Average of the averages. Combo chart. Normal distribution histogram. Confidence intervals (continued)
- Linear regression analysis: Obtaining predictions using a linear regression model. The concept of splitting data into train and test sets. Model quality evaluation.
- Preparation for the exam
- Interval estimation: Methods of interval estimation. t-statistics. Work with different kinds of data
- Hypothesis testing: Null hypothesis. Hypothesis testing. Real-world example of hypothesis testing
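The course itself works in Google Sheets, but the regression topics listed above (fitting a line, splitting data into train and test sets, evaluating quality with R^2) can also be sketched in a few lines of Python. The sketch below is only an illustration under stated assumptions: the function names `fit_line` and `r_squared` are hypothetical, the data is made up, and the split is a simple "first six points for training, last two for testing" rather than the random split a real analysis would usually use.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the common factor 1/n cancels out).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(ys, predictions):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Made-up illustrative data (not taken from the course materials).
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1]

# Assumed split: fit on the first six observations, evaluate on the last two.
train_x, test_x = xs[:6], xs[6:]
train_y, test_y = ys[:6], ys[6:]

slope, intercept = fit_line(train_x, train_y)
test_predictions = [intercept + slope * x for x in test_x]

print("slope:", slope, "intercept:", intercept)
print("R^2 on test data:", r_squared(test_y, test_predictions))
```

Evaluating R^2 on observations that were not used for fitting is the point of the train/test split: it indicates how well the fitted line predicts new data rather than how well it reproduces the data it was fitted to.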
Assessment Elements
- Homework 1 (Data Culture)
- Homework 2 (Data Culture)
- Exam (Data analysis and Data Culture)
- Homework 3 (Data Culture)
Interim Assessment
- Interim assessment (Module 2): 0.4 * Exam (Data analysis and Data Culture) + 0.2 * Homework 1 (Data Culture) + 0.2 * Homework 2 (Data Culture) + 0.2 * Homework 3 (Data Culture)
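As a worked illustration of the weighting above, the following minimal Python sketch computes the weighted sum. The function name `interim_grade`, the dictionary keys, and the sample scores are assumptions made for the example. The syllabus does not state a grading scale or a rounding rule, so none is applied here.

```python
# Weights taken from the interim assessment formula in this syllabus.
WEIGHTS = {
    "exam": 0.4,
    "homework_1": 0.2,
    "homework_2": 0.2,
    "homework_3": 0.2,
}


def interim_grade(scores):
    """Return the weighted sum of the four assessment elements.

    `scores` maps each key in WEIGHTS to a numeric score; all scores are
    assumed to be on the same scale.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Made-up example scores (hypothetical, for illustration only).
example = {"exam": 8, "homework_1": 10, "homework_2": 6, "homework_3": 8}
print(interim_grade(example))
```

Because the exam carries twice the weight of any single homework, a one-point change in the exam score shifts the result by 0.4, while the same change in one homework shifts it by 0.2.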
Bibliography
Recommended Core Bibliography
- Wonnacott, T. H. (1990). Introductory statistics for business and economics.
Recommended Additional Bibliography
- Bąska, M., Pondel, M., & Dudycz, H. (2019). Identification of advanced data analysis in marketing: A systematic literature review. Journal of Economics & Management, 35(1), 18–39. https://doi.org/10.22367/jem.2019.35.02
- Springston, M., Ernst, J. V., Clark, A. C., Kelly, D. P., & DeLuca, V. W. (2019). data analysis. Technology & Engineering Teacher, 79(4), 26–29. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=asn&AN=139712968