2024/2025
Data Analysis in the Social Sciences
Type:
Mago-Lego
Delivered by:
School of Sociology
When:
2, 3 module
Open to:
students of one campus
Instructors:
Stanislav Pashkov
Language:
English
ECTS credits:
6
Course Syllabus
Abstract
The goal of this course is to introduce students to the main principles of data analysis methods and procedures commonly used in social science, with an emphasis on social problems and the corresponding scientific challenges. Students will become familiar with a variety of data analysis methods that are useful in quantitative research. The course aims to develop data-driven as well as theory-driven reasoning through an understanding of data fundamentals and the main areas of application. Students will be able to understand the limitations, added value and heuristic mechanisms of different data analysis methods. During the course, students will acquire the practical skills needed to gather, generate, visualize and analyze quantitative data in social science research. The entire learning process is based on the R language and RStudio.
Learning Objectives
- The main goal of the course is to familiarize students with a variety of data analysis methods that are useful in quantitative research. The course aims to develop a data-driven mentality through an understanding of data fundamentals and of the areas of application for different analytical methods and approaches. Students should be able to understand the limitations, added value and heuristic mechanisms of different data analysis methods.
Expected Learning Outcomes
- A student can navigate a variety of conceptual approaches to the phenomenon of social memory, its connection with issues of values and identity, and its dependence on social and political change
- Can do statistical analysis in R
- Be able to write a statistical analysis report
- A student knows the main approaches in the field and can use the main methods in political science
- Students are able to use basic methods of computational text mining.
- Able to solve professional problems based on synthesis and analysis
- Advance the skills of working with social and economic databases applicable to the student’s own research
- Able to demonstrate knowledge of the principles of data visualization
- Be familiar with contemporary text mining models and approaches
- Gain knowledge of how to use computer modeling in social and political science research and how to apply agent-based modeling for data analysis.
- Practice programming skills in R for a research proposal.
Course Contents
- SESSION 1: Introduction to Research Design and Research Methodology of Quantitative Study
- SESSION 2: Macro- and Micro-Data. Open-source Datasets. Questionnaire design
- SESSION 3: Dataset Description and Data visualization
- SESSION 4: Statistical Analysis and principles of Descriptive Statistics
- SESSION 5: Exploring Distributions and Statistical Inference
- SESSION 6: Statistical Relationship. Correlation
- SESSION 7: Statistical Relationship. Causation
- SESSION 8: Linear Regression Models
- SESSION 9: Nonlinear Regression Models
- SESSION 10: Factor and Cluster Analysis
- SESSION 11: Qualitative-Quantitative analysis of Textual Data using R (Appendix)
Assessment Elements
- Laboratory assessment No. 3: Regression modeling. The laboratory work covers the material of sessions 7-9 and is a practical task in which the student gives a detailed answer or demonstrates writing code in R and interpreting (visualizing) the resulting data. As part of the laboratory work, the student is given a country and a set of variables to process, computes the appropriate statistical measures, and interprets the process and results. The work is done individually.
- Final project
- Laboratory assessment No. 1: Descriptive statistics. The laboratory work covers the material of sessions 3-5 and is a practical task in which the student gives a detailed answer or demonstrates writing code in R and interpreting (visualizing) the resulting data. As part of the laboratory work, the student is given a country and a set of variables to process, computes the appropriate statistical measures, and interprets the process and results. The work is done individually.
- Class attendance
- Laboratory assessment No. 2: Factor/Cluster Analysis. The laboratory work covers the material of sessions 6 & 10 and is a practical task in which the student gives a detailed answer or demonstrates writing code in R and interpreting (visualizing) the resulting data. As part of the laboratory work, the student is given a country and a set of variables to process, computes the appropriate statistical measures, and interprets the process and results. The work is done individually.
Interim Assessment
- 2024/2025 3rd module: 0.1 * Class attendance + 0.3 * Final project + 0.15 * Laboratory assessment No. 1: Descriptive statistics + 0.2 * Laboratory assessment No. 2: Factor/Cluster Analysis + 0.25 * Laboratory assessment No. 3: Regression modeling
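As an illustration of how the interim assessment formula combines the components, the following is a minimal sketch. The weights are taken from the formula above; the component scores are hypothetical and are assumed to be expressed on one common scale. The syllabus does not state a rounding rule, so none is applied to the grade itself.

```python
# Weights copied from the interim assessment formula in this syllabus.
WEIGHTS = {
    "Class attendance": 0.10,
    "Final project": 0.30,
    "Laboratory assessment No. 1": 0.15,
    "Laboratory assessment No. 2": 0.20,
    "Laboratory assessment No. 3": 0.25,
}


def interim_grade(scores):
    """Return the weighted sum of component scores (no rounding applied)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores for illustration only, assumed to share one scale.
example_scores = {
    "Class attendance": 10,
    "Final project": 8,
    "Laboratory assessment No. 1": 6,
    "Laboratory assessment No. 2": 8,
    "Laboratory assessment No. 3": 8,
}

# Display the result to two decimal places.
print(round(interim_grade(example_scores), 2))
```

With these hypothetical scores the weighted sum is 1.0 + 2.4 + 0.9 + 1.6 + 2.0 = 7.9. The weights add up to 1, so the result stays on the same scale as the component scores.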
Bibliography
Recommended Core Bibliography
- Rao, C. R. (2005). Data Mining and Data Visualization (1st ed.). North Holland.
- Churcher, C. (2012). Beginning Database Design: From Novice to Professional (2nd ed.). New York: Apress. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1173722
- Creswell, J. W. (1994). Research Design: Qualitative and Quantitative Approaches. Thousand Oaks, Calif.: Sage. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edswao&AN=edswao.040749258
Recommended Additional Bibliography
- King, G., Keohane, R. O., & Verba, S. (1995). The Importance of Research Design in Political Science. American Political Science Review, (02), 475. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsrep&AN=edsrep.a.cup.apsrev.v89y1995i02p475.481.09
- Munzert, S. (2014). Automated Data Collection with R: A Practical Guide to Web Scraping and Text Mining. Hoboken; Chichester, West Sussex, United Kingdom: Wiley. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=878670
- Rajaraman, A., & Ullman, J. D. (2012). Mining of Massive Datasets. New York, N.Y.: Cambridge University Press. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=408850