Bachelor's programme
2021/2022
Data Analysis in Python
Best course by the criterion "Usefulness of the course for broadening horizons and all-round development"
Best course by the criterion "Novelty of the knowledge gained"
Status:
Elective course (Marketing and Market Analytics)
Field of study:
38.03.02. Management
Delivered by:
Department of Marketing
Where taught:
Graduate School of Business
When taught:
Year 3, module 4
Mode of study:
with an online course
Online hours:
10
Audience:
all HSE University campuses
Instructors:
Alexander Gennadievich Rozhkov
Language:
English
Credits:
3
Contact hours:
20
Course Syllabus
Abstract
The course focuses on data analysis tools and methods in Python. It includes a set of applied tasks to be solved with the Python toolkit using methods and algorithms of data preprocessing, data visualization, descriptive and inferential statistics, regression, factor analysis and cluster analysis. The emphasis is on applying these tools in the Python coding environment. Additional reading and exercises familiarize students with current trends in data analysis for business (marketing, product management). The course is supplemented with an online module based on DataCamp platform courses on NLP and sentiment analysis. DataCamp’s learn-by-doing methodology combines short expert videos and hands-on-the-keyboard exercises to help learners retain knowledge. Particular attention is paid to interpreting and analyzing data output. Upon completing the course, students will be able to import data into Python, clean and process it, and select and implement analytical methods relevant to the identified business task.
Learning Objectives
- Students know and understand key data analysis principles, have command of data processing and analysis tools in Python, and are able to import, process and analyze data and deliver a structured analytical report in the context of a company's business goals.
- Students apply the relevant Python tools for data collection, processing and analysis required for particular managerial tasks.
Expected Learning Outcomes
- Understand and implement data import and preparation procedure
- Select and implement Python visualization tools for data analysis and reporting
- Identify, select and implement Python frameworks to complete tasks of descriptive and inferential statistics in data analysis process
- Understand and implement factor analysis, cluster analysis and regression analysis for a business goal
- Understand and implement NLP frameworks for business tasks including sentiment analysis.
- Analyze data output, visualize and interpret key insights in data analysis for business tasks
- Know the ethical issues of data analysis and identify them in a business setting
Course Contents
- Introduction to Data Analysis in Python
- Data preprocessing in Python
- Analytic tools and algorithms in Python
- Data visualization
- Natural Language Processing in Python. DataCamp online module
- Ethical issues of data analysis
Assessment Elements
- Seminar tests 1-4: Each of the 8 seminars includes a quick test based on previously discussed topics and reading assignments.
- Python setup and survey: Python environment setup and course survey.
- Online class: Students are required to complete 2 courses on the DataCamp online platform; invitations and course links are sent to students' @edu.hse.ru email addresses.
- Project: Students are given a dataset to which they apply data analysis skills, including data preprocessing, exploratory analysis, model specification and reporting of the results.
- Exam: The exam has two sections. Coding / analytic assignment: you are required to import, process and analyze the data provided and interpret the results; this part carries 70% of the exam points. Test: multiple-choice and open questions based on the course materials and the additional reading assigned; this part carries 30% of the exam points.
- Seminar tests 5-8
Interim Assessment
- 2021/2022 4th module: 0.2 * Seminar tests 5-8 + 0.2 * Seminar tests 1-4 + 0.3 * Exam + 0.1 * Online class + 0.15 * Project + 0.05 * Python setup and survey
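As an illustration, the weighted sum above can be sketched in Python. The weights are taken from the formula; the function name, the example scores and the 10-point scale are assumptions made for this sketch only, and no rounding rule is implied.

```python
# Weights copied from the interim assessment formula; they sum to 1.0.
WEIGHTS = {
    "Seminar tests 1-4": 0.20,
    "Seminar tests 5-8": 0.20,
    "Exam": 0.30,
    "Online class": 0.10,
    "Project": 0.15,
    "Python setup and survey": 0.05,
}


def interim_grade(scores):
    """Return the weighted sum of element scores (all on the same scale).

    `interim_grade` is a hypothetical helper name, not part of the course.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores on an assumed 10-point scale.
example_scores = {
    "Seminar tests 1-4": 8,
    "Seminar tests 5-8": 7,
    "Exam": 6,
    "Online class": 10,
    "Project": 9,
    "Python setup and survey": 10,
}

print(round(interim_grade(example_scores), 2))
```

With these made-up scores the exam contributes 0.3 * 6 = 1.8 points and the two blocks of seminar tests together contribute 0.2 * 8 + 0.2 * 7 = 3.0 points to the total.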
Bibliography
Recommended Core Bibliography
- Idris, I. (2014). Python Data Analysis. Packt Publishing. 430 p. Electronic text. Retrieved from https://ebookcentral.proquest.com/lib/hselibrary-ebooks/detail.action?docID=1826990
- Bengfort, B., Bilbro, R., & Ojeda, T. (2018). Applied Text Analysis with Python : Enabling Language-Aware Data Products with Machine Learning. Beijing: O’Reilly Media. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=nlebk&AN=1827695
- Beysolow, T. (2018). Applied Natural Language Processing with Python : Implementing Machine Learning and Deep Learning Algorithms for Natural Language Processing. [Berkeley, CA]: Apress. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1892182
- Embarak, O. (2018). Data Analysis and Visualization Using Python : Analyze Data to Create Visualizations for BI Systems. Apress.
- Idris, I. (2016). Python Data Analysis Cookbook. Birmingham, UK: Packt Publishing. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1290098
- McKinney, W. (2018). Python for Data Analysis : Data Wrangling with Pandas, NumPy, and IPython (2nd ed.). Sebastopol, CA: O’Reilly Media. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1605925
- Vanderplas, J. T. (2016). Python Data Science Handbook : Essential Tools for Working with Data (1st ed.). Sebastopol, CA: O’Reilly Media. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=nlebk&AN=1425081
Recommended Additional Bibliography
- Stephenson, B. (2019). The Python Workbook : A Brief Introduction with Exercises and Solutions (2nd ed.). Springer.
- McNulty, K. (2021). Handbook of Regression Modeling in People Analytics : With Examples in R and Python. Chapman and Hall/CRC.
- Shmueli, G., Bruce, P. C., Gedeck, P., & Patel, N. R. (2020). Data Mining for Business Analytics : Concepts, Techniques and Applications in Python. Newark: Wiley. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=2273611