Bachelor 2023/2024

Introduction to Python for Data Science

Area of studies: Economics
When: 2nd year, modules 1 and 2
Mode of studies: distance learning
Online hours: 20
Open to: students of all HSE University campuses
Language: English
ECTS credits: 4
Contact hours: 56

Course Syllabus

Abstract

Python is an interpreted, high-level, general-purpose programming language with a set of powerful libraries for data analysis. It is simple enough for beginners to learn, yet powerful enough for writing large applications. This two-module course is an introduction to the Python programming language and data science. The time needed to complete the course depends on the student's background. Students are expected to have mathematical skills at the high-school level. Academic performance is evaluated through programming assignments (homework and classwork), one mid-term exam, and a final exam. The examples and problems used in this course cover areas such as text processing, HTML, and data analytics. The course does not include lectures; students must complete the corresponding week of the recommended online course before each seminar class.
Learning Objectives

  • Teach students how to create basic scripts; understand data types, statements, and logical expressions; write their own functions; and use libraries.
  • Collect, store, process, and analyze data automatically using scripting languages.
  • Identify the data needed to address financial and business objectives.
Expected Learning Outcomes

  • Student can create scripts for data analysis.
  • Student can explain the basic principles of the Python programming language.
  • Student can read and understand simple scripts.
Course Contents

  • Basics of Python programming
  • Boolean data type and IF conditions
  • WHILE loops
  • Lists and FOR loops
  • Dictionaries and Methods
  • Nested data structures. Sorting
  • Functions
  • Text files, tables, JSON
  • Scraping: collecting links from a website
  • Additional chapters: re (regular expressions)
  • Additional chapters: pandas
Assessment Elements

  • non-blocking Graded Seminar
    The Graded Seminar cannot be retaken, regardless of the reason for absence. Its tasks cover all topics from the seminars, with a particular focus on text file manipulation, web scraping, and regular expressions. The Graded Seminar is held during the seminar class in offline mode, and its tasks may be completed in groups. The assignment has its own deadline; late submissions are not accepted or graded, and the assignment cannot be retaken. The final grade is calculated on a relative scale (based on the solutions of all course participants) and cannot exceed 10. All team members receive the same grade.
  • non-blocking Homework
    Homework cannot be retaken, regardless of the reason for absence. The maximum grade for the Homework is 10, including tasks designed to assess outstanding student performance.
  • non-blocking Mid-term
    The Mid-term covers all Syllabus topics from the first module. It lasts 60 minutes. The maximum grade is 10.
  • non-blocking Exam
    The Exam covers all topics from the Syllabus and lasts 60 minutes. The maximum grade for the Exam is 10. The Exam is paper-based and open-book: only printed materials are allowed, and any use of electronic materials or devices is prohibited.
Interim Assessment

  • 2023/2024 2nd module
    0.4 * Exam + 0.15 * Graded Seminar + 0.25 * Homework + 0.2 * Mid-term
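
The weighted formula above can be checked with a short Python sketch. The weights come from the formula itself; the function name `interim_grade`, the dictionary layout, and the sample grades are hypothetical and used only for illustration. The syllabus does not state a rounding rule, so the function returns the unrounded weighted sum.

```python
# Weights copied from the Interim Assessment formula in this syllabus.
WEIGHTS = {
    "Exam": 0.4,
    "Graded Seminar": 0.15,
    "Homework": 0.25,
    "Mid-term": 0.2,
}


def interim_grade(grades):
    """Return the weighted sum of the four assessment elements.

    `grades` maps each element name to a grade on the 0-10 scale.
    No rounding is applied, because the syllabus does not specify one.
    """
    missing = set(WEIGHTS) - set(grades)
    if missing:
        raise ValueError(f"missing grades for: {sorted(missing)}")
    return sum(WEIGHTS[name] * grades[name] for name in WEIGHTS)


# Hypothetical grades, for illustration only.
example = {"Exam": 8, "Graded Seminar": 10, "Homework": 6, "Mid-term": 7}

# Rounded here for display only: 3.2 + 1.5 + 1.5 + 1.4
print(round(interim_grade(example), 2))  # prints 7.6
```

Because the four weights sum to 1, the result stays on the same 0-10 scale as the individual elements.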
Bibliography

Recommended Core Bibliography

  • Vanderplas, Jacob T. Python Data Science Handbook: Essential Tools for Working with Data. O'Reilly Media, 2016. ISBN 9781491912140. https://search.ebscohost.com/login.aspx?direct=true&db=nlebk&AN=1425081

Recommended Additional Bibliography

  • Romano, Fabrizio. Learning Python. Packt Publishing, 2015. ISBN 9781785284571. http://search.ebscohost.com/login.aspx?direct=true&db=nlebk&AN=1133614

Authors

  • TERNIKOV Andrei ALEKSANDROVICH