Master
2021/2022
Big Data Systems Research Seminar "Latest Trends in Data Governance, Big Data Analytics & Data Architecture"
Category 'Best Course for Broadening Horizons and Diversity of Knowledge and Skills'
Category 'Best Course for New Knowledge and Skills'
Type:
Compulsory course (Business Analytics and Big Data Systems)
Area of studies:
Business Informatics
Delivered by:
Department of Business Informatics
Where:
Graduate School of Business
When:
Year 1, modules 2 and 3
Mode of studies:
offline
Open to:
students of one campus
Instructors:
Natalya Khapayeva
Master’s programme:
Business Analytics and Big Data Systems
Language:
English
ECTS credits:
4
Contact hours:
48
Course Syllabus
Abstract
The key objective of this course is to familiarize students with the most important big data concepts and to introduce modern approaches to creating data products. We will examine what a data product is and the technology base that makes it possible to build one. The course covers four topics: Big Data Ecosystem; (Big) Data Management; Data Products and Economics; Data Culture and Ethics. Students will gain the ability to initiate and design data products and to understand the business, ethics, governance, and sustainability challenges relating to Big Data. Most lectures will be presented using Python and SQL examples; some will use Java and/or Scala.
Learning Objectives
- This course gives you insight into how big data technologies affect business.
Expected Learning Outcomes
- Define key concepts and identify technologies in the field of Big Data
- Describe the ethics, governance, and sustainability challenges relating to Big Data
- Design and evaluate an infrastructure architecture for Big Data products based on particular needs, including selecting an appropriate set of technologies and a governance strategy for storing and processing data
- Discuss the impact of digitization and the adoption of Big Data in business and overall society
- Explain the challenges of creating and maintaining Big Data products
Course Contents
- Big Data Ecosystem
- (Big) Data Management
- Data Products and Economics
- Data Culture and Ethics
Assessment Elements
- Multichoice test + written problems
  Examination format: the exam is taken in writing.
  Platform: the exam is taken on the Google Forms and MS Teams platforms. Students are required to join the session 15 minutes before it begins.
  Requirements: check your computer for compliance with the technical requirements no later than 7 days before the exam; use your corporate account (@edu.hse.ru) to check in to the test form; check your microphone, speakers or headphones, webcam, and Internet connection (we recommend connecting your computer to the network with a cable, if possible); prepare the necessary writing materials, such as pens, pencils, and paper. If any of the requirements for participation in the exam cannot be met, the student must inform the professor and the programme manager 2 weeks before the exam date so that a decision can be made on the student's participation.
  Students are not allowed to: turn off the video camera; leave the place where the exam is taken (move beyond the camera's viewing angle); involve outsiders for help or talk to outsiders during the exam; read tasks out loud; interact with other students.
  Students are allowed to: write on paper and use a pen for notes and calculations; use notes and textbooks, e.g. in digital form; turn on the microphone to answer the teacher's questions; ask the teacher for additional information needed to understand the exam task.
  Connection failures: a short-term communication failure is the loss of a student's network connection with the MS Teams and Google Forms platforms for no longer than 1 minute. A long-term communication failure is the loss of that connection for longer than 1 minute. A student cannot continue the exam after a long-term communication failure. In that case the student must notify the teacher, record evidence of the loss of connection (a screenshot or a response from the Internet provider), and then contact the programme manager with an explanatory note about the incident so that a decision can be made on retaking the exam. The retake procedure is similar to the exam procedure.
- Course project
  One course project: create and present a data product.
- Homeworks
  Two homeworks: research on a given topic or technology, and an implementation assignment using Spark.
Interim Assessment
- 2021/2022 3rd module
  0.5 * Course project + 0.2 * (Multichoice test + written problems) + 0.3 * Homeworks
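As an illustration of how the interim assessment formula combines the three elements, here is a minimal Python sketch. The weights come from the formula above; the function name `interim_grade`, the dictionary keys, the sample scores, and the assumption that all three elements are graded on the same scale (for example 0 to 10) are hypothetical and not part of the syllabus. Any rounding rule applied to the final grade is not stated here, so the function returns the unrounded weighted sum.

```python
# Weights from the interim assessment formula:
# 0.5 * Course project + 0.2 * (Multichoice test + written problems) + 0.3 * Homeworks
WEIGHTS = {
    "course_project": 0.5,
    "test": 0.2,  # multichoice test + written problems, graded as one element
    "homeworks": 0.3,
}


def interim_grade(scores):
    """Return the weighted sum of the three assessment elements.

    Assumes every score uses the same grading scale (an assumption,
    not something the syllabus specifies).
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example = {"course_project": 8, "test": 6, "homeworks": 7}
print(f"{interim_grade(example):.2f}")  # 0.5*8 + 0.2*6 + 0.3*7 -> 7.30
```

Because the course project carries half of the weight, a one-point change in its score moves the result by 0.5, compared with 0.2 for the test and 0.3 for the homeworks.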
Bibliography
Recommended Core Bibliography
- Malaska, T., & Seidman, J. (2018). Foundations for Architecting Data Solutions: Managing Successful Data Projects (1st ed.). O’Reilly Media.
- Thomas Erl, Wajid Khattak, & Paul Buhler. (2016). Big Data Fundamentals : Concepts, Drivers & Techniques. Prentice Hall.
Recommended Additional Bibliography
- Jules S. Damji, Brooke Wenig, Tathagata Das, & Denny Lee. (2020). Learning Spark. O’Reilly Media.
- Kleppmann, M. (2017). Designing Data-Intensive Applications : The Big Ideas Behind Reliable, Scalable, and Maintainable Systems. Sebastopol, CA: O’Reilly Media. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1487643
- Mark Richards, & Neal Ford. (2019). Fundamentals of Software Architecture : An Engineering Approach. O’Reilly Media.