Master's program
2021/2022
Introduction to Big Data
Status:
Elective course (Master in International Management)
Field of study:
38.04.02. Management
Delivered by:
Department of Business Informatics
Where:
Graduate School of Business
When:
Year 1, module 4
Mode of study:
with an online course
Online hours:
2
Open to:
students of the home campus
Instructors:
Leonid Sergeevich Smelov
Program:
International Management
Language:
English
Credits:
3
Contact hours:
2
Course Syllabus
Abstract
Program: International Management
Links:
- https://www.coursera.org/learn/big-data-introduction?specialization=big-data
- https://www.coursera.org/learn/big-data-management?specialization=big-data
Semester: 2
Level: Graduate
Year: 1
Study mode: MOOC
Type of course: Elective
ECTS: 3
Prerequisites: This course is for those new to data science. No prior programming experience is needed, although the ability to install applications and use a virtual machine is necessary to complete the hands-on assignments.
Learning outcomes:
- to be able to describe the Big Data landscape, with examples of real-world big data problems and the three key sources of Big Data: people, organizations, and sensors
- to be able to explain the V's of Big Data (volume, velocity, variety, veracity, valence, and value) and why each affects data collection, monitoring, storage, analysis, and reporting
- to be able to get value out of Big Data by using a 5-step process to structure your analysis
- to be able to identify the frequent data operations required for various types of data
- to be able to select a data model to suit the characteristics of your data
- to be able to apply techniques to handle streaming data
- to be able to differentiate between a traditional Database Management System and a Big Data Management System
Contents: This course provides an introduction to one of the common frameworks, Hadoop. It also provides guided hands-on tutorials that introduce students to systems and tools such as AsterixDB, HP Vertica, Impala, Neo4j, Redis, and SparkSQL. The course covers techniques for extracting value from existing untapped data sources and for discovering new data sources. It covers the following topics:
- Big Data Introduction
- Big Data Modeling and Management Systems
Learning Objectives
- To be able to describe the Big Data landscape, with examples of real-world big data problems and the three key sources of Big Data: people, organizations, and sensors.
- To be able to explain the V's of Big Data (volume, velocity, variety, veracity, valence, and value) and why each affects data collection, monitoring, storage, analysis, and reporting.
- To be able to get value out of Big Data by using a 5-step process to structure your analysis.
- To be able to identify the frequent data operations required for various types of data.
- To be able to select a data model to suit the characteristics of your data.
- To be able to apply techniques to handle streaming data.
- To be able to differentiate between a traditional Database Management System and a Big Data Management System.
Expected Learning Outcomes
At the end of this course, you will be able to:
- Describe the Big Data landscape, with examples of real-world big data problems and the three key sources of Big Data: people, organizations, and sensors.
- Get value out of Big Data by using a 5-step process to structure your analysis.
- Explain the V's of Big Data (volume, velocity, variety, veracity, valence, and value) and why each affects data collection, monitoring, storage, analysis, and reporting.
- Identify what is and what is not a big data problem, and recast big data problems as data science questions.
- Explain the architectural components and programming models used for scalable big data analysis.
- Summarize the features and value of core Hadoop stack components, including the YARN resource and job management system, the HDFS file system, and the MapReduce programming model.
- Install and run a program using Hadoop.
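The MapReduce programming model named in the outcomes above can be illustrated with a minimal word-count sketch. It runs in plain Python on a single machine and does not use Hadoop; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration, not taken from the course materials or from any Hadoop API.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by key.

    In a real framework this grouping happens between map and reduce;
    here it is simulated in memory (an assumption of this sketch).
    """
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "Big Data systems"]))
```

In a distributed setting the map and reduce steps would run in parallel on different nodes; the sketch only shows how the work is split into the two phases.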
Course Contents
- Big Data: Why and Where
- Characteristics of Big Data and Dimensions of Scalability
- Data Science: Getting Value out of Big Data
- Foundations for Big Data Systems and Programming
- Systems: Getting Started with Hadoop
Interim Assessment
- 2021/2022, 4th module: 0.7 * exam upon finishing this online course + 0.3 * online tests on the Coursera platform
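As a rough sketch of the weighting above, the interim grade can be computed as a weighted sum. The function name `interim_grade` is hypothetical, and the sketch assumes both components are scored on the same scale; the syllabus states neither the scale nor a rounding rule, so none is applied.

```python
EXAM_WEIGHT = 0.7          # weight of the exam taken after the online course
ONLINE_TESTS_WEIGHT = 0.3  # weight of the online tests on Coursera


def interim_grade(exam, online_tests):
    """Weighted sum of the two components, with no rounding applied.

    Assumes both scores use the same grading scale (not stated in the syllabus).
    """
    return EXAM_WEIGHT * exam + ONLINE_TESTS_WEIGHT * online_tests


if __name__ == "__main__":
    # Hypothetical scores, chosen only to show the arithmetic.
    print(interim_grade(exam=8, online_tests=6))
```

With the hypothetical scores in the example, the arithmetic is 0.7 * 8 + 0.3 * 6 = 7.4, up to floating-point rounding.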
Bibliography
Recommended Core Bibliography
- Grable, J. E., & Lyons, A. C. (2018). An Introduction to Big Data. Journal of Financial Service Professionals, 72(5), 17–20. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=bsu&AN=131378067
- Sitalakshmi Venkatraman, & Ramanathan Venkatraman. (2019). Big data security challenges and strategies. https://doi.org/10.3934/math.2019.3.860
- Valentine, C. (2014). Hadoop: 94 Most Asked Questions - What You Need to Know. Emereo Publishing.
- Wenbing Zhao, Longxiang Gao, & Anfeng Liu. (2018). Programming Foundations for Scientific Big Data Analytics. https://doi.org/10.1155/2018/2707604
- White, T. (2011). Hadoop: The Definitive Guide (2nd ed., updated). Yahoo Press.
Recommended Additional Bibliography
- Brajesh Mishra. (2020). Big Data Analysis Using Hadoop Map Reduce. https://doi.org/10.26562/irjcs.2020.v0705.005
- Laurent Thiry, Heng Zhao, & Michel Hassenforder. (2018). Categories for (Big) Data models and optimization. https://doi.org/10.1186/s40537-018-0132-9
- Ul Ahsaan, S., & Mourya, A. K. (2019). Big Data Analytics: Challenges and Technologies. Annals of the Faculty of Engineering Hunedoara - International Journal of Engineering, 17(4), 75–79.
- Wu, C. (2019). CS 644-101: Introduction to Big Data.
- Zhenlong Li, Wenwu Tang, Qunying Huang, Eric Shook, & Qingfeng Guan. (2020). Introduction to Big Data Computing for Geospatial Applications. ISPRS International Journal of Geo-Information, 9(8), 487. https://doi.org/10.3390/ijgi9080487