Introduction to Big Data

Master's programme 2021/2022

Status: Elective course (International Management / Master in International Management)

Field of study: 38.04.02 Management

Delivered by: Department of Business Informatics

Where taught: Graduate School of Business

When taught: Year 1, module 4

Mode of study: with an online course

Online hours: 2

Open to: students of the home campus

Instructor: Leonid Sergeevich Smelov (Смелов Леонид Сергеевич)

Programme: International Management

Language: English

Credits: 3

Contact hours: 2

Abstract

Program: International Management
Links: https://www.coursera.org/learn/big-data-introduction?specialization=big-data https://www.coursera.org/learn/big-data-management?specialization=big-data
Semester: 2
Level: Graduate
Year: 1
Study mode: MOOC
Type of course: Elective
ECTS: 3

Prerequisites: This course is for those new to data science. No prior programming experience is needed, although the ability to install applications and use a virtual machine is necessary to complete the hands-on assignments.

Learning outcomes:
• to be able to describe the Big Data landscape, with examples of real-world big data problems and the three key sources of Big Data: people, organizations, and sensors
• to be able to explain the V’s of Big Data (volume, velocity, variety, veracity, valence, and value) and why each impacts data collection, monitoring, storage, analysis, and reporting
• to be able to get value out of Big Data by using a 5-step process to structure your analysis
• to be able to identify the frequent data operations required for various types of data
• to be able to select a data model to suit the characteristics of your data
• to be able to apply techniques to handle streaming data
• to be able to differentiate between a traditional Database Management System and a Big Data Management System

Contents: This course provides an introduction to one of the common frameworks, Hadoop. It also provides guided hands-on tutorials that introduce students to systems and tools such as AsterixDB, HP Vertica, Impala, Neo4j, Redis, and SparkSQL. The course presents techniques for extracting value from existing untapped data sources and for discovering new data sources. It covers the following topics:
• Big Data Introduction
• Big Data Modeling and Management Systems

Learning Objectives

• to be able to describe the Big Data landscape, with examples of real-world big data problems and the three key sources of Big Data: people, organizations, and sensors
• to be able to explain the V’s of Big Data (volume, velocity, variety, veracity, valence, and value) and why each impacts data collection, monitoring, storage, analysis, and reporting
• to be able to get value out of Big Data by using a 5-step process to structure your analysis
• to be able to identify the frequent data operations required for various types of data
• to be able to select a data model to suit the characteristics of your data
• to be able to apply techniques to handle streaming data
• to be able to differentiate between a traditional Database Management System and a Big Data Management System

Expected Learning Outcomes

At the end of this course, you will be able to:
* Describe the Big Data landscape, with examples of real-world big data problems and the three key sources of Big Data: people, organizations, and sensors.
* Explain the V’s of Big Data (volume, velocity, variety, veracity, valence, and value) and why each impacts data collection, monitoring, storage, analysis, and reporting.
* Get value out of Big Data by using a 5-step process to structure your analysis.
* Identify what are and what are not big data problems, and recast big data problems as data science questions.
* Explain the architectural components and programming models used for scalable big data analysis.
* Summarize the features and value of core Hadoop stack components, including the YARN resource and job management system, the HDFS file system, and the MapReduce programming model.
* Install and run a program using Hadoop.
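
For readers unfamiliar with the MapReduce programming model named in these outcomes, the idea can be sketched in a few lines of plain Python. This is only an in-memory illustration of the map, shuffle, and reduce steps applied to a word count; it does not use Hadoop or any Hadoop API, and the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical and not taken from the course materials.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values collected for one key into a count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence on a list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["big data is big", "data is everywhere"])
print(counts)
```

In a real Hadoop job the map and reduce steps run in parallel on many machines and the framework performs the shuffle; the sketch above only shows how the three steps fit together.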

Course Contents

Big Data: Why and Where
Characteristics of Big Data and Dimensions of Scalability
Data Science: Getting Value out of Big Data
Foundations for Big Data Systems and Programming
Systems: Getting Started with Hadoop

Assessment Elements

Exam upon completion of the online course
Online tests on the Coursera platform

Interim Assessment

2021/2022 4th module
0.7 * Exam upon completion of the online course + 0.3 * Online tests on the Coursera platform
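
As a worked illustration of the weighting above, the following minimal sketch computes the interim grade from the two assessment elements. The function name and the sample scores are hypothetical; the sketch assumes both scores are on the same scale and applies no rounding, since the syllabus states no rounding rule.

```python
def interim_grade(exam: float, online_tests: float) -> float:
    """Weighted interim grade: 0.7 * exam + 0.3 * online tests."""
    return 0.7 * exam + 0.3 * online_tests


# Hypothetical scores: 8 for the exam, 6 for the online tests.
print(round(interim_grade(8, 6), 2))  # 7.4
```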

Bibliography

Recommended Core Bibliography

Grable, J. E., & Lyons, A. C. (2018). An Introduction to Big Data. Journal of Financial Service Professionals, 72(5), 17–20. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=bsu&AN=131378067
Sitalakshmi Venkatraman, & Ramanathan Venkatraman. (2019). Big data security challenges and strategies. https://doi.org/10.3934/math.2019.3.860
Valentine, C. (2014). Hadoop: 94 Most Asked Questions - What You Need to Know. Emereo Publishing.
Wenbing Zhao, Longxiang Gao, & Anfeng Liu. (2018). Programming Foundations for Scientific Big Data Analytics. https://doi.org/10.1155/2018/2707604
White, T. (2011). Hadoop: The Definitive Guide (2nd ed., updated). Yahoo Press.

Recommended Additional Bibliography

Brajesh Mishra. (2020). Big Data Analysis Using Hadoop Map Reduce. https://doi.org/10.26562/irjcs.2020.v0705.005
Laurent Thiry, Heng Zhao, & Michel Hassenforder. (2018). Categories for (Big) Data models and optimization. https://doi.org/10.1186/s40537-018-0132-9
UI AHSAAN, S., & MOURYA, A. K. (2019). Big Data Analytics: Challenges and Technologies. Annals of the Faculty of Engineering Hunedoara - International Journal of Engineering, 17(4), 75–79.
Wu, C. (2019). CS 644-101: Introduction to Big Data.
Zhenlong Li, Wenwu Tang, Qunying Huang, Eric Shook, & Qingfeng Guan. (2020). Introduction to Big Data Computing for Geospatial Applications. ISPRS International Journal of Geo-Information, 9(8), 487. https://doi.org/10.3390/ijgi9080487
