
Master's Programme "Data Analytics and Social Statistics"

Data Mining

Academic year: 2020/2021
Language of instruction: English
Credits: 6
Status: Compulsory course
When taught: Year 1, modules 1 and 2

Course Syllabus

Abstract

Covers topics in data mining, including visualization techniques, elements of machine learning theory, classification and regression trees, Generalized Linear Models, Spline approach, and other related topics.
Learning Objectives

  • The course gives students an important foundation to develop and conduct their own research as well as to evaluate research of others.
Expected Learning Outcomes

  • Know well-known sequential pattern mining methods, such as GSP, SPADE, PrefixSpan, and CloSpan.
  • Know various pattern mining applications, such as mining spatiotemporal and trajectory patterns and mining quality phrases.
  • Know efficient pattern mining methods, such as Apriori, ECLAT, and FP-growth.
  • Know constraint-based pattern mining, including methods for pushing different kinds of constraints, such as data and pattern-based constraints, anti-monotone, monotone, succinct, convertible, and multiple constraints.
  • Be able to recall important pattern discovery concepts, methods, and applications, in particular, the basic concepts of pattern discovery, such as frequent pattern, closed pattern, max-pattern, and association rules.
  • Be able to compare pattern evaluation measures, especially several commonly used ones, such as lift, chi-square, cosine, Jaccard, and Kulczynski, and their comparative strengths.
  • Be able to compare methods for mining diverse patterns, including multi-level, multi-dimensional, and qualitative patterns.
  • Be able to compare methods for mining negative correlations, compressed and redundancy-aware top-k patterns, and long (colossal) patterns.
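
As an illustration of the pattern evaluation measures named in the outcomes above, the following minimal sketch computes support, lift, cosine, Jaccard, and Kulczynski for a pair of items. The function names and the toy transactions are hypothetical and are not taken from the course materials; the formulas are the standard textbook definitions based on relative support.

```python
from math import sqrt


def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def pattern_measures(transactions, a, b):
    """Interestingness measures for the item pair (a, b)."""
    s_a = support(transactions, {a})
    s_b = support(transactions, {b})
    s_ab = support(transactions, {a, b})
    return {
        "support": s_ab,
        "lift": s_ab / (s_a * s_b),            # 1.0 means independence
        "cosine": s_ab / sqrt(s_a * s_b),
        "jaccard": s_ab / (s_a + s_b - s_ab),
        "kulczynski": 0.5 * (s_ab / s_a + s_ab / s_b),
    }


# Hypothetical toy data: each transaction is a set of items.
transactions = [
    {"milk", "bread"},
    {"milk", "bread", "butter"},
    {"milk"},
    {"bread"},
    {"butter"},
]
measures = pattern_measures(transactions, "milk", "bread")
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Lift depends on the total number of transactions, while cosine, Jaccard, and Kulczynski do not, which is one reason the measures can disagree on the same data.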
Course Contents

  • Introduction
    Course Orientation; Pattern Discovery Overview; Pattern Discovery Basic Concepts; Efficient Pattern Mining Methods
  • Pattern evaluation
    The session sets up the framework for pattern evaluation and for mining diverse frequent patterns. It also addresses sequential pattern mining and pattern mining applications, including mining spatiotemporal and trajectory patterns.
  • Pattern mining I
    The session gives an overview of pattern-based mining, graph pattern mining, and pattern-based classification.
  • Pattern mining II
    This session builds an understanding of pattern mining applications (mining quality phrases from text data) and of advanced topics in pattern discovery.
  • Cluster analysis
    Cluster Analysis Overview; Cluster Analysis Introduction; Similarity Measures for Cluster Analysis
  • Clustering Methods I
    This session will continue the topic of clustering with Partitioning-Based Clustering Methods; Hierarchical Clustering Methods.
  • Clustering Methods II
    Hierarchical Clustering Methods (continued); Density-Based and Grid-Based Clustering Methods
  • Clustering Methods III
    This session will conclude clustering with methods for clustering validation.
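
To make the partitioning-based clustering topic above more concrete, here is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. It assumes 2-D points and caller-supplied initial centroids; the function name `kmeans` and the sample points are hypothetical and are not part of the course materials.

```python
def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm on 2-D points with caller-supplied initial centroids."""
    labels = []
    for _ in range(iterations):
        # Assignment step: label each point with the index of its nearest centroid.
        labels = [
            min(
                range(len(centroids)),
                key=lambda k: (p[0] - centroids[k][0]) ** 2
                + (p[1] - centroids[k][1]) ** 2,
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centroids.append(
                    (
                        sum(x for x, _ in members) / len(members),
                        sum(y for _, y in members) / len(members),
                    )
                )
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[k])
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids


# Hypothetical toy data: two visually separated groups of points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.0), (8.0, 8.0), (9.0, 9.0), (8.0, 9.5)]
labels, centroids = kmeans(points, centroids=[(1.0, 1.0), (8.0, 8.0)])
print(labels)
print(centroids)
```

The result depends on the initial centroids, so in practice the algorithm is usually run from several starting points and the partitions are compared with a validation measure.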
Assessment Elements

  • non-blocking Final take-home project
  • non-blocking Homework Assignments (5 x Varied points)
  • non-blocking In-Class Labs (9-10 x Varied points)
  • non-blocking Quizzes (Best 9 of 10, Varied points)
Interim Assessment

  • Interim assessment (module 2)
    0.5 * Final take-home project + 0.2 * Homework Assignments (5 x Varied points) + 0.2 * In-Class Labs (9-10 x Varied points) + 0.1 * Quizzes (Best 9 of 10, Varied points)
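
The weighted formula above can be checked with a short calculation. The sketch below assumes that every component has already been converted to one common scale (a hypothetical 0 to 10 scale is used in the example); the syllabus does not state how the varied points are normalized, and the names `WEIGHTS` and `interim_grade` are illustrative only.

```python
# Weights taken from the interim assessment formula in the syllabus.
WEIGHTS = {
    "project": 0.5,   # Final take-home project
    "homework": 0.2,  # Homework Assignments
    "labs": 0.2,      # In-Class Labs
    "quizzes": 0.1,   # Quizzes (best 9 of 10)
}


def interim_grade(scores):
    """Weighted sum of component scores that share one common scale."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical student scores on an assumed 0-10 scale.
example = {"project": 8, "homework": 7, "labs": 9, "quizzes": 10}
print(round(interim_grade(example), 2))
```

Because the project carries half of the weight, a one-point change in the project score moves the interim grade by 0.5 points, compared with 0.1 points for the quizzes.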
Bibliography

Recommended Core Bibliography

  • ElAtia, S., Ipperciel, D., & Zaiane, O. R. (2017). Data Mining and Learning Analytics: Applications in Educational Research. Hoboken, New Jersey: Wiley. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1351385
  • Han, J., & Kamber, M. (2011). Data Mining: Concepts and Techniques (3rd ed.). Burlington, MA: Morgan Kaufmann. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=377411
  • Larose, D. T., & Larose, C. D. (2015). Data Mining and Predictive Analytics. Hoboken, New Jersey: Wiley. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=958471
  • Mourya, S. K., & Gupta, S. (2013). Data Mining and Data Warehousing. [N.p.]: Alpha Science International Limited. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1688519

Recommended Additional Bibliography

  • Brown, M. S. (2014). Data Mining For Dummies. Hoboken: For Dummies. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=842663
  • Knobbe, A. J. (2006). Multi-relational Data Mining. Amsterdam: IOS Press. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=176061
  • Motoda, H. (2002). Active Mining: New Directions of Data Mining. Amsterdam: IOS Press. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=87558