2023/2024

Basics of Data Analysis

Type: Mago-Lego
When: modules 1 and 2
Open to: students of one campus
Instructors: Boris Mirkin
Language: English
ECTS credits: 6
Contact hours: 54

Course Syllabus

Abstract

Data analysis aims to help the user enhance and augment knowledge of a domain, as represented by concepts and statements of relations between them. This view distinguishes the class from related subjects such as applied statistics, machine learning, and data mining. There are two main pathways for knowledge discovery: (1) summarization, for developing and augmenting concepts, and (2) correlation, for enhancing and establishing relations between concepts. The term summarization is understood quite broadly here: it embraces not only simple summaries such as totals and means, but also more complex ones, such as the principal components of a set of features and cluster structures in a set of entities. Similarly, correlation here covers both bivariate and multivariate relations between input and target features, including regression, classification trees, and Bayesian classifiers. Another feature of the class is that its main thrust is an in-depth presentation of a few basic techniques and their properties, rather than coverage of the broad spectrum of approaches developed so far. This allows me to bring forward a number of mathematically derived interpretation tools and relations between methods that are usually overlooked.