Master's programme
2020/2021
Data and Analytics in Finance
Best course by the criterion "Usefulness of the course for your future career"
Best course by the criterion "Usefulness of the course for broadening your horizons and well-rounded development"
Best course by the criterion "Novelty of the knowledge gained"
Status:
Compulsory course (Financial Strategies and Analytics)
Field of study:
38.04.08. Finance and Credit
Delivered by:
Department of Economics and Finance
Where:
Faculty of Economics, Management and Business Informatics
When:
Year 1, modules 3 and 4
Mode of study:
without an online course
Degree programme:
Financial Strategies and Analytics
Language:
English
Credits:
6
Contact hours:
72
Course Syllabus
Abstract
The course aims to give students a basic understanding of data analytics and machine learning concepts as they apply to finance, and of how to implement these concepts in software in order to provide organizations with data-driven solutions. The course begins with the essentials of data collection and wrangling. The aim of this part is to teach students how to find, parse, import, manipulate and visualize financial data. The next part of the course develops research and analytical skills and covers methods such as principal component analysis, clustering, several curve-fitting techniques and LASSO regression. The final part of the course shows how machine learning methods can be applied to finance, using fraud detection as the example. The course is based on real data from open sources, data on Russian and European public companies collected by the International Laboratory of Intangible-driven Economy (NRU HSE), and data on sales and customer analytics provided by the GAMES laboratory (NRU HSE). After completing the course, students will be able to use data management techniques, optimise an asset portfolio, perform customer analytics and detect fraud.
Learning Objectives
- Work confidently in R: import data, perform basic manipulations to prepare it for calculations, and export the results of those calculations.
- Apply methods of data analysis and understand their objectives.
- Understand the limitations and relevance of the methods.
Expected Learning Outcomes
- Apply skills in data cleaning.
- Demonstrate the ability to work in different software environments for data analysis and to explain the choice of software.
- Understand the basic theories behind the analysis of financial data; design and write code for a particular task in financial data analysis.
- Master the ability to make decisions on the basis of data analysis and to justify them.
- Make financial decisions on the basis of data analysis and justify them.
Course Contents
- Data wrangling with R
  1. Introduction to R: Data Structures; Subsetting; Functions; Vectorization.
  2. Data Wrangling: Tidy Data; Reshape; Summarize.
  3. Data Visualization: Base Graphics; Grammar of Graphics; Interactive Graphics.
- Optimization problems on financial data
  4. Principal component analysis and clustering. Main objectives of principal component analysis (PCA). Mathematical model of component discovery. Algorithms for implementing PCA. Latent variables, criteria for choosing the number of components. Rotation, interpretation of the results. Main objectives of clustering, geometrical interpretation. Measures of distance between objects and measures of distance between clusters. k-means and k-median clustering: objective, algorithm, interpretation of results. Criteria for choosing the number of clusters and assessing the quality of clustering. Method implementation for the case study “Customer analytics in banks”.
  5. Curve fitting. Main objective of curve fitting and the financial problems it can help to solve. Interpolation and extrapolation. Different types of curve fitting: polynomial and spline interpolation (local polynomial fitting). Procedure for estimating a fitted curve. Method implementation for the case study “Fitting yield curve”.
  6. Portfolio optimization on data. Optimal portfolio of two risky assets: theoretical model. Model solution as the solution of a quadratic programming problem. Sensitivity to model inputs. Optimal portfolio problem in p dimensions and the LASSO technique for dealing with it. Method implementation for the case study “Constructing a portfolio from stock trading data”.
- Fraud detection using machine learning
  7. Introduction to fraud detection and data preprocessing. Importance of fraud detection. Definition and types of fraud. Types of variables. Data exploration and visualization. Dealing with missing values. Standardizing and transforming data.
  8. Featurization, social network analysis and dealing with imbalanced datasets. Traditional features for fraud detection. Social network analysis. Random oversampling (ROS) and random undersampling (RUS). Synthetic Minority Over-sampling Technique (SMOTE).
  9. Supervised and unsupervised techniques for fraud detection. Linear and logistic regression. Decision trees and ensemble methods. Evaluating fraud detection models. Digit analysis using Benford’s Law. Multivariate outlier detection using robust statistics.
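As an illustration of the last topic's digit analysis, the following is a minimal sketch of a first-digit check against Benford's Law, under which the leading digit d occurs with probability log10(1 + 1/d). The course itself works in R; Python is used here only for illustration. The function names and the sample data (powers of 2, which are known to follow the law closely) are assumptions and are not taken from the course materials.

```python
import math
from collections import Counter


def benford_expected():
    """Expected share of each leading digit d = 1..9: log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(x):
    """First significant digit of a non-zero number."""
    # Scientific notation always starts with the first significant digit.
    return int(f"{abs(x):.15e}"[0])


def first_digit_shares(amounts):
    """Observed share of each leading digit, ignoring zeros."""
    digits = [leading_digit(a) for a in amounts if a != 0]
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}


def chi_square_stat(amounts):
    """Pearson chi-square statistic of observed vs. Benford counts."""
    digits = [leading_digit(a) for a in amounts if a != 0]
    counts = Counter(digits)
    n = len(digits)
    expected = benford_expected()
    return sum(
        (counts.get(d, 0) - n * expected[d]) ** 2 / (n * expected[d])
        for d in range(1, 10)
    )


# Hypothetical sample: powers of 2 stand in for transaction amounts.
sample = [2 ** k for k in range(1, 101)]
observed = first_digit_shares(sample)
stat = chi_square_stat(sample)

for d in range(1, 10):
    print(d, round(observed[d], 3), round(benford_expected()[d], 3))
print("chi-square statistic:", round(stat, 3))
```

The statistic has 8 degrees of freedom, so a value above roughly 15.51 would reject conformity with Benford's Law at the 5% level. A large deviation is a reason to look more closely at the data, not proof of fraud.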
Interim Assessment
- Interim assessment (module 4): 0.4 * Exam + 0.15 * Students’ self-study work + 0.15 * Seminar activities + 0.15 * Test 1 + 0.15 * Test 2
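The formula above is a weighted sum whose weights add up to 1. A minimal sketch of the arithmetic follows; the component scores are hypothetical, and the grading scale and any rounding rule are not stated in the syllabus, so the result is rounded only for display.

```python
# Weights copied from the interim assessment formula.
WEIGHTS = {
    "exam": 0.40,
    "self_study": 0.15,
    "seminar_activities": 0.15,
    "test_1": 0.15,
    "test_2": 0.15,
}


def interim_grade(scores):
    """Return the unrounded weighted sum of the component scores."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical component scores, for illustration only.
example_scores = {
    "exam": 8,
    "self_study": 6,
    "seminar_activities": 10,
    "test_1": 6,
    "test_2": 8,
}

# 0.4 * 8 + 0.15 * (6 + 10 + 6 + 8) = 3.2 + 4.5
print(round(interim_grade(example_scores), 2))  # 7.7
```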
Bibliography
Recommended Core Bibliography
- Provost, F., & Fawcett, T. (2013). Data Science for Business: What You Need to Know About Data Mining and Data-Analytic Thinking (1st ed.). Beijing: O’Reilly Media. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=619895
Recommended Additional Bibliography
- Tsay, R. S. (2013). An Introduction to Analysis of Financial Data with R. Wiley.