Machine Learning in Bioinformatics

Master 2024/2025

Category 'Best Course for Career Development'

Category 'Best Course for Broadening Horizons and Diversity of Knowledge and Skills'

Type: Elective course (Data Analysis in Biology and Medicine)

Area of studies: Applied Mathematics and Informatics

Delivered by: Big Data and Information Retrieval School

Where: Faculty of Computer Science

When: year 1, modules 1 and 2

Mode of studies: distance learning

Online hours: 14

Open to: students of one campus

Instructors: Kirill Alekseev, Maria Poptsova

Master’s programme: Data Analysis for Biology and Medicine

Language: English

ECTS credits: 6


Abstract

The course introduces the theory and practice of machine learning algorithms and their applications in bioinformatics. Students will learn data preprocessing techniques, dimension reduction methods, modeling with machine learning algorithms, and parameter tuning. The algorithms studied include regularized linear regression (ridge regression, lasso, elastic net), multivariate adaptive regression splines, support vector machines, neural networks, k-nearest neighbors, classification and regression trees, random forest, and gradient boosting. Workshops follow the lectures and give students practical skills with predictive modeling software tools, packages, and applications. Many case studies of predictive models for bioinformatics data sets will be considered.

Learning Objectives

To know the theory of predictive modeling: its process and components, the types of predictive models, and the key steps of model creation, such as data preprocessing, model construction, and assessment of model performance
To know various practical applications of predictive modeling with machine learning algorithms to molecular biology databases
To acquire the skills to use Python functions from different Python packages to apply different types of models, such as linear and nonlinear regression models, linear and nonlinear classification models, regression trees, and rule-based models
To acquire the skills to use Python functions from different Python packages to preprocess the input data, i.e. calculate statistics, estimate skewness, apply an appropriate transformation, perform PCA, find between-predictor correlations, and generate dummy variables
To acquire the skills to use Python functions to measure predictor importance and model performance, use filtering methods, and measure outcome error
To apply the knowledge and tools of predictive analytics to bioinformatics applications
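
As a rough illustration of the preprocessing objective above (estimate skewness, then apply an appropriate transformation), here is a minimal sketch. It uses only the Python standard library so that it stays self-contained; the syllabus does not name the packages used in the workshops. The function name `sample_skewness` and the `expression` values are hypothetical and exist only for this example.

```python
import math
import statistics


def sample_skewness(values):
    """Return the moment-based skewness m3 / m2**1.5 of a sequence of numbers."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical right-skewed measurements (made up for illustration only).
expression = [1.0, 2.0, 2.0, 3.0, 4.0, 8.0, 50.0, 200.0]

# log(1 + x) is one common transformation for right-skewed, non-negative data.
log_expression = [math.log1p(v) for v in expression]

raw_skew = sample_skewness(expression)
log_skew = sample_skewness(log_expression)

print(f"skewness before transformation: {raw_skew:.2f}")
print(f"skewness after log1p:           {log_skew:.2f}")
```

A skewness well above zero indicates a long right tail; comparing the value before and after the transformation shows whether the transformation made the distribution more symmetric.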

Expected Learning Outcomes

Apply the knowledge and tools of predictive analytics to real-life applications
Acquire the skills to implement machine learning algorithms in Python
Know the theory of machine learning algorithms

Course Contents

Big Data in Bioinformatics. Concepts of model building.
Data Preprocessing.
Linear regression models.
Multivariate adaptive regression splines.
Neural networks.
Support vector machines. K-nearest neighbors.
Measuring performance in classification models.
Linear classification models.
Nonlinear classification models.
Decision trees.
Machine learning in bioinformatics.

Assessment Elements

Home assignment 1
Home assignment 2
Home assignment 3
Home assignment 4
Exam

Interim Assessment

2024/2025 2nd module
0.4 * Exam + 0.15 * Home assignment 1 + 0.15 * Home assignment 2 + 0.15 * Home assignment 3 + 0.15 * Home assignment 4
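
The weighted sum above can be sketched in Python as follows. The weights come from the formula in this syllabus; the function name `interim_grade` and the example scores are hypothetical, and the grading scale is not stated here, so the scores are only placeholders.

```python
# Weights taken from the Interim Assessment formula in this syllabus.
WEIGHTS = {
    "Exam": 0.4,
    "Home assignment 1": 0.15,
    "Home assignment 2": 0.15,
    "Home assignment 3": 0.15,
    "Home assignment 4": 0.15,
}


def interim_grade(scores):
    """Return the weighted sum of the scores for all assessment elements."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example_scores = {
    "Exam": 8,
    "Home assignment 1": 10,
    "Home assignment 2": 6,
    "Home assignment 3": 8,
    "Home assignment 4": 8,
}

grade = interim_grade(example_scores)
print(f"interim grade: {grade:.2f}")
```

The exam carries 40% of the grade and the four home assignments together carry the remaining 60%.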

Bibliography

Recommended Core Bibliography

Machine learning : a probabilistic perspective, Murphy, K. P., 2012

Recommended Additional Bibliography

Data mining : practical machine learning tools and techniques, Witten, I. H., 2011
Machine learning : the art and science of algorithms that make sense of data, Flach, P., 2014
Data mining : practical machine learning tools and techniques (4th ed.), Witten, I. H., Frank, E., Hall, M. A., & Pal, C. J., 2017. Cambridge, MA: Morgan Kaufmann. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1214611

Authors

YAKOVLEVA ILONA ALEKSANDROVNA
