Master's programme
2020/2021
Neurobayesian Models
Best course by the criterion "Usefulness of the course for broadening horizons and well-rounded development"
Best course by the criterion "Novelty of the knowledge gained"
Status:
Elective course (Mathematics of Machine Learning)
Field of study:
01.04.02 Applied Mathematics and Computer Science
Delivered at:
Faculty of Computer Science
Delivered in:
2nd year, module 3
Mode of study:
without an online course
Instructors:
Dmitry Vetrov
Degree programme:
Statistical Learning Theory
Language of instruction:
English
Credits:
6
Contact hours:
32
Course Syllabus
Abstract
This course is devoted to Bayesian reasoning applied to deep learning models. Attendees will learn how to use probabilistic modeling to construct neural generative and discriminative models, how to use the paradigm of generative adversarial networks to perform approximate Bayesian inference, and how to model uncertainty about the weights of neural networks. Selected open problems in the field of deep learning will also be discussed. The practical assignments cover the implementation of several modern Bayesian deep learning models.
Learning Objectives
- The learning objective of the course is to give students basic and advanced tools for inference and learning in complex probabilistic models involving deep neural networks, such as probabilistic deep generative models and Bayesian neural networks.
Expected Learning Outcomes
- Knowledge of different approximate inference and learning techniques for probabilistic models
- Hands-on experience with modern probabilistic modifications of deep learning models
- Knowledge of the building blocks needed to construct new probabilistic models suited to the problem at hand
Course Contents
- Stochastic Variational Inference (SVI) and Doubly SVI (DSVI): SVI as a scalable alternative to variational inference for tasks with large datasets. Application of SVI to the latent Dirichlet allocation model. (A mini-batch ELBO estimate is sketched in code after this list.)
- Bayesian neural networks and Bayesian compression of neural networks: variational inference of the posterior distribution over the weights of discriminative neural networks. The local reparameterization trick for gradient variance reduction (sketched in code after this list). Variational dropout sparsifies deep neural networks: a different parametrization yields a substantially different model. Soft weight sharing: how to save memory by quantizing the weights of a neural network.
- Variational autoencoders (VAE) and normalizing flows (NF): probabilistic PCA and the VAE as its non-linear generalization. The reparameterization trick for doubly stochastic variational inference. Extending variational approximations with normalizing flows; examples of normalizing flows. (The reparameterization trick and a planar flow are sketched in code after this list.)
- Discrete latent variables and variance reduction: the idea of stochastic computation graphs, discrete and continuous stochastic nodes, and gradient estimation via Gumbel-Softmax and REINFORCE with control variates (a Gumbel-Softmax sketch follows this list).
- Implicit variational inference using adversarial training: Adversarial Variational Bayes for training a VAE with an implicit inference distribution (the underlying density-ratio idea is sketched after this list). f-GANs as a generalization of vanilla GANs for optimizing an arbitrary f-divergence.
- Inference in implicit probabilistic models: implicit and semi-implicit distributions are flexible parametric families that can be constructed with neural networks in a general way and used as building blocks for probabilistic models. How to construct such distributions and how to perform inference in the resulting models.
- Deep MCMC: how neural networks help MCMC methods sample from an analytical distribution, and how MCMC methods help neural networks sample from an empirical distribution.
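The following is a minimal sketch of the doubly stochastic ELBO estimate behind SVI/DSVI, assuming PyTorch as the framework (the syllabus only specifies Python); `encoder`, `log_lik`, and `kl_fn` are hypothetical callables, not the course's reference code. Stochasticity comes both from sub-sampling the data and from a Monte Carlo sample of the local latent variable.

```python
import torch

def minibatch_elbo(x_batch, n_total, encoder, log_lik, kl_fn):
    # `encoder`, `log_lik`, `kl_fn` are hypothetical user-supplied callables.
    mu, log_var = encoder(x_batch)                             # parameters of q(z|x)
    z = mu + torch.exp(0.5 * log_var) * torch.randn_like(mu)   # MC sample of z (second source of noise)
    per_point = log_lik(x_batch, z) - kl_fn(mu, log_var)       # shape: (batch,)
    # Rescale by N / |batch| so the mini-batch estimate is unbiased for the
    # ELBO over the whole dataset (the first source of noise).
    return (n_total / x_batch.shape[0]) * per_point.sum()
```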
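A minimal sketch of the local reparameterization trick for a fully connected layer with a factorized Gaussian posterior over the weights; PyTorch is assumed, and the layer is illustrative rather than the assignment code.

```python
import torch
import torch.nn as nn

class LocalReparamLinear(nn.Module):
    """Linear layer with q(W) = prod_ij N(mu_ij, sigma_ij^2), sampled in activation space."""
    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight_mu = nn.Parameter(0.01 * torch.randn(in_features, out_features))
        self.weight_log_sigma2 = nn.Parameter(torch.full((in_features, out_features), -10.0))

    def forward(self, x):
        # Instead of sampling a weight matrix per forward pass, sample the
        # pre-activations directly: x @ W is Gaussian with mean x @ mu and
        # variance x^2 @ sigma^2, which yields lower-variance gradients.
        act_mu = x @ self.weight_mu
        act_var = (x ** 2) @ torch.exp(self.weight_log_sigma2)
        eps = torch.randn_like(act_mu)
        return act_mu + torch.sqrt(act_var + 1e-8) * eps
```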
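Below is a minimal sketch of the Gaussian reparameterization trick and the closed-form KL term used in a VAE; PyTorch is assumed, and the encoder architecture is only illustrative.

```python
import torch
import torch.nn as nn

class GaussianEncoder(nn.Module):
    """Maps x to the parameters of q(z|x) = N(mu(x), diag(exp(log_var(x))))."""
    def __init__(self, x_dim, z_dim, hidden=256):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(x_dim, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, z_dim)
        self.log_var = nn.Linear(hidden, z_dim)

    def forward(self, x):
        h = self.body(x)
        return self.mu(h), self.log_var(h)

def reparameterize(mu, log_var):
    # z = mu + sigma * eps with eps ~ N(0, I): all randomness is in eps,
    # so gradients of the ELBO flow through mu and log_var.
    return mu + torch.exp(0.5 * log_var) * torch.randn_like(mu)

def kl_to_standard_normal(mu, log_var):
    # Closed-form KL(q(z|x) || N(0, I)) for a diagonal Gaussian, per data point.
    return 0.5 * torch.sum(torch.exp(log_var) + mu ** 2 - 1.0 - log_var, dim=1)
```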
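A minimal sketch of a single planar normalizing-flow layer in the spirit of Rezende & Mohamed (2015); PyTorch is assumed, and the constraint that keeps the map invertible is omitted for brevity.

```python
import torch
import torch.nn as nn

class PlanarFlow(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.u = nn.Parameter(0.01 * torch.randn(dim))
        self.w = nn.Parameter(0.01 * torch.randn(dim))
        self.b = nn.Parameter(torch.zeros(1))

    def forward(self, z):
        # f(z) = z + u * tanh(w^T z + b); returns the transformed sample and
        # log|det df/dz|, which is added to the log-density of the base distribution.
        lin = z @ self.w + self.b                                  # (batch,)
        f_z = z + self.u * torch.tanh(lin).unsqueeze(-1)           # (batch, dim)
        psi = (1 - torch.tanh(lin) ** 2).unsqueeze(-1) * self.w    # (batch, dim)
        log_det = torch.log(torch.abs(1 + psi @ self.u) + 1e-8)    # (batch,)
        return f_z, log_det
```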
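A minimal sketch of the Gumbel-Softmax (Concrete) relaxation for drawing a relaxed one-hot sample from a categorical distribution; PyTorch is assumed.

```python
import torch
import torch.nn.functional as F

def gumbel_softmax_sample(logits, temperature=0.5):
    # Add Gumbel(0, 1) noise to the logits; the tempered softmax gives a
    # differentiable relaxation of a one-hot categorical sample.
    u = torch.rand_like(logits)
    gumbel = -torch.log(-torch.log(u + 1e-20) + 1e-20)
    return F.softmax((logits + gumbel) / temperature, dim=-1)

# Usage: relaxed one-hot samples that gradients can flow through.
logits = torch.randn(4, 10)         # batch of 4, 10 categories
y = gumbel_softmax_sample(logits)   # shape (4, 10), rows sum to 1
```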
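A minimal sketch of the density-ratio idea behind Adversarial Variational Bayes: a discriminator is trained to distinguish pairs (x, z ~ q(z|x)) from pairs (x, z ~ p(z)); at its optimum its logit approximates log q(z|x) - log p(z), so it can replace the intractable KL term when q(z|x) is implicit. PyTorch is assumed; `T` and `log_px_given_z` are hypothetical (a discriminator network and per-point reconstruction log-likelihoods).

```python
import torch
import torch.nn.functional as F

def discriminator_loss(T, x, z_q, z_p):
    # Logistic loss: pairs with z drawn from q(z|x) are labelled 1,
    # pairs with z drawn from the prior p(z) are labelled 0.
    logits_q = T(torch.cat([x, z_q], dim=1))
    logits_p = T(torch.cat([x, z_p], dim=1))
    return (F.binary_cross_entropy_with_logits(logits_q, torch.ones_like(logits_q))
            + F.binary_cross_entropy_with_logits(logits_p, torch.zeros_like(logits_p)))

def elbo_surrogate(T, x, z_q, log_px_given_z):
    # With an optimal discriminator, T(x, z) ~ log q(z|x) - log p(z), so the ELBO
    # E_q[log p(x|z)] - KL(q || p) can be estimated without evaluating q(z|x).
    return (log_px_given_z - T(torch.cat([x, z_q], dim=1)).squeeze(-1)).mean()
```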
Assessment Elements
- Practical assignments: the assignments consist of implementing several models/methods from the course in Python and analysing their behavior: Sparse Variational Dropout (SVDO), NF, VAE, and Discrete Latent Variables (DLV).
- Exam: 2nd year; the exam takes place in module 3.
Bibliography
Recommended Core Bibliography
- Bishop, C. M. (2006). Pattern Recognition and Machine Learning. Springer. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsbas&AN=edsbas.EBA0C705
- Murphy, K. P. (2012). Machine Learning: A Probabilistic Perspective. Cambridge, Mass: The MIT Press. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=480968
- Goodfellow, I., Bengio, Y., & Courville, A. (2018). Deep Learning (Russian edition). DMK Press. 652 pp. ISBN 978-5-97060-618-6. Electronic text // Lan electronic library system. URL: https://e.lanbook.com/book/107901
Recommended Additional Bibliography
- Blundell, C., Cornebise, J., Kavukcuoglu, K., & Wierstra, D. (2015). Weight Uncertainty in Neural Networks. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsarx&AN=edsarx.1505.05424
- Grathwohl, W., Choi, D., Wu, Y., Roeder, G., & Duvenaud, D. (2017). Backpropagation through the Void: Optimizing control variates for black-box gradient estimation. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsarx&AN=edsarx.1711.00123
- Jang, E., Gu, S., & Poole, B. (2016). Categorical Reparameterization with Gumbel-Softmax. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsarx&AN=edsarx.1611.01144
- Kingma, D. P., & Welling, M. (2013). Auto-Encoding Variational Bayes. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsarx&AN=edsarx.1312.6114
- Kingma, D. P., Salimans, T., & Welling, M. (2015). Variational Dropout and the Local Reparameterization Trick. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsarx&AN=edsarx.1506.02557
- Levy, D., Hoffman, M. D., & Sohl-Dickstein, J. (2017). Generalizing Hamiltonian Monte Carlo with Neural Networks. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsarx&AN=edsarx.1711.09268
- Louizos, C., & Welling, M. (2017). Multiplicative Normalizing Flows for Variational Bayesian Neural Networks. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsarx&AN=edsarx.1703.01961
- Maddison, C. J., Mnih, A., & Teh, Y. W. (2016). The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsarx&AN=edsarx.1611.00712
- Matt Hoffman, David M. Blei, Chong Wang, & John Paisley. (2013). Stochastic Variational Inference. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsbas&AN=edsbas.C4CCD6D4
- Mescheder, L., Nowozin, S., & Geiger, A. (2017). Adversarial Variational Bayes: Unifying Variational Autoencoders and Generative Adversarial Networks. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsarx&AN=edsarx.1701.04722
- Molchanov, D., Ashukha, A., & Vetrov, D. (2017). Variational Dropout Sparsifies Deep Neural Networks. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsarx&AN=edsarx.1701.05369
- Nowozin, S., Cseke, B., & Tomioka, R. (2016). f-GAN: Training Generative Neural Samplers using Variational Divergence Minimization. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsarx&AN=edsarx.1606.00709
- Rezende, D. J., & Mohamed, S. (2015). Variational Inference with Normalizing Flows. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsarx&AN=edsarx.1505.05770
- Sida I. Wang, & Christopher D. Manning. (2013). Fast dropout training. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsbas&AN=edsbas.C2036E9B
- Song, J., Zhao, S., & Ermon, S. (2017). A-NICE-MC: Adversarial Training for MCMC. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsarx&AN=edsarx.1706.07561
- Tucker, G., Mnih, A., Maddison, C. J., Lawson, D., & Sohl-Dickstein, J. (2017). REBAR: Low-variance, unbiased gradient estimates for discrete latent variable models. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsarx&AN=edsarx.1703.07370
- Ullrich, K., Meeds, E., & Welling, M. (2017). Soft Weight-Sharing for Neural Network Compression. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsarx&AN=edsarx.1702.04008