Bachelor's programme
2023/2024
Natural Language Processing
Status:
Compulsory course
Field of study:
09.03.04. Software Engineering
When taught:
Year 4, modules 2 and 3
Mode of study:
without an online course
Audience:
for the home campus
Instructors:
Бурашников Евгений Павлович
Language:
English
Credits:
6
Contact hours:
72
Course Syllabus
Abstract
The course introduces the basics of natural language processing (NLP), a dynamic interdisciplinary field. It covers the methods and approaches used in many real NLP applications, such as language modeling, text classification, sentiment analysis, summarization, and machine translation. Students will not only use existing NLP libraries and software packages, but will also learn the principles behind their design and the mathematical models that underlie modern computational linguistics. The course also involves practical programming tasks in Python and experiments with texts written in English and Russian. Prerequisites are programming skills in Python and general knowledge of linguistics.
Learning Objectives
- Develop students' theoretical knowledge and practical skills in the basics of automatic natural language processing.
Expected Learning Outcomes
- Apply basic approaches to word embeddings, such as count-based methods, Word2Vec, and GloVe
- Apply classic machine learning methods such as Naive Bayes, SVM, and logistic regression (LR), and deep learning approaches such as FCN, CNN, and LSTM, to text classification problems
- Apply open-source libraries for text preprocessing, such as Natasha and NLTK. Solve the following common preprocessing tasks: expanding contractions, lowercasing, removing punctuation, removing digits and words containing digits, removing stop words, rephrasing text, stemming and lemmatization, and removing extra whitespace
- Apply various text generation techniques, such as n-gram language models and neural language models
- Apply attention mechanisms and transformers to seq2seq problems
- Apply special data preprocessing techniques and architectures such as BERT to the NER problem
- Apply the modern BERT architecture
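As an illustration of the preprocessing tasks named in the outcomes above, here is a minimal sketch that uses only the Python standard library. The function name `preprocess` and the tiny `CONTRACTIONS` and `STOPWORDS` tables are hypothetical and chosen for this example; they are not taken from the course materials, and in practice a library such as NLTK or Natasha would supply fuller resources as well as stemming and lemmatization.

```python
import string

# Hypothetical, deliberately tiny lookup tables for illustration only.
CONTRACTIONS = {"don't": "do not", "it's": "it is", "can't": "cannot"}
STOPWORDS = {"the", "a", "an", "is", "it", "to", "of", "and"}


def preprocess(text):
    """Return a list of cleaned tokens from a raw English string."""
    # Lowercase first so the contraction table only needs lowercase keys.
    text = text.lower()
    # Expand contractions by simple substring replacement.
    for short, full in CONTRACTIONS.items():
        text = text.replace(short, full)
    # split() with no arguments also collapses repeated whitespace.
    tokens = text.split()
    # Drop digits and any token that contains a digit (e.g. "42", "2nd").
    tokens = [t for t in tokens if not any(c.isdigit() for c in t)]
    # Strip ASCII punctuation from each remaining token.
    table = str.maketrans("", "", string.punctuation)
    tokens = [t.translate(table) for t in tokens]
    # Remove empty strings and stop words.
    return [t for t in tokens if t and t not in STOPWORDS]


print(preprocess("It's  the 2nd time I don't  see 42 cats!"))
```

The order of the steps matters in this sketch: contractions are expanded before punctuation is stripped, because otherwise the apostrophes that identify them would already be gone.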
Course Contents
- Word embedding
- Text classification
- Text preprocessing methods
- Language Modeling
- Seq2seq models
- Named Entity Recognition
- Domain Adaptation
- Transfer learning
- Question Answering
- Topic Modeling
- Text generation
- Text summarization
- Style transfer
Bibliography
Recommended Core Bibliography
- Eisenstein, J. (2019). Introduction to Natural Language Processing.
- Yu, C., Wang, J., Chen, Y., & Huang, M. (2019). Transfer Learning with Dynamic Adversarial Adaptation Network. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsarx&AN=edsarx.1909.08184
Recommended Additional Bibliography
- Aman Kedia, & Mayank Rasu. (2020). Hands-On Python Natural Language Processing : Explore Tools and Techniques to Analyze and Process Text with a View to Building Real-world NLP Applications. Packt Publishing.