Bachelor
2023/2024
Computational Linguistics
Type:
Compulsory course (Applied Mathematics and Information Science)
Area of studies:
Applied Mathematics and Information Science
Delivered by:
Department of Applied Mathematics and Informatics
When:
4th year, 3rd module
Mode of studies:
offline
Open to:
students of all HSE University campuses
Instructors:
Sergey Slashchinin
Language:
English
ECTS credits:
4
Contact hours:
44
Course Syllabus
Abstract
The course aims to train specialists who can carry out information modeling of a subject domain and solve applied information-processing problems at a high technical level. Practical classes build solid natural language processing skills using modern high-level programming languages, in the role of an application programmer. Assignments are completed in the Python3 scripting language on the Anaconda4 technology platform. To master the discipline, students should have the following knowledge and competencies:
- modern methods of designing and implementing information systems;
- basic algorithms and data structures for fast information retrieval;
- programming in C and C++.
Learning Objectives
- The objectives of the discipline "Computational Linguistics" are for students to form a clear understanding of the place and role of modern data extraction systems, master the theoretical foundations of modeling and processing information in natural language, understand the trends in the development of the industry and the directions of prospective research, and study the principles of building modern information retrieval systems.
Expected Learning Outcomes
- Be able to process texts using basic algorithms
- Be able to use vector representations of texts to answer queries
- Be able to use a probabilistic model to search for information in the text
Course Contents
- Basics of text processing
- Transformer models and their applications to various natural language processing tasks
- Language modeling and text representation methods
Assessment Elements
- Laboratory work 1. Collection of text corpus
- Laboratory work 2. Processing and classification of texts
- Laboratory work 3. Topic Modeling
- Laboratory work 4. Generating texts using a neural network language model
- Laboratory work 5. Modern problems of language analysis
- Exam: tickets (theory included) plus additional questions/tasks
Interim Assessment
- 2023/2024 3rd module: 0.2 * Exam + 0.16 * Laboratory work 1. Collection of text corpus + 0.16 * Laboratory work 2. Processing and classification of texts + 0.16 * Laboratory work 3. Topic Modeling + 0.16 * Laboratory work 4. Generating texts using a neural network language model + 0.16 * Laboratory work 5. Modern problems of language analysis
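As an illustration of the formula above, the weighted sum can be sketched in Python. The weights come from the formula itself; the function name `interim_grade`, the short keys such as `lab1`, and the 10-point scale used in the example are assumptions made only for this sketch, and no rounding rule is applied because the syllabus does not state one.

```python
# Weights copied from the interim assessment formula.
# The short keys ("exam", "lab1", ...) are hypothetical labels for this sketch.
WEIGHTS = {
    "exam": 0.2,
    "lab1": 0.16,  # Collection of text corpus
    "lab2": 0.16,  # Processing and classification of texts
    "lab3": 0.16,  # Topic Modeling
    "lab4": 0.16,  # Generating texts using a neural network language model
    "lab5": 0.16,  # Modern problems of language analysis
}


def interim_grade(scores):
    """Return the weighted sum of the scores for all assessment elements.

    `scores` maps each key of WEIGHTS to a numeric score. The scale
    (for example 0 to 10) is an assumption; the formula only fixes the weights.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores on an assumed 10-point scale.
example_scores = {"exam": 8, "lab1": 10, "lab2": 9, "lab3": 7, "lab4": 8, "lab5": 6}
example_grade = interim_grade(example_scores)
print(f"{example_grade:.2f}")
```

The weights sum to 1 (0.2 + 5 × 0.16), so the exam contributes 20% of the result and the five laboratory works together contribute 80%.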
Bibliography
Recommended Core Bibliography
- Derivatives analytics with Python : data analysis, models, simulation, calibration and hedging, Hilpisch, Y. J., 2015
Recommended Additional Bibliography
- Image analysis, classification, and change detection in remote sensing : with algorithms for Python, Canty, M. J., 2019
- Learning Python : [covers Python 2.5], Lutz, M., 2008