Words Embedding in Form of Symmetric and Skewsymmetric Operator

Student: Koshchenko Ekaterina

Supervisor:

Faculty: School of Computer Science, Physics and Technology

Educational Programme: Applied Mathematics and Information Science (Bachelor)

Year of Graduation: 2019

A word embedding is a representation of words as real-valued vectors. Word embeddings were originally created to solve natural language processing tasks such as document indexing, semantic analysis, and question answering. Language modeling is another approach to most of these tasks. However, all language models use pre-trained word embeddings, so constructing new word embedding models remains a relevant task. Many word embedding models are known: Word2Vec, GloVe, FastText, etc. Because relations between words are asymmetric, each of these models represents every word with two real-valued vectors: a central vector and a context vector. This leads to problems such as ambiguity in constructing the final embedding and longer training time. I introduce a new approach based on asymmetric information extraction that builds on the advantages of the Global Vectors (GloVe) model. By reducing the impact of asymmetric information on the resulting word representations, the model converges faster and outperforms existing models on word analogy tasks.

Index Terms: word embeddings, GloVe, word analogies, language modeling, natural language processing, matrix decomposition.
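The title and the index terms refer to symmetric and skew-symmetric operators and to matrix decomposition. As a rough illustration of that idea only, the sketch below shows the textbook split of a square matrix M into a symmetric part S = (M + M^T) / 2 and a skew-symmetric part K = (M - M^T) / 2. The function names and the toy co-occurrence counts are hypothetical, and the summary does not state that the thesis applies the split in exactly this way.

```python
# Illustrative sketch only: the standard symmetric / skew-symmetric split of a
# square matrix. It is NOT the training procedure described in the thesis.

def transpose(m):
    """Return the transpose of a square matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]


def split_symmetric_skew(m):
    """Return (S, K) with S = (M + M^T) / 2 and K = (M - M^T) / 2."""
    t = transpose(m)
    n = len(m)
    sym = [[(m[i][j] + t[i][j]) / 2 for j in range(n)] for i in range(n)]
    skew = [[(m[i][j] - t[i][j]) / 2 for j in range(n)] for i in range(n)]
    return sym, skew


# Hypothetical toy co-occurrence counts (assumption, not data from the thesis).
# Rows index the central word, columns index the context word.
toy_counts = [
    [0.0, 4.0, 1.0],
    [2.0, 0.0, 3.0],
    [1.0, 5.0, 0.0],
]

sym, skew = split_symmetric_skew(toy_counts)

# S carries the order-independent part of the counts, K the order-dependent part.
print("symmetric part:     ", sym)
print("skew-symmetric part:", skew)
```

Adding the two parts entry by entry recovers the original matrix, so the split loses no information; it only separates the part that ignores word order from the part that depends on it.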
