Personalized Image Generation By Text-To-Image Diffusion Models

Student: Ishutin Andrei

Supervisor: Viacheslav Meshchaninov

Faculty: Faculty of Computer Science

Educational Programme: Applied Mathematics and Information Science (Bachelor)

Final Grade: 10

Year of Graduation: 2024

This paper investigates subject-driven image generation using text-to-image diffusion models, with a primary focus on disentanglement. Subject-driven generation aims to render a specific concept (for example, "my pet dog") in varied contexts using diffusion models such as Stable Diffusion. However, these models often entangle subject identity with the background and other features, which leads to artifacts and inaccuracies. Several established methods are compared, including Textual Inversion, DreamBooth, Custom Diffusion, and DisenBooth, with DisenBooth analyzed in depth because of its unique approach to disentanglement. The paper proposes new data augmentation techniques and introduces trajectory plots as an evaluation method to visualize the trade-off between textual and visual alignment. Additionally, the paper explores several modifications to DisenBooth, including additional or modified loss functions, CLIP reference injection, and adapter-free approaches, to assess their impact on the disentanglement problem. The results reveal that while methods like DisenBooth show promise, disentanglement is still imperfect. Issues such as concept-context leakage and the poor generalization of identity-irrelevant embeddings persist. The adapter is too weak for the method to benefit from the augmentation. Changes to the contrastive loss also prove ineffective.

Full text (added May 20, 2024)

Student Theses at HSE must be completed in accordance with the University Rules and regulations specified by each educational programme.

Summaries of all theses must be published and made freely available on the HSE website.

The full text of a thesis can be published in open access on the HSE website only if the authoring student (copyright holder) agrees, or, if the thesis was written by a team of students, if all the co-authors (copyright holders) agree. After a thesis is published on the HSE website, it obtains the status of an online publication.

Student theses are objects of copyright and their use is subject to limitations in accordance with the Russian Federation’s law on intellectual property.

In the event that a thesis is quoted or otherwise used, reference to the author’s name and the source of quotation is required.
