Big Edit: this might be broken since colab was updated; Version 3 is here.

LoRA colabs are already fairly intuitive (click this, click that) and most of the settings are already pre-made, so you just have to run them. Still, it seems lots of people don't know how to use them or how exactly to make a dataset, so I hope this guide helps them.

I forgot to clarify a thing in the tutorial: apart from adding the LoRA to your prompt, you have to add the trained subject to your prompt to get the best results! In the example of the tutorial where I trained the concept "plum", I added the LoRA by clicking on the image icon and got the <lora:...> tag in the prompt, BUT apart from that I had to add the word "plum" to the prompt. Check the last image of the guide (the one in '80s anime style): you can see the prompt at the bottom has both the LoRA tag and "plum" in it. Adding the LoRA alone is not good enough; the word for the subject it was trained on has to be in the prompt too. I wonder if that's obvious or if I need to make a version 2.0 of this guide to make it clear.

So which is better, a Textual Inversion embedding or a LoRA? First of all, you have to assume that both are trained well, because a good TI can be better than a bad LoRA (and vice versa). Assuming both are trained well, the LoRA will have better quality, and here is why. Textual Inversion is just guidance toward a specific concept: it helps you get to what you want inside the model. You need a model, and the TI is like a map for reaching the stuff in that model, which assumes the model can generate that stuff in the first place. If someone invents a new device and you would like existing models to generate it, a TI trained on images of that device will help guide the models toward it, but since no such thing exists in the model, it can only go so far and will give you some approximation. This also means a TI may give you great results on one model and terrible results on another. A LoRA, on the other hand, is added on top of the model and introduces new data as a result of its training, so that new device we talked about would come out much better with a LoRA than with a TI. LoRAs will be much better at the things the model does not know. Also, you can mix LoRAs and TIs together :)

Textual Inversion works by finding a code to represent a new word (or sentence) which Stable Diffusion doesn't currently know. All words are converted into these codes under the hood, and they are quite small: just 768 numbers per token in SD 1.4 and 1.5, and 1024 in 2.0 and 2.1. Generally it's better to use only a few "words" (vectors) when creating an embedding, say 2-6; any more than that can overwhelm the prompt, same as typing that many words into it.
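To make the "code" idea concrete, here is a minimal PyTorch sketch of what a textual inversion embedding actually is. The shapes match the numbers above; the variable names and the optimizer setup are illustrative assumptions, not any particular trainer's code.

```python
import torch

# A textual inversion embedding is nothing but a tiny tensor of learned
# "word codes". The model's own weights never change; only these numbers
# get optimized during training.

num_vectors = 4   # the 2-6 "words" recommended above
embed_dim = 768   # per-token code size in SD 1.4/1.5 (1024 in SD 2.x)

# A fresh embedding for a new concept (e.g. "plum"):
new_concept = torch.nn.Parameter(torch.randn(num_vectors, embed_dim) * 0.01)

# Training backpropagates the diffusion loss into just this parameter;
# at inference the vectors are spliced into the prompt's token embeddings
# wherever the trigger word appears.
optimizer = torch.optim.AdamW([new_concept], lr=5e-3)

print(new_concept.numel(), "trainable numbers")  # 4 * 768 = 3072
```

That tiny size is why an embedding file is only a few kilobytes, and also why it can only steer toward things the model already knows how to draw.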
LoRA is different: it is similar to finetuning the whole model (sometimes called Dreambooth), but it tries to compress the result down using some math tricks, so that the result can be applied to a model as additions/subtractions to its existing calibration values. It doesn't train as many parts of the model as full finetuning either, I don't think, but it does a pretty good job, and seemingly a LoRA can be used with other models with pretty good results (going by this tutorial; I've not tried that).

Using a LoRA in practice is a lot more like merging a model than like using an embedding: you're merging your current model with the difference between an approximation of a fine-tuned model (your LoRA) and the base model it was trained on. The approximation part is what lets this happen within a second just before runtime, instead of the several minutes and gigabytes of RAM required for full merging. Auto1111's webui activates LoRAs by typing into the prompt area, but a LoRA is not a token and cannot be used as one: it activates before the image generation starts and remains fixed, just like the checkpoint remains fixed during a run. That also means LoRAs cannot do the neat tricks embeddings can do, like activation/deactivation at a particular step (i.e. "will activate at step 10") or prompt travel. You can, of course, still use the keywords the LoRA was trained with to whatever effect you'd like.
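Here is a rough sketch of that "merge the difference" arithmetic, assuming a LoRA layer stored as the usual pair of low-rank matrices. The names (lora_down, lora_up, alpha) follow the common convention in SD LoRA files, but the function itself is an illustration, not any webui's actual loader.

```python
import torch

def apply_lora(base_weight: torch.Tensor,
               lora_down: torch.Tensor,   # shape: (rank, in_features)
               lora_up: torch.Tensor,     # shape: (out_features, rank)
               alpha: float,
               strength: float = 1.0) -> torch.Tensor:
    """Add a LoRA's low-rank approximation of (finetuned - base) to a weight.

    A full finetune would produce W_finetuned = W_base + delta for every
    weight matrix. A LoRA stores delta compressed as two small matrices,
    so "merging" is just one cheap matmul and an addition:
        W_new = W_base + strength * (alpha / rank) * (up @ down)
    """
    rank = lora_down.shape[0]
    delta = (alpha / rank) * (lora_up @ lora_down)
    return base_weight + strength * delta

# Toy example: a 768x768 attention projection patched by a rank-8 LoRA.
W = torch.randn(768, 768)
down = torch.randn(8, 768) * 0.01
up = torch.randn(768, 8) * 0.01

W_patched = apply_lora(W, down, up, alpha=8.0, strength=0.8)
print(W_patched.shape)  # torch.Size([768, 768]) -- same shape, nudged values
```

The strength factor plays the same role as the number in the prompt tag (the 0.8 in <lora:something:0.8>), and because the update is simply added on, a negative strength subtracts the concept instead.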