Comparison discussion #3
Hello, would you please provide the weights (including the checkpoint & LoRA, if you use one) for your original image? I need them to reproduce your results in an oil-painting fashion. The MultiDiffusion results can be severely affected by the model checkpoint & LoRA you used. But generally speaking, an extraordinarily high CFG Scale and a slightly higher denoising value will give you satisfying details. Example positive prompts are "highres, masterpiece, best quality, ultra-detailed unity 8k wallpaper, extremely clear, very clear, ultra-clear". You don't need any concrete things in the positive prompt; then drag the CFG Scale to an extra-large value. Denoising values between 0.1 and 0.4 are all OK, but the content will change accordingly. Here is my result with CFG=20, Sampler=DPM++ SDE Karras, denoising strength=0.3, for example. As I use the protogenX34 checkpoint, my painting style will be wildly different from yours: Please comment on this issue if you find your results have significantly improved after you use a proper model and CFG values.
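For anyone who prefers to script these settings rather than set them in the UI, here is a minimal sketch using the standard A1111 `/sdapi/v1/img2img` endpoint. It assumes the WebUI is launched with `--api`; MultiDiffusion itself is enabled from its extension panel and is not configured in this payload.

```python
# Minimal sketch: reproduce the settings described above through the A1111 WebUI API.
# Assumes the WebUI is running locally with --api; the MultiDiffusion extension is
# enabled from its UI panel and is not part of this payload.
import base64
import requests

with open("input.png", "rb") as f:
    init_image = base64.b64encode(f.read()).decode()

payload = {
    "init_images": [init_image],
    "prompt": ("highres, masterpiece, best quality, ultra-detailed unity 8k wallpaper, "
               "extremely clear, very clear, ultra-clear"),
    "negative_prompt": "",          # no concrete content in either prompt
    "cfg_scale": 20,                # extra-large CFG, as suggested above
    "denoising_strength": 0.3,      # 0.1-0.4 works; higher values change content more
    "sampler_name": "DPM++ SDE Karras",
}

r = requests.post("http://127.0.0.1:7860/sdapi/v1/img2img", json=payload)
r.raise_for_status()
with open("output.png", "wb") as f:
    f.write(base64.b64decode(r.json()["images"][0]))
```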
Hi there, I will write here rather than create a new issue about a similar thing.
Click Here for Better Comparison View
https://imgsli.com/MTYwOTcx same here again
Hello, thanks for your interest in this work. I tried for several minutes on your image and here is my result with no tuning: It's hard to tell which is better; if you like illustration-style sharpness and faithfulness to the original image, maybe Ultimate SD Upscaler + 4x-UltraSharp is your best choice. But personally I'd like to see some fabricated details on a realistic human face, so I prefer this tool. It's noteworthy that the biggest difference between MultiDiffusion and other upscalers is that it currently doesn't support any concrete content in the prompt when you upscale an image; otherwise each tile will contain a small character and your image finally becomes blurry and messy. The correct prompt is just as follows; I don't even use LoRA: And my configuration, FYI:
Update: Oh, I just noticed that EasyNegative is a textual inversion from civitai.com; it is not a plain word. Please download that textual inversion. Here is the link: https://civitai.com/models/7808/easynegative The upscalers are important too. I personally use two: 4x-UltraSharp and 4x-remacri. Here is the link:
I used it with the image above
Already downloaded this embedding.
Do you use the EasyNegative embedding? You mean you used it in the above images?
It may not be as easy to use as the Ultimate Upscaler, as it's essentially a complete redraw without post-processing. Personally, I have some intuitions for using it:
I just did it and it's a lot better. Settings (even the seed is the same): But it still can't generate a result as good as yours.
I'm also confused. Are you using this model? https://civitai.com/models/3666/protogen-x34-photorealism-official-release I see our model hashes are different. Apart from that, I couldn't find anything else.
Yes, I used protogen_x3.4, but pruned. Very big improvement in details: It still doesn't produce the exact same result as yours (I guess it depends on hardware), but the details are unbelievable; I can clearly see the stitch seam on the sleeve.
Oh, thanks for your feedback. I didn't know that a pruned model could affect the details, either, before you tested it.
Ohh! I think not many know that, to be honest o_O. As far as I understand pruning, it shouldn't affect a task like upscaling via small tiles? I'm going to try with the non-pruned model as well and let you know. Edit: No clue why, but today everything works as it should. Maybe you need to turn everything off and on again, not just restart the UI, just like when installing Dreambooth.
Tried it, and to be honest the ESRGAN upscalers do 99% of the lifting; it barely does anything when used with Lanczos, unless there are going to be examples with Lanczos where it introduces new details? The best bet is to just upscale with ESRGAN by 2x, then go to inpaint with it and mask the parts one by one to upscale them, since you will have more pixel area to resolve detail. So unless someone automates that, it's going to stay the best way to upscale.
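A rough sketch of the first step of that workflow (the plain 2x ESRGAN upscale) through the WebUI's extras API, assuming the WebUI runs with `--api` and that an ESRGAN model such as 4x-UltraSharp is installed; the per-region inpainting passes are then done by hand in the UI.

```python
# Sketch of a 2x ESRGAN upscale via the A1111 "extras" endpoint (assumed setup:
# WebUI running locally with --api, 4x-UltraSharp present in the Extras dropdown).
import base64
import requests

with open("input.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

payload = {
    "image": image_b64,
    "upscaling_resize": 2,          # upscale by 2x, then inpaint regions one by one
    "upscaler_1": "4x-UltraSharp",  # use any ESRGAN model installed in your WebUI
}

r = requests.post("http://127.0.0.1:7860/sdapi/v1/extra-single-image", json=payload)
r.raise_for_status()
with open("upscaled_2x.png", "wb") as f:
    f.write(base64.b64decode(r.json()["image"]))
```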
Try this one: https://github.com/dustysys/ddetailer.git
I'm sorry for the accidental wrong edit. This is basically a tile-by-tile img2img SD redraw, so if you don't give it high denoising strength it won't work as you expect. However, one of the weaknesses is that it currently cannot automatically map your prompts to different areas... If you can use stronger prompts, it should be way better. I'm working on Automatic Prompt Mapping: in img2img, it works by first estimating the attention map of your prompt over the original picture, and then re-applying it to the MultiDiffusion tiles. In txt2img it should be similar, but I need time to do it.
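To make "tile-by-tile img2img redraw" concrete, here is an illustrative sketch (not the extension's actual code) of the core fusion idea: overlapping tiles are denoised independently and the overlapping predictions are averaged back into one full-size latent. `denoise_tile` is a stand-in for one denoising pass on a crop.

```python
# Illustrative sketch of MultiDiffusion-style tile fusion, not the extension's code.
import torch

def fuse_tiles(latent, denoise_tile, tile=64, stride=48):
    # latent: (C, H, W) latent tensor; denoise_tile: callable running one denoising
    # step on a (C, tile, tile) crop. Both are assumptions for this sketch.
    C, H, W = latent.shape
    out = torch.zeros_like(latent)
    weight = torch.zeros(1, H, W)
    for y in range(0, H - tile + 1, stride):
        for x in range(0, W - tile + 1, stride):
            crop = latent[:, y:y + tile, x:x + tile]
            out[:, y:y + tile, x:x + tile] += denoise_tile(crop)
            weight[:, y:y + tile, x:x + tile] += 1
    # Average where tiles overlap; edges not covered by a tile are left untouched
    # in this simplified version.
    return out / weight.clamp(min=1)
```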
The key point is that I need a user interface to draw bounding boxes, so that you can draw rectangles and control MultiDiffusion with different prompts. That way the result should get much better. Why? Because you can just select the woman's face and tell SD to draw a beautiful woman's face. Then SD will try its best, using its full 512×512 resolution to draw ONLY a face. The effective resolution will be unprecedentedly high for SD models, as they dedicate themselves to drawing only one part of the image at the best of their capabilities. However, when I was adding features I saw this f**king issue: someone submitted a PR for a bbox tool, but the officials refused to merge it. I don't know what they were thinking to reject such a good PR (from my perspective) without providing their own solution; it has been half a year since it was first proposed. So it will be hard to draw rectangles on images directly, and I must find another way to draw rectangles. Do you have any other ideas?
Check out this extension: https://github.com/hnmr293/sd-webui-llul It fakes it by having you move around a rectangle in a separate window.
New Feature: "ZOOM ENHANCE" for the A111 WebUI. Automatically fix small details like faces and hands!Hello, fellow Stable Diffusion users! I'm excited to share with you a new feature that I've added to the Unprompted extension: it's the If you're not familiar with Unprompted, it's a powerful extension that lets you use various shortcodes in your prompts to enhance your text generation experience. You can learn more about it here. The The shortcode allows you to automatically upscale small details within your image where Stable Diffusion tends to struggle. It is particularly good at fixing faces and hands in long-distance shots. How does it work?The Features and Benefits
How to use it?
To use this feature, you need to have Unprompted installed in your WebUI. If you don't have it yet, you can get it from here. Once you have Unprompted, simply add this line anywhere in your prompt:
I have investigated a new technique, DDNM (https://github.com/wyhuai/DDNM), that is very powerful for super-resolution, and it is also compatible with MultiDiffusion. Through initial tests I found it amazing; I believe it can beat their new feature in a compelling way. The automatic mask technique seems not very compatible with MultiDiffusion txt2img, but I will try it in img2img.
The point is, the region control is supposed to be used to help avoid such things while composing larger images. This is literally the point of it, and what the demonstration pictures were showing.
Have you tried what I suggested?
You realise your suggestions are totally irrelevant, right? Like, the point of region prompting is that you can have a larger image (with a background prompt using the usual MD merging), and then a specific foreground region (or regions) that are meant to contain specific things. It's even spelled out on the main page, including that tile size doesn't really matter for this one. Creating an image at 512 or 768 and upscaling also completely defeats the point, which is that your standard SD-sized generation would only be a component of an image with a different aspect ratio, and not full of body-part concatenation. (I think it might actually be that some quality-related things tend to act as catalysts for drawing people; I'm not sure and I'm going to keep poking away.)
Image height is 1024 and 8 tiles fit in a batch, so a tile height of 128 (1024 / 8) prevents the problem you are having. You can see my settings in the screenshot.
Looking at the command line, MultiDiffusion was doing its thing, probably because of region control. Which, again, to reference the MAIN PAGE FOR THIS REPO, says (with regard to region prompts):
Seriously, do you think the person maintaining this knows less about how it works than you do?
He is probably referring to it in the context of img2img, not txt2img.
That's like saying a guitar player knows more about how an amplifier works.
No. It's like the thing I said. You can't just come up with a different analogy to discredit my first one.
In fact, that is the exact opposite of my original analogy, which is that the artist can utilize the tool better than its creator. It does not imply that the artist has the ability to design or create the tool.
Thank you for making attempts at this. This is a classic noise-pollution problem, where the foreground noise triggers the undesirable multi-character change in the background when your model is not that good at high-resolution image generation. This can be partly mitigated by adding some negative prompts to the background regions. However, that may not solve the problem entirely. I am considering a much more powerful merging strategy and a corresponding UI that lets you fuse images better; you will definitely like it.
Hello, I am trying to use MultiDiffusion to place kemono characters in the background, but the checkpoint I am using requires Hires. fix and a hypernetwork to be enabled by default, otherwise it generates humans. The overall prompt only describes the camera and background, and enables the hypernetwork; I enter character prompts for the foreground region and no prompt for the background region. The first few steps of the denoising process generate kemono normally, but in the end the Hires. fix transforms the character into a human. I tried reducing the Hires. fix denoising value, but that results in fewer and blurrier image details, while increasing the denoising makes the character more like a human. I don't know whether this is due to an incompatibility between Hires. fix and MultiDiffusion, or whether the hypernetwork did not start properly.
I made a trial fix. Please switch to the dev branch and test it. If it works, please let me know promptly.
It doesn't work well. The first image uses MultiDiffusion with Hires. fix denoising=0.7, while the second image does not use MultiDiffusion. I tried turning off Hires. fix when using MultiDiffusion in txt2img and moving the generated blurry image to img2img, but the background details did not increase; to be honest, it was only sharpened, while Hires. fix can add things that were not in the original image. I also tried the other three models offered by the checkpoint author, none of which require a hypernetwork to be enabled. However, two of those models also showed the character-change problem when both Hires. fix and MultiDiffusion were enabled, while the other model was able to generate kemono characters normally. https://civitai.com/models/11888?modelVersionId=32830 It has been verified that the model that works with MultiDiffusion normally is crossfemono2.0, while the models that do not work normally are G, G2, F, and D.
"RuntimeError: Invalid buffer size: 6.89 GB" How to solve it? |
I get 'min and input tensors must be of the same shape' with Tiled VAE.
Could the author show how to generate a realistic-style "Along the River During the Qingming Festival" painting through the interface? That would make it easier for me to understand the tiled diffusion, regional prompts, and drawing full-canvas backgrounds. Thank you very much.
To those of you asking questions on a closed discussion, you need to take some lessons from an old master at the art of asking questions online.
Is there a setting that works with the 16-inch Intel high-end model with 16 GB of RAM and an AMD Radeon Pro 5500M with 6 GB of VRAM? And does it matter which Python and PyTorch versions are used? Currently, the desired image size cannot be created with Python 3.10.12 and PyTorch Nightly 2.1.0. If the R-ESRGAN 4x+ scale exceeds 1.7 at 512 size, the process exits with an MPS out-of-memory error.
MultiDiffusion seems to be doing worse (not sharp), or am I doing something wrong?
original:
MultiDiffusion:
Ultimate SD Upscale: