We introduce Sana, a text-to-image framework that can efficiently generate images up to 4096 × 4096 resolution. Sana can synthesize high-resolution, high-quality images with strong text-image alignment at a remarkably fast speed, deployable on a laptop GPU. Core designs include: (1) DC-AE: unlike traditional AEs, which compress images only 8×, we trained an AE that can compress images 32×, effective
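To make the 8× and 32× compression factors concrete, the following minimal sketch computes the latent grid size for a 4096 × 4096 image. It assumes the factor applies to each spatial dimension and counts latent spatial positions only; the helper name `latent_grid` is hypothetical, and any patching or channel details of the actual model are not covered here.

```python
def latent_grid(image_side: int, compression: int) -> tuple[int, int]:
    """Return (latent side length, number of latent spatial positions).

    Assumes a square image and a per-dimension spatial compression factor.
    """
    if image_side % compression != 0:
        raise ValueError("image side must be divisible by the compression factor")
    side = image_side // compression
    return side, side * side


if __name__ == "__main__":
    for factor in (8, 32):
        side, positions = latent_grid(4096, factor)
        print(f"{factor}x: {side} x {side} latent grid, {positions} positions")
    # prints:
    # 8x: 512 x 512 latent grid, 262144 positions
    # 32x: 128 x 128 latent grid, 16384 positions
```

Under these assumptions, moving from 8× to 32× shrinks the latent grid by a factor of 16 (262144 versus 16384 positions).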