Posted by u/tanzim31
LTX-2 is amazing: LTX-2 in ComfyUI on RTX 3060 12GB
My setup: RTX 3060 12GB VRAM + 48GB system RAM.

I spent the last couple of days messing around with **LTX-2** inside ComfyUI and had an absolute blast. I created short sample scenes for a loose **spy story set in a neon-soaked, rainy Dhaka** (cyberpunk/Bangla vibes with rainy streets, umbrellas, dramatic reflections, and a mysterious female lead).

Workflow: https://drive.google.com/file/d/1VYrKf7jq52BIi43mZpsP8QCypr9oHtCO/view
I forgot the username of the person who shared it under a post. This workflow worked really well!

Each 8-second scene took about **12 minutes** to generate (with synced audio). I queued up **70+ scenes** total, often trying 3-4 prompt variations per scene to get the mood right. Some scenes were pure text-to-video, others were image-to-video starting from Midjourney stills I generated for consistency.

Here's a compilation of some of my favorite clips (rainy window reflections, coffee steam morphing into faces, walking through crowded neon markets, intense close-ups in the downpour). I cleaned up the audio because it had some squeaky sounds.

**Strengths that blew me away:**

1. **Speed**: Seriously fast for what it delivers, especially compared to other local video models.
2. **Audio sync** is legitimately impressive. I tested illustration styles, anime-ish looks, realistic characters, and even puppet/weird abstract shapes. Lip sync, ambient rain, and subtle SFX/music all line up way better than I expected. Achieving this level of quality on just **12GB VRAM** is wild.
3. **Handles non-realistic/abstract content extremely well**: illustrations, stylized/puppet-like figures, and surreal elements (like steam forming faces or exaggerated rain effects) come out coherent and beautiful.

**Weaknesses / Things to avoid:**

1. Weird random zoom-in effects pop up sometimes. I'm not sure if this is prompt-related or a model quirk.
2. **Action/motion-heavy scenes** just don't work reliably yet. Keep it to subtle movements, expressions, atmosphere, rain, steam, slow walking, etc.; anything dynamic tends to break coherence.

Overall verdict: I literally couldn't believe how two full days disappeared. I was having way too much fun iterating on prompts and watching the queue. LTX-2 feels like a huge step forward for local audio-video gen, especially if you lean into atmospheric/illustrative styles rather than high-action.
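
For anyone planning a similar batch, here is a rough back-of-the-envelope sketch of what the numbers in this post add up to. It assumes exactly 70 scenes (the post says "70+"), a flat 12 minutes per 8-second scene, and no time lost to model loading or failed runs. The function name `queue_estimate` is just made up for illustration.

```python
# Rough queue-time estimate from the figures quoted in the post.
# Assumptions (not measured): exactly 70 scenes, a flat 12 minutes per
# scene, and no overhead for model loading or failed/retried runs.

def queue_estimate(scenes: int, minutes_per_scene: float, clip_seconds: float) -> dict:
    """Return total render time, total footage, and slowdown vs. real time."""
    total_minutes = scenes * minutes_per_scene
    return {
        # Wall-clock hours the GPU spends rendering the whole queue
        "total_hours": total_minutes / 60,
        # Minutes of finished footage produced by the queue
        "footage_minutes": scenes * clip_seconds / 60,
        # Seconds of compute needed per second of output video
        "slowdown_vs_realtime": (minutes_per_scene * 60) / clip_seconds,
    }

est = queue_estimate(scenes=70, minutes_per_scene=12, clip_seconds=8)
print(est)
```

Under those assumptions, the queue works out to about 14 hours of rendering for a bit over 9 minutes of footage, or roughly 90 seconds of compute per second of video. Real totals will be higher once prompt variations and re-runs are counted.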
External link:
https://v.redd.it/oe4t69zj7ydg1