Posted by u/Mountain_Platform300
I think I figured out how to fix the audio issues in LTX 2.3
Been tinkering with the official LTX 2.3 ComfyUI workflows and stumbled onto some changes that made a pretty dramatic difference in audio quality. Sharing in case anyone else has been running into the same artifacts, like the typical metallic hiss you hear on many generations.

The two main things that helped:

**1. For the dev model workflow:** Replacing the built-in LTXV scheduler with a standard BasicScheduler made a noticeable difference on its own. Not sure why it helps so much, but the audio comes out cleaner and more structured. Also use a regular KSamplerSelect with res\_2s instead of the ClownsharKSampler.

**2. For the distilled workflow:** Instead of running all steps through the distilled model, I split the sigmas: 4 steps through the full dev model at cfg=3 with the distilled LoRA at 0.2 strength, then 4 steps through the distilled model at cfg=1. The dev model pass up front seems to add more variety and detail, which the distilled pass then refines cleanly, and the audio artifacts basically disappear.

I'm attaching the workflow here for both the distilled and full models if you want to try it. Would love to hear if this helps you out.

Workflow link: https://pastebin.com/wr5x5gJ0
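For anyone who wants to see what the sigma split in point 2 amounts to, here is a rough sketch in plain Python. It is not taken from the workflow itself: the 8-step total is inferred from the 4 + 4 split, `linear_sigmas` is a made-up placeholder schedule (the real values come from whatever scheduler node you use), and `split_sigmas` just mirrors the usual "split at a step index and share the boundary value" behaviour.

```python
def linear_sigmas(steps, sigma_max=1.0, sigma_min=0.0):
    """Hypothetical placeholder schedule: steps + 1 evenly spaced sigma values."""
    return [sigma_max + (sigma_min - sigma_max) * i / steps for i in range(steps + 1)]


def split_sigmas(sigmas, step):
    """Split a schedule at `step`; both halves share the boundary sigma."""
    return sigmas[: step + 1], sigmas[step:]


# Assumption: 8 steps in total, split 4 + 4 as described in the post.
sigmas = linear_sigmas(8)
high, low = split_sigmas(sigmas, 4)

# Settings from the post; the dictionary keys are illustrative, not node inputs.
passes = [
    {"model": "dev", "cfg": 3.0, "distilled_lora_strength": 0.2, "sigmas": high},
    {"model": "distilled", "cfg": 1.0, "sigmas": low},
]

for p in passes:
    # N + 1 sigma values correspond to N sampling steps.
    print(p["model"], "steps:", len(p["sigmas"]) - 1, "cfg:", p["cfg"])
```

The second pass starts at exactly the sigma where the first pass stopped, so the distilled model picks up the partially denoised latent from the dev model without repeating or skipping a noise level.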
External link:
https://v.redd.it/z45nerbidkrg1