Hacker News

Image gen is streets ahead of music in terms of control, as long as you stick to FOSS tools; DALL-E is too limited. I'm only an observer for now and haven't actually used it much, but both Stable Diffusion and SDXL support ControlNet and a bunch of other extensions that let you, for example, draw a stick figure in a specific pose and have the AI generate a realistic man in that pose, or edit one specific part of the generation and keep iterating from there.

The day we get a similar level of control over AI music will be a dream come true for me. We really need stems, or at least MIDI files, for these tools to be more than soulless jingle generators, imo.



I've been using Krita with the Stable Diffusion plugin, and it's pretty amazing to use at times. I often read critics say things like 'you can't do layers with generative AI' and, uh, no? You can. Granted, you can't yet generate, say, a shadow with adjustable alpha transparency, but that hardly seems impossible for the technology eventually. Assuming the tools won't improve is like looking at MacPaint and declaring that digital art will never be a thing because it will always be low resolution and monochrome.

What I'd love is Suno/Udio as a VST plugin: being able to supply MIDI or audio samples to pull melodies from, or to generate from arbitrary audio on a timeline.


To that end, look up LayerDiffusion. It works amazingly well.




