Hacker News

In voice assistants, robocalls, e-books, even singing, live voice interpretation/translation... a lot of stuff.


Voice assistants -> Siri sounds just fine

Robocalls -> I want to know I’m speaking to a robot

Audio books -> reasonable. An accurate tone is pleasant

Singing -> ever heard of vocaloids? They’ve existed for at least a decade or two


> Singing -> ever heard of vocaloids? They’ve existed for at least a decade or two

That it was technically already possible does not mean there isn't benefit from improved quality. In fact, Vocaloid itself has been improving and now uses AI.

Would also add making movies, podcasts, news broadcasts, etc. available automatically in a huge range of languages. You wouldn't want movies dubbed by Microsoft Sam (beyond initial comedic effect).


> You wouldn't want movies dubbed by Microsoft Sam (beyond initial comedic effect).

You'd be surprised how common something like this used to be in Poland, though admittedly we used an Ivona voice for this, which was a lot more pleasant.

Having a single narrator narrate the entire movie, overlaying the original audio track, is already common here, much more so than dubbing or subtitles. This is for historical reasons: in the communist era, obtaining the raw audio tracks for dubbing was often impossible, and all the translators often had was a normal copy of the movie in its original language.

In the early 2000s, we had a lot of early / unofficial pirate releases, and they had to be translated into Polish somehow. Subtitles were certainly one method, but as we're all used to the single-narrator style, many people didn't mind listening to a somewhat decent synthetic voice instead.


Games -> AI-powered characters that interact with you in realtime

Commercials/tutorials/corporate training videos -> Voiceover work

TV shows -> Dubbing in various languages

Fast food drive-throughs -> Taking customer orders


> robocalls

E.g. scamming. For anything that is just about conveying information through audio, like voice assistants, traditional TTS already works fine.



