Hacker News

What is the point of them trying to create this? It's easy to see, before building something like this, that it would mostly be used to create disinformation and sow chaos.

Truly irresponsible



There are legitimate uses of this tech, such as preserving the voices of people who are losing them (as with Stephen Hawking), or making it easier for blind/low-vision people to follow text and interact with devices. For that latter case, having a voice that is both more natural and accurate is a good thing.

I use TTS to listen to articles and stories that don't have an audiobook narration available. I've used some of the voices based on MBROLA tech, but those can grate after a while.

The more recent voice models are much higher quality and more emotive (without the jarring pitch transitions of things like Cepstral), so they are better to listen to. However, the recent models can clip or skip text, insert prolonged silences, produce long/warped/garbled words, etc., which makes them harder to use over the long term.


You're right, of course. Unfortunately, however, we're all just actors in a giant, multiplayer, iterated Prisoner's Dilemma here. If I decide not to pursue human-level automated speech generation, or I end up developing it and don't release it because it's "too dangerous," someone else will just come in behind me and take all that market share I could have captured.

It's like we're stuck in some movie that came out in 1994[0], or something. Except, in this version, everything is gonna blow up sooner or later, anyway. Might as well profit from it along the way, right?

Le sigh.

---

[0]: https://www.imdb.com/title/tt0111257/


At least one good use is for video games where the text of some dialogue is only determined at runtime. For example, in a game I work on, player chat is local and voiced by TTS that each player configures for their character.
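A minimal sketch of that setup (all names here are hypothetical, not from any real game or TTS library): each player stores a voice profile for their character, and incoming local chat is rendered through that profile instead of one global narrator voice. The actual synthesis engine is abstracted behind a callable so the routing logic stands alone:

```python
from dataclasses import dataclass

@dataclass
class VoiceProfile:
    """Per-character TTS settings chosen by the player."""
    voice_id: str = "default"   # engine-specific voice identifier
    rate: int = 170             # words per minute
    pitch: float = 1.0          # relative pitch multiplier

class ChatVoicer:
    """Routes chat lines to a TTS backend using the sender's profile."""
    def __init__(self, tts_backend):
        self._tts = tts_backend  # callable handed a synthesis request
        self._profiles: dict[str, VoiceProfile] = {}

    def set_profile(self, player_id: str, profile: VoiceProfile) -> None:
        self._profiles[player_id] = profile

    def speak_chat(self, player_id: str, text: str) -> dict:
        # Fall back to a default voice for players with no profile set.
        p = self._profiles.get(player_id, VoiceProfile())
        request = {"voice": p.voice_id, "rate": p.rate,
                   "pitch": p.pitch, "text": text}
        self._tts(request)  # hand off to the real synthesis engine
        return request

# Usage: a list stands in for the synthesis engine during testing.
spoken = []
voicer = ChatVoicer(tts_backend=spoken.append)
voicer.set_profile("p1", VoiceProfile(voice_id="gravelly_dwarf", rate=140))
req = voicer.speak_chat("p1", "Well met, traveler!")
```

Keeping the profile lookup separate from the engine call means the same routing code works whether the backend is a local formant synthesizer or a neural voice model.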


Move fast and break things (including organized society).

I can't even think of non malicious uses that are anything more than novelty or small conveniences. Meanwhile the malicious use cases are innumerable.

In a just world, building this would be a severe felony, punished with prison and the destruction of all of the direct and indirect source material.


Cancer that takes someone’s voice.


Agreed.

On the one hand, I would love this kind of tech to be available for entertainment purposes. An RPG with convincing NPCs that are able to provide a novel experience for every player? Sounds great.

On the other: this is fraught with ethical problems, not to mention an ideal tool for fraud. At worst, it could be used as a weapon for total asymmetric warfare on concepts like media integrity, and an ideal tool for character assassination, disinformation, propaganda, and so on.

I would happily welcome a world where this stuff is nerfed across the board, where video games and porn are just chock full of AI voice-acting artifacts. We'll adjust and accept that as just a part of the experience, as we have with low-fidelity media of the past. But my more cynical side tells me that's not what people in power are concerned about.


This is what happens when you have an industry full of people "looking for challenging problems to solve" without an ethical foundation to warn them that just because you can build something doesn't mean you should.


The point is to spawn a new medium. You'll have to imagine harder how positive that could be, as people with lots of ideas aren't going to give them to you for free.

Perfecting the tech for widespread use has trade-offs: the need for caller ID, the ease of slander until trust in voice uniqueness recalibrates. All of that is going to change soon anyway, but giving only rich/bad actors the tech at first has its own set of trade-offs. Head in the sand is the irresponsible way.



