Hacker News

Just out of curiosity: do people really think OpenAI's Sky voice is similar to ScarJo's? While they both vary pitch a lot, Scarlett also adds great dimension by shifting rapidly between different tone qualities. Tone variation seems barely detectable in Sky: Sky sticks to a pure tone, while Scarlett starts on a mildly harsh (but pleasant) tone.


Is there an objective, quantitative metric for comparing two voices, including pitch, tone variation, etc.?


You train a neural network to distinguish between them and measure its accuracy. If they were the same voice, accuracy on the held-out eval dataset wouldn't be better than chance.
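The idea above can be sketched concretely. This is a minimal, hypothetical illustration, not anything OpenAI actually ran: it stands in for voice recordings with synthetic Gaussian feature vectors (imagine MFCCs or speaker embeddings), trains a simple classifier (logistic regression rather than a neural network, for brevity), and checks held-out accuracy. When both "voices" come from the same distribution, accuracy hovers near chance; when they differ, it climbs well above it.

```python
# Hypothetical same-voice test: train a classifier on feature vectors from
# two "speakers" and measure held-out accuracy. Synthetic Gaussians stand
# in for real acoustic features; all names here are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

def eval_accuracy(mean_a, mean_b, n=2000, dim=16):
    """Train on samples from two 'voices' and return held-out accuracy."""
    a = rng.normal(mean_a, 1.0, size=(n, dim))  # features of voice A
    b = rng.normal(mean_b, 1.0, size=(n, dim))  # features of voice B
    X = np.vstack([a, b])
    y = np.array([0] * n + [1] * n)
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.5, random_state=0
    )
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return clf.score(X_te, y_te)

# Identical distributions -> accuracy near chance (0.5).
same = eval_accuracy(0.0, 0.0)
# Distinct distributions -> accuracy well above chance.
diff = eval_accuracy(0.0, 1.0)
print(f"same voice: {same:.2f}, different voices: {diff:.2f}")
```

Note the caveat this makes visible: the result only tells you about the features and classifier you chose, which is exactly the objection raised downthread.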


I find this comment a bit myopic.

First is the belief that the essence of a person's voice can be reduced to quantifiable metrics as inherent and objective as height or, in this case, the shape of one's vocal cords and the resulting pitch. Second is using a glorified function approximator as an arbiter. The favorable outcome for OpenAI would be a classifier that achieves high accuracy in distinguishing the original voice from the impersonation, which they could then cite as evidence that the voices are sufficiently distinct.

In the event of an underperforming classifier, what prevents OpenAI from claiming that the current technology is simply not adequate? On what grounds could one refute the claim that a future, more advanced system, likely trained on more appropriate data, would classify better and thereby favor OpenAI?



