Names are a hard problem though. How would it know how "Bach" is pronounced? It seems you would need pretty advanced multimodal AI, some sort of GPT which is trained both on text and audio.
The problem I've had with Siri and music has nothing* to do with parsing individual words. What I've found recently is that if you don't give an exact match Apple just puts in random shit. "Hey Siri, play the album Rubber Soul by the Beatles on Apple Music" gets me random songs by the Beatles because apparently I have Rubber Soul named "Rubber Soul [some edition info]". "Hey Siri, play songs by the band Duran Duran" literally just plays the eponymous album because reasons. You don't need AI, machine learning, GPT, LLM, or whatever fucking buzzword is all the rage, you simply need to revert to behavior that was standard in iOS 15 and earlier. The upgrade to iOS 16 completely nerfed Siri on my phone, starting with some mandatory trial subscription bullshit.
It's to the point where I've given up trying to use Siri while driving.
* Almost nothing. I still have to say "play underground eight zero s on soma fm" because reasons.
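To make the "exact match" complaint concrete, here is a minimal Python sketch of more forgiving title matching. It is not how Siri or Apple Music actually resolve requests; `normalize`, `best_album`, the 0.8 threshold, and the sample library are all made up for illustration.

```python
import re
from difflib import SequenceMatcher


def normalize(title: str) -> str:
    """Lowercase a title and drop bracketed edition info like "[2009 Remaster]"."""
    # Remove anything in square brackets or parentheses (hypothetical edition tags).
    title = re.sub(r"[\[(][^\])]*[\])]", " ", title)
    # Keep only letters, digits and spaces, then collapse whitespace.
    title = re.sub(r"[^a-z0-9 ]+", " ", title.lower())
    return " ".join(title.split())


def best_album(query, library, threshold=0.8):
    """Return the library title closest to the query, or None if nothing is close."""
    q = normalize(query)
    scored = [(SequenceMatcher(None, q, normalize(t)).ratio(), t) for t in library]
    score, title = max(scored, default=(0.0, None))
    return title if score >= threshold else None


# Hypothetical library: the album is stored with edition info in its name.
library = ["Rubber Soul [2009 Remaster]", "Revolver", "Duran Duran"]
print(best_album("Rubber Soul", library))
```

The point is only that stripping edition suffixes before comparing would let a plain "Rubber Soul" find the album, while an unrelated request still returns nothing instead of a random pick.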
Yeah, some others also had the impression that it got worse over time in some aspects. I wonder whether this was some kind of tradeoff with other abilities. Or perhaps they rewrote the code, which had unintended side effects.
Yes, GPT is particularly good at understanding, even when you misspell things. It would make a good front-end interface to something like Siri: take the human input and turn it into something that makes sense to the dumb computer.
Yes, but large language models are very compute intensive and require a ton of RAM. So they wouldn't be able to run locally (this is currently possible with Siri), would be relatively expensive and possibly slow. So they might still be a while off.
The accurate pronunciation in German is irrelevant. Siri is always constrained to one language (Settings > General > Language & Region), and when set to English you get English pronunciations.
Same way Siri understands the English “Los Angeles” even though the G sound is completely different from Spanish.
Lots of Americans seem to try to pronounce Bach the composer "correctly", which leads to batch, bash, buck, ... which is fine, the German hard "ch" is very hard to form for English throats, and it's always better and more polite to at least try than to simply pretend foreign words and names are just weirdly spelled English ones, but it's not as straightforward as with a John Bach from Ohio.
So how would you have to pronounce "Chopin" in "English"? This doesn't make sense. There aren't even consistent pronunciation rules for many genuine English words, like "ead" in "read" and "thread". It's not even straightforward for English speakers to correctly pronounce "Eliezer Yudkowsky". Which means it's even harder for Siri.
Not only Siri — the whole iOS. You can’t type a sentence switching languages in the middle, without changing the keyboard language all the time, if you have autocorrect enabled. It will change what you type into utter gibberish, even though without the “correcting” what you type is perfectly correct. This system is quite visibly designed by people who speak only one language and don’t understand that people may want to use multiple languages at the same time. The keyboard should support a mix of languages, instead of making a XOR between languages, because otherwise when it starts, it’s almost always in the wrong mode, and if it isn’t, it will almost certainly be wrong by the end of what I write.
You’re talking as if there is an accepted standard English pronunciation of Bach. The only one I know is the German one which I would use when speaking English. Perhaps I would soften the ending.
My point is that there's nothing particularly special about this name of foreign origin, compared to any other word. Every word has lots of variations in how they're pronounced.
The audio clip the person posted was for a true German pronunciation, which happened to be very different than how 99% of English-speakers would say it.
I've had multiple teachers teach me different languages (other than English), and not one of them used the English pronunciation of my name. It seems that it's just people who speak English that try to do this.
> it's a name of a concrete person which has only one correct pronunciation.
This is an insane standard. The [x] at the end of the German word doesn't exist in English; most English speakers wouldn't be able to pronounce it if they wanted to. When the demands you're making are literally impossible, the problem is you.
So just because the "th" sound doesn't exist in many languages, like German, they should pronounce "Heath Ledger" or "Anthony Hopkins" or "The Beatles" incorrectly? That seems to me a way more "insane" standard. By the way, the Scottish are perfectly able to pronounce "Loch Ness", which has the same sound for "ch" as "Bach".
> So just because the "th" sound doesn't exist in many languages, like German, they should pronounce "Heath Ledger" or "Anthony Hopkins" or "The Beatles" incorrectly?
They're going to use the sounds that exist for them, yes.
> That seems to me a way more "insane" standard.
I hope you never get to make any decisions. Dave Barry once wrote about someone thinking "What an idiot I am! Here I am, a Japanese person, in Japan, and I can't even speak English!"
But then again, Dave Barry was joking.
> By the way, the Scottish are perfectly able to pronounce "Loch Ness"
The population of Scotland is 5 million; if you want to talk about "most English speakers", the Scottish aren't even worth noticing.
> > So just because the "th" sound doesn't exist in many languages, like German, they should pronounce "Heath Ledger" or "Anthony Hopkins" or "The Beatles" incorrectly?
> They're going to use the sounds that exist for them, yes.
That wasn't the question I asked. They will at least try to pronounce "Heath Ledger" or "Chopin" correctly, they won't act as if there was a correct German way to pronounce those names.
I lived in Japan for a while. My name contains sounds that just didn't work for them. No one pronounced it correctly.
I was not upset, annoyed, or confused. It's just the way language acquisition works. You learn the sounds you need and the rest are hard to acquire later in life.
Be strict in what you send, forgiving in what you receive.
> It's just the way language acquisition works. You learn the sounds you need and the rest are hard to acquire later in life.
As a point of interest, this is actually backwards. You're born recognizing all the sounds; what you learn is to ignore the difference between sounds that aren't distinct in your language.
You do keep that ability for the rest of your life, but it isn't helpful when you try to learn to recognize foreign sounds.
But that's exactly the issue here: people use the correct pronunciation, which happens to differ from how normal words in their language are pronounced, while the voice assistant assumes normal language, which leads to absurd misfirings. The issue is not that people don't know how to pronounce something; the problem is that it's hard for "dumb" AIs to know how a certain name is pronounced, as long as they are not multimodal LLMs.
I think there's something about sounds that you learn early on in language acquisition - maybe your brain develops differently.
'th' is the obvious one that non-English speakers struggle with. I remember a Dutch guy laughing at my attempts at various Dutch words - I literally could not hear the difference between his pronunciation and mine.
And 'ch' (as in Loch or Bach) is a sound in Scottish English but not in English English.
I lived in Scotland till I was 4, then moved to England and all traces of my previous Scottish accent are long long gone. But my friend, whose surname is Donnachie, says I'm the only English person she's met who pronounces her name correctly - I guess because I learnt that sound early on.
Similarly, my dad, who learnt English in India, still struggles with a "j" sound (he says "zudge" instead of "judge"), despite living here for 50 years and having a posh middle-class English accent that sounds just like a "native" English speaker.
I don't know if "th" exists in Polish or not, but a common (perhaps dominant) spoken way to refer to "The Beatles" is[0] "Bitelsi", which not only loses "th", but also like half the other sounds in the name[1].
Thing is, we understand it just fine. More than that, if you overheard me saying to someone, "puść teraz Bitelsów" ("put on the Beatles now"), there's a good chance you'd identify the name from context. If you didn't, you could always ask to verify (well, not if you were actually overhearing me...).
----
[0] - Or at least would look like that written down. Polish is mostly a "you say it as you see it" language, but with foreign names, often enough people write the correct form but use localized pronunciation.
Want to flummox the Japanese tongue? Try a sentence like "Darth Vader is Luke's father". It hits most of the highlights: interdentals, labiodentals, and that weird 'r' sound English has that Japanese sometimes tend to conflate with 'l'. Even a competent Japanese English speaker is likely to render it as "Dāsu Bēdā izu Rūkusu fazā". Depending on the region they may mess up the 'f'; the syllable 'fu' is actually 'hu', but pronounced with very pursed lips in Tokyo Japanese (not so much in Kansai).
Unless they're bilingual from childhood, most people are not able to pronounce sounds outside their milk tongue without difficulty. That you expect English sounds to be perfectly pronounceable by non-English-speakers is probably more reflective of the fact that quality English education is widely available where you live than anything.
That is completely wrong. People have many names in practice, especially historical persons. Even living people often present themselves differently in different languages.
For some examples:
- the famous Romanian/French modern sculptor is known both as Constantin Brîncuși (the Romanian form, which uses a vowel that has no direct correspondent in either French or most dialects of English, and palatalizes the ending sh, so that it's pronounced in two syllables, brîn-cush, with a slightly pronounced ee at the end) and as Brancusi (in French, roughly bran-cu-see).
- in Japanese, since Japanese speakers have relatively few syllables they are familiar with, almost all foreign names are expected to be Japanized; for example, if your name is "Stephen", you would be expected to present yourself as, roughly, "su-tee-ve-n", and write your name with the corresponding katakana characters in certain official documents
These products are just hilariously bad.