
I got curious about something with tonal languages: how are song melodies written for them? For some, the melody matches the tones of the words. Then there's Mandarin. Mandarin just follows the melody, and you can figure out the word by context. As an English speaker, this makes ~cents~ sense. Homophones aren't a big deal. If Mandarin doesn't need tones in lyrics, why does it need them normally?


Interestingly, Cantonese songs tend to preserve tone better than songs sung in Mandarin. The paper "Tone and Melody in Cantonese" by Marjorie K.M. Chan [1] mentions the following:

> For Chinese, modern songs in Mandarin and Cantonese exhibit very different behaviour with respect to the extent to which the melodies affect the lexical tones. In modern Mandarin songs, the melodies dominate, so that the original tones on the lyrics seem to be completely ignored. In Cantonese songs, however, the melodies typically take the lexical tones into consideration and attempt to preserve their pitch contours and relative pitch heights.

[1] https://journals.linguisticsociety.org/proceedings/index.php...


Tangential fact - while, as you described, most popular songs in Cantonese have tones matching the melody (usually the melody is written first, then lyrics are filled in), many songs from Christian churches don't follow this practice. (I don't know why, maybe a lack of lyricists for translation from English/Latin to Cantonese during earlier years?)

So, in Hong Kong, when somebody writes a song/lyric that doesn't quite have matching tones, we ask "which church are you from?" to make fun of it.


> many songs from Christian churches don't follow this practice. (I don't know why, maybe a lack of lyricists for translation from English/Latin to Cantonese during earlier years?)

If you want the Cantonese lyrics to be translations of English lyrics, it severely limits the words you can choose, to the point that it is impossible to fully match the melody.


It needs them because there are too few unique syllables in Mandarin. I'm sure a linguist can provide the proper terminology, but there are only around 400 unique sounds in Mandarin, ignoring tones. Even adding five tones still only increases this to ~1500 (not all are used). Compare this to English, where estimates are in the 10-15k range.

There are therefore an enormous number of homophones in Mandarin, which makes it very challenging to comprehend without context. I've often had native speaking friends eavesdrop on a conversation, only to tell me that they're not sure what is being discussed.

It also means that the language cannot be usefully written phonetically, and thus unique characters are required.

some discussion here:

https://chinese.stackexchange.com/questions/40574/why-does-m...

https://chinese.stackexchange.com/questions/39695/does-chine...

https://chinese.stackexchange.com/questions/14596/how-many-s...


Having few unique syllables doesn't mean tones are required, since syllables can be combined. Most Mandarin words are disyllabic or longer, and 400×400 = 160k is enough combinations for a quite large vocabulary.

Unique characters being required to distinguish homophones in modern written Mandarin is mostly a circular effect due to the characters already being available, so people use them in ways that would be ambiguous when read aloud (as intentional puns or simply to be more concise.)

If there had been no preexisting writing system and written Mandarin was a simple transcription of spoken Mandarin, introducing characters would be about as helpful as indicating the Indo-European roots of words in English writing, which is to say that some people might get a feeling of epiphany after realizing the connection between seemingly disparate words, but it would hardly be practical for everyday use.
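As a rough sketch of the 400×400 arithmetic above: the snippet below takes the thread's figure of about 400 toneless syllables as given (an approximation, not an exact count) and treats every syllable string as usable, which overstates the real vocabulary space since not every combination is a word. The function name is illustrative only.

```python
import math

TONELESS_SYLLABLES = 400  # approximate figure quoted in the thread


def combinations(inventory: int, length: int) -> int:
    """Number of distinct syllable strings of the given length."""
    return inventory ** length


for n in (1, 2, 3):
    count = combinations(TONELESS_SYLLABLES, n)
    bits = math.log2(count)  # information one such string could carry at most
    print(f"{n} syllable(s): {count:,} combinations, about {bits:.1f} bits")
```

Under these assumptions, going from one syllable to two roughly doubles the number of bits available, which is the sense in which longer words can stand in for tones.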


Evidence that Chinese can be perfectly understandable written without the use of characters can be seen in the Dungan language (https://en.wikipedia.org/wiki/Dungan_language), which can be considered a dialect of Mandarin Chinese, but is written in the Cyrillic alphabet.

> Unique characters being required to distinguish homophones in modern written Mandarin is mostly a circular effect due to the characters already being available, so people use them in ways that would be ambiguous when read aloud (as intentional puns or simply to be more concise.)

Indeed, because of the way Dungan is written, it ended up evolving differently with respect to how new vocabulary is derived, often borrowing words phonetically from Russian instead of constructing them from Chinese morphemes that might otherwise be considered ambiguous when used individually.


>Most Mandarin words are disyllabic or longer, and 400×400 = 160k is enough combinations for a quite large vocabulary.

While true, I'd bet that some combinations dominate because they sound better/are easier to pronounce.

Also just because you can technically differentiate 160k sound pairs doesn't mean you can do it in a noisy environment.

Japanese and Korean have a similarly limited number of syllables and have very long words compared to English. I'm guessing because they don't have tones.

If you look at communication theory you don't only need distinct sounds, you also need error correction. Which requires extra bits of redundant information.

Tones just make it possible to carry extra bits.

Longer strings of syllables like in Japanese and Korean do the same.

More complex syllables, like in English, too.

It's just multiple different ways of carrying enough bits in speech to work in a noisy environment.

Another analogy could be password strength. You can have a very long numeric password (Japanese & Korean), A password with a mix of a-zA-Z0-9 of medium length (English). A password with weird special characters but shorter (Chinese), and they all end up having the same entropy (given that the password rules are known to the attacker).
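The password analogy can be made concrete with a minimal sketch. For a uniformly random string, entropy is length × log2(alphabet size). The alphabet sizes and the 64-bit target below are arbitrary illustrations, not measurements of any language, and the helper names are made up for this example.

```python
import math


def entropy_bits(alphabet_size: int, length: int) -> float:
    """Bits of entropy in a uniformly random string over the alphabet."""
    return length * math.log2(alphabet_size)


def length_needed(alphabet_size: int, target_bits: float) -> int:
    """Shortest length whose entropy reaches target_bits."""
    return math.ceil(target_bits / math.log2(alphabet_size))


TARGET = 64.0  # arbitrary target, in bits

# Hypothetical alphabets: digits only, letters plus digits, printable ASCII
for name, size in [("0-9", 10), ("a-zA-Z0-9", 62), ("with symbols", 94)]:
    n = length_needed(size, TARGET)
    print(f"{name}: length {n} gives {entropy_bits(size, n):.1f} bits")
```

A smaller alphabet needs a longer string to reach the same number of bits, which mirrors the trade-off described above between syllable complexity, tones, and word length.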


There are languages with even smaller sets of unique sounds that do not have tones, like Hawaiian: https://en.wikipedia.org/wiki/Hawaiian_phonology

There are many common homophones in English that are distinguished by context. Conversation tends to have a lot of context. As a Mandarin speaker, I've never really experienced this context problem. You can make up some artificial examples in English and Chinese but they don't really reflect average communication. Like "The bat and the bow are on the table". It is important to know that a good percentage of words in Mandarin are multi-syllabic (not just one character).

Mandarin can be written phonetically perfectly fine. Currently the most popular systems are Hanyu Pinyin (used in China, Singapore and Malaysia) and Zhuyin (used in Taiwan). Kids learn these systems in school before they learn characters. Chinese characters have a strong historical and cultural value, that's why they are still around.


People simply have a LOT of romanticized bullshit views built around Chinese characters, or the relative difficulty of different ways of writing because they're fluent in the language and have spent thousands of hours immersed in a sinograph-based writing system. Of course a different writing system is difficult to read even if it's ultimately much easier to learn, you have no practice! It's like writing English in Latin script vs. writing it in runes, both work fine, but we're practiced on recognizing words in Latin script. ᛖᛚᛞᛖᚱ ᚠᚢᚦᚨᚱᚲ, ᚾᛟᛏ ᛋᛟ ᛗᚢᚲᚺ.

Vietnamese is written in alphabet without issue. The Dungan people of Kyrgyzstan and Kazakhstan even write their Mandarin-descended language with the Cyrillic script without any tone markings at all - the tones are supplied in a dictionary, but that's it. It works.

Most of the homophones etc. stuff come from people having decided that sinographs are good and then coming up with justifications for keeping them, not really an actual analysis whether Sinitic languages or Japanese would work without. This is a Chinese dictionary: https://imgur.com/a/rdxVh9i

> Mandarin can be written phonetically perfectly fine.

To reinforce this to the readers: https://www.pinyin.info/readings/pinyin_riji_duanwen.html

The author is a native Mandarin speaker who specifically requested that her work not be rendered in sinographs. It should be standard Pinyin orthography except that the author writes 'de' as 'd'.


> Of course a different writing system is difficult to read even if it's ultimately much easier to learn, you have no practice!

Yes, people often confuse the "way I do it", "the way it's always been done" or the "official way" as the only way it can be done.


You could also write English as an abjad with no vowels but no sane person would consider it. You can aslo splel einlsgh lkie tihs and msot people colud raed it flriay esilay.[4] The fact that your type demands Chinese writing to not only be phonetic but also not have tones is pretty telling that your motivation for using phonetic writing has pretty much nothing to do with "it's easier" or "it's phonetically regular" but just from some sort of disdain for the Chinese language in general. These sorts of phonetic reforms also require writing in a style that is essentially newspeak on steroids, such as your second source, which uses no vocabulary above maybe a 2nd grade level, and yet still I couldn't figure out what some of the words were supposed to be.

Here's another quote from the source you use:

> "There is no doubt that romanized Classical Chinese would be gibberish"

Invariably these proponents of phonetic writing for Chinese are non-native speakers[1] from the west who seem to have an intense hatred for any aspect of the Chinese language that they consider "Classic Chinese" derived[3]. This of course extends to any sentence that goes beyond "where's the bathroom" and "hello my name is bob" except not even the second example because Chinese names are what these people would consider "classical derived". So you propose a system that would not be able to transcribe __names__. Go to Korean wikipedia and click on a disambiguation page[0]. Or go ask them to show you their ID card[2]. These are a people whose entire national identity is based around not using Chinese writing. A lifetime of both native chinese speakers and non-chinese alike not being able to pronounce my NAME right when rendered in Pinyin is apparently not evidence enough that it's an inadequate system.

> This is a Chinese dictionary: https://imgur.com/a/rdxVh9i

You also leave out that double digit percentages of the Dungan language comes from Arabic and Persian, Russian, Turkic etc. Not even their names are Chinese. What little Chinese is left is a fraction of the amount of Chinese morphemes a normal Chinese speaker knows. Even in your example the entry for "da" has 10 semantically, phonetically, and etymologically different entries. The PRC also tried to enforce phonetic writing on the Yi and Zhuang languages, which had their own scripts that work on the same principles as Chinese. The result was low literacy rates and a population that predominantly still used the old writing system.

I could very well turn your argument against you. Why doesn't English spell pique, peak, peek the same? Pours, pores, poors? Why did a phonetic writing system slowly evolve into what is essentially a logographic script. Why were you able to read the above example relatively easily, but sdrow eht esrever I fi ylkciuq sa ylraen ton? It's almost as if mature readers of all scripts focus primarily on morpheme clusters when reading, and whatever gains you have from supposedly phonetically regular spelling are offset by that, assuming no pronunciation differences of course. By the time you force everyone to either memorize the "proper" pronunciations or simply force them to only use your privileged dialect your orthography will already be out of date. You can reform again, but by then your lexicon will be so etymologically and semantically starved[6] that you'll probably have to construct all your technical terms from some dead language with a stable orthography anyways.

> an actual analysis whether Sinitic languages

It's called general Chinese. The only phonetic system that works for most dialects, and whose spelling requires the same amount of memorization as writing with logographic characters. Of course if your kind had your way, by the time you could force it on every Chinese speaker it would be out of date and not even regular anymore. Of course these discussions usually don't even touch on the concept of morpheme regularity.

Of course all this text is useless because you probably don't speak Chinese well enough to evaluate any primary source, and the motivation for all this is less rational and more a personal vendetta you non-native speakers hold against Chinese being "too hard to learn"[5]. What's funny is it's the same sentiment you expats have for Vietnamese and Korean, Arabic or even Dutch. Even if we lobotomize our language for your sake you'll simply demand we all adopt English anyways.

[0] https://ko.wikipedia.org/wiki/%EC%88%98%EB%8F%84_(%EB%8F%99%...

[1] or some sort of deranged newspeak proponent, usually diaspora

[2] https://learn.microsoft.com/en-us/answers/questions/815368/acceptable-types-of-identification-%28az-900-test%29?orderby=newest

[3] Usually the argument against 施氏食獅史, somehow a several sentence long story every native Chinese reader would understand being rendered as gibberish shi shi shi shi shi shi, or maybe shi Shi shi shi shi if you're generous, is a totally reasonable reform in your eyes.

[4] https://www.ddginc-usa.com/can-you-read-this.htm

[5] Not limited to language apparently, no cultural differences can be tolerated by you globalist types. Even chopsticks compel your type to proclaim > "Really? A fork and a spoon is far more superior. It shocks me that chopsticks are still used and that people like using them" https://news.ycombinator.com/item?id=35877051

[6] > Romanticized bullshit views built around Chinese characters.

Leads to Oxymoronic statements where Refusing To "Romanize" is because of "Romanticism". How absurdity like this is supposed to be easy for non-native learners and native children to grasp is beyond me.


I'm surprised you've never experienced this. Even names often require an explanation, because the pronunciation is insufficient to convey which words (i.e. characters) are used.

>Mandarin can be written phonetically perfectly fine

It can, and I use hanyu pinyin daily, but my point is that given the small space of possible sounds, it often has a great deal of ambiguity, and is mentally taxing to read. Have you ever tried reading an essay or book in pinyin? With syllabic spacing? There will be many places where it is simply not possible to know for certain what a particular word is. And then there are text books, scientific books.

Chinese characters do indeed have strong historical and cultural value, but that is not why they are still around. They are still around because they are essential to the written language.


> Even names often require an explanation

You can still write the name in Hanyu Pinyin or Zhuyin perfectly fine. It is just that we like character names and that most characters are valid to be used in names so there is a lot more flexibility in what can be a name versus other cultures where there is a less flexible set of names. You can still do something similar in English where you say your name is "rainbow" but you spell it "rhaynbeau", people aren't going to be able to guess that.

> given the small space of possible sounds

Again, see languages like Hawaiian and Vietnamese. They also have small sets of sounds and do fine with romanization.

> Have you ever tried reading an essay or book in pinyin? With syllabic spacing?

Yup, it is just that most people are used to reading Chinese characters and not romanized Mandarin. There may be other advantages to Chinese characters like quicker recognition and occupying a smaller space, and I am not trying to advocate for eradication of Chinese characters, but I want to stress that it is perfectly possible to read and write Mandarin phonetically and characters are not essential.

Also I read and write Taiwanese (Hokkien) in romanized form. Feels like a waste of time to worry about characters, but many people do and end up not writing Taiwanese or using mixed script.


Every forum post I've seen mentioning 白話字 and 台羅 mentions how hard it is to read and how few Hokkien speakers can even read it. The few proponents for it seem to be holding on for religious reasons (Presbyterians).

>You can still do something similar in English where you say your name is "rainbow" but you spell it "rhaynbeau",

This is an insulting borderline racist comparison and ties to the same old western trope of treating our names like random sounds. "rhaynbeau" Isn't a word and doesn't carry any meaning.


> Every forum post I've seen mentioning 白話字 and 台羅 mentions how hard it is to read and how few Hokkien speakers can even read it. The few proponents for it seem to be holding on for religious reasons (Presbyterians).

I am not sure what you mean by holding on for religious reasons? There are a lot of reasons to write Taiwanese. Anyway, I don't know anything about these forums or have Presbyterian affiliation, but in my real life I use it quite often with friends and family. The reason few people can read it is because few people have learned it. For the majority of Taiwanese speakers it is only a spoken language. Written Taiwanese does not play a large role in public education in Taiwan.

> "rhaynbeau" Isn't a word and doesn't carry any meaning.

It's an imperfect example for non-Chinese speakers to illustrate that it can be hard to guess the characters of another person's name but people still understand the sounds when hearing it. A lot of thought goes into choosing the characters for a Chinese name. Other cultures have names that are not related to meaning or are separated very far from the original meaning (the words are for names). Others allow variations on previous names or borrowing from other languages, so likewise it might be challenging to know the spelling of those names.


> They are still around because they are essential to the written language.

This argument used to be made in Korea, yet the country seems to have transitioned to alphabetic writing without issue. A lot of the tax of reading phonetic scripts of Chinese or Japanese is that fluent speakers are simply not at all used to it, even if they can read it.

For example:

ᚦᛁᛋ ᚨᚱᚷᚢᛗᛖᚾᛏ ᚢᛋᛖᛞ ᛏᛟ ᛒᛖ ᛗᚨᛞᛖ ᛁᚾ ᚲᛟᚱᛖᚨ, ᛃᛖᛏ ᚦᛖ ᚲᛟᚢᚾᛏᚱᛃ ᛋᛖᛖᛗᛋ ᛏᛟ ᚺᚨᚹᛖ ᛏᚱᚨᚾᛋᛁᛏᛁᛟᚾᛖᛞ ᛏᛟ ᚨᛚᛈᚺᚨᛒᛖᛏᛁᚲ ᚹᚱᛁᛏᛁᛝ ᚹᛁᚦᛟᚢᛏ ᛁᛋᛋᚢᛖ. ᚨ ᛚᛟᛏ ᛟᚠ ᚦᛖ ᛏᚨᚲᛋ ᛟᚠ ᚱᛖᚨᛞᛁᛝ ᛈᚺᛟᚾᛖᛏᛁᚲ ᛋᚲᚱᛁᛈᛏᛋ ᛟᚠ ᚲᚺᛁᚾᛖᛋᛖ ᛟᚱ ᛃᚨᛈᚨᚾᛖᛋᛖ ᛁᛋ ᚦᚨᛏ ᚠᛚᚢᛖᚾᛏ ᛋᛈᛖᚨᚲᛖᚱᛋ ᚨᚱᛖ ᛋᛁᛗᛈᛚᛃ ᚾᛟᛏ ᚨᛏ ᚨᛚᛚ ᚢᛋᛖᛞ ᛏᛟ ᛁᛏ, ᛖᚹᛖᚾ ᛁᚠ ᚦᛖᛃ ᚲᚨᚾ ᚱᛖᚨᛞ ᛁᛏ.

Same normal English, but Elder Futhark as the script. If you grew up reading that you'd read without issue. Now? It's a pain.


[flagged]


You've broken the site guidelines badly with this flamewar post, even stooping to personal attack. That's totally not ok.

I'm not saying that your points on the underlying topics are wrong–for all I know you're 100% right—but you can't abuse HN like this, no matter how right you are or feel you are. As you've broken HN's rules many times in the past and ignored our repeated requests to stop, I've banned the account.

https://news.ycombinator.com/item?id=33904225 (Dec 2022)

https://news.ycombinator.com/item?id=27830573 (July 2021)

https://news.ycombinator.com/item?id=22713311 (March 2020)

https://news.ycombinator.com/item?id=22591936 (March 2020)

https://news.ycombinator.com/item?id=20712243 (Aug 2019)

https://news.ycombinator.com/item?id=20191623 (June 2019)

It's a pity, because you're clearly knowledgeable on some of these topics and I hate to ban a knowledgeable user. But we don't have a choice when people break the rules like this and don't respond to warnings. If you don't want to be banned, you're welcome to email hn@ycombinator.com and give us reason to believe that you'll follow the rules in the future. They're here: https://news.ycombinator.com/newsguidelines.html.


> but there are only around 400 unique sounds in Mandarin, ignoring tones. Even adding five tones still only increases this to ~1500 (not all are used). Compare this to English, where estimates are in the 10-15k range.

Sure, but English needs them because there are only around 26 letters; compare this to Mandarin, where estimates are in the 400 range.


Why would a small number of letters make English need more syllables?

If anything, letters being overloaded limits the number of syllables we can express.


I was mostly just joking/pointing out that the comparison's a bit.. not to say 'apples and oranges', but a bit arbitrary, it seemed to me could just as well be comparing letter count - or even that that makes more sense as a comparison for (disregarding tone) characters, but still arbitrary, the languages just work differently.


Just because you can figure it out by context in songs (rarely upon the first listen, mind you), that doesn’t mean the added cognitive load isn’t excessively burdensome in everyday speech.


Realizing no one's going to change a language with 900 million speakers, do you think it's because there's a lot of ambiguity, or is it because it's a cognitive load people aren't used to? Mandarin is a newer language than Cantonese, and it has fewer tones. Languages tend towards laziness, so I wonder if it settled on the right number, or if it's an ongoing trend.

Edit: About languages losing features, English used to be declined like German or Latin. Only pronouns are declined in modern English, and we don't usually teach it as "pronouns are declined."


> Mandarin is a newer language than Cantonese

Both languages descended from a common ancestor, so you can't necessarily say that one is newer than the other. However, it is the case that Cantonese preserves several features that Mandarin has lost, in particular the complete inventory of final consonants and all of the tone categories of Middle Chinese, which makes it seem better suited for reciting 1000+ year old Tang dynasty poetry where rhyming and tones were especially important.

On the other hand, Cantonese has lost other features that Mandarin has preserved (such as medial vowels and the three-way distinction of initial sibilant consonants), but these features aren't as critical with respect to reciting Tang poetry. For this reason, Cantonese may seem "older" than Mandarin, even though in reality, it's simply that they each have preserved different features and the features that Cantonese preserved happened to make it better for reciting old poetry.

> Languages tend towards laziness, so I wonder if it settled on the right number, or if it's an ongoing trend.

All languages change and will continue to change over time, and while laziness may drive changes in some features of a language, often times other parts of the language become more complex to compensate. This process is called grammaticalization, and is thought to occur in cycles: http://websites.umich.edu/~jlawler/TheGrammaticalizationCycl...


Just as humans are newer than, say, some monkeys because they came later, even though we evolved from the same ancestor, it is a matter of fact. The recitation of Tang poetry and the greater complexity say a lot about this. Mandarin came later.


Stop embarrassing yourself in public.


I suspect what's going on here is that in music, it doesn't matter if you understand it right the first time.

How many songs do you misunderstand the lyrics for on the first few listens, in your native language? For me, in English, I either can't tell exactly what they're saying for some proportion of lyrics, or just totally mishear them _quite_ often (especially depending on the genre).

Music doesn't require every word to be perfectly understandable. Communication does, ideally.


> I suspect what's going on here is that in music, it doesn't matter if you understand it right the first time.

So there's this

https://www.youtube.com/watch?v=pdz5kCaCRFM

and more interestingly

https://www.youtube.com/watch?v=-VsmF9m_Nt8


Languages often lose features but they also gain features. Complexity of language is hard to compare, but we can still find many examples.

Modern English has less complex verbal morphology and noun declension (as you mentioned, only in pronouns). But the set of vowels in Modern English is more complex than that of Old English. Also the vocabulary of Modern English has two main sources: Germanic words (native) and French/Latin/Greek words where a single idea can be expressed in either vocabulary source with different nuances. Old English was mostly comprised of Germanic words with some words borrowed from Latin.

Another interesting thing to note is that languages without tones can gain tones (tonogenesis) and in Old Chinese tones played a much smaller role than the modern descendants. This is often the result of syllables/sound systems becoming less complex and losing contrast so the tone of the word becomes contrastive to maintain a distinction between words.


From my understanding Mandarin has a lot of two-syllable words and in many of the words the second syllable doesn't add much, if any, additional meaning.

Contrast that with Cantonese, which I believe still uses a single syllable for most words. (Someone please correct me if I'm wrong)

So it makes sense with fewer tones, because you have more syllables to disambiguate.


> which I believe still uses a single syllable for most words

Not sure about "most" (depends on the sample distribution I suppose), but single syllable (i.e. character) words are used much more often relative to Mandarin.

So in general you're probably right. Not sure whether that is a cause of more strict adherence to tones in songs though. It could be alternatively argued that the more complex syllable (due to more tones among other things) in Cantonese allowed it to retain single syllable words without having to add extra syllables to clarify any ambiguities.


> It could be alternatively argued that the more complex syllable (due to more tones among other things) in Cantonese allowed it to retain single syllable words without having to add extra syllables to clarify any ambiguities.

Yes, pretty much.



