Wednesday, November 11, 2009

Why is Chinese hard?

I have been trying to learn Chinese since I got married a long time ago. It hasn't been easy, but at least now, as a full-time student at NCKU, I'm making progress.

There are many spoken Chinese dialects, each with its own pronunciation and tones, but they all share the same written language of characters. The dialect used for official business and education in both Taiwan and mainland China is the Beijing dialect, because Beijing was the capital of China at the founding of the Republic in 1911. In Taiwan it is called Guoyu (National Language); on the mainland it is called Putonghua (Common Speech). They are basically the same.
In what follows I will refer to this Beijing dialect as the Chinese spoken language.
 
I have some ideas on why Chinese is such a hard language to learn. I think it is because Chinese is two almost separate languages: the spoken language and the written language.

The Spoken Language

Like all languages, Chinese has a fixed number of sounds that make up words. In Chinese there are 21 initial consonants and 16 final vowels. The consonants and vowels together make up an alphabet. Although there are several romanizations that specify this alphabet, they are all just different symbols for the same sounds used in the Chinese language. The Chinese spoken language exists independently of any romanization; all romanizations are just teaching aids for foreigners and children.

An initial consonant and a final vowel together form a syllable. Each syllable can have one of 4 possible tones. A word is either one or two syllables, with a character for each syllable. A good dictionary of Chinese might have 20,000 characters and 200,000 words or phrases, similar to the number of words in an English dictionary.

There is a grammar for ordering words into sentences. And that is pretty much all that is needed to describe a spoken language. Every spoken language has to have these components: a fixed set of sounds, a vocabulary of words and a grammar for making sentences.

One difficulty with Chinese for foreigners is that the tones are hard for those whose mother language is not tonal. Recognizing and reproducing the tones takes practice. But there are lots of tonal languages: all of the dialects of Chinese are tonal and, for example, all of the tribal languages of Liberia are tonal. I am particularly bad at tones.

The Written Language

For most European languages, words are written phonetically. Basically if you know the sounds of the alphabet, you can sound out the written word. Even for languages which had no written language, once an alphabet was devised, words could be written as they sounded in the spoken language.

Chinese is not this way. With a tradition of maybe 5000 years, the Chinese written language has developed with individual written characters for each word. At times there may be clues on how a word is pronounced from its character but that seems coincidental. So the written language has little connection to the spoken language. The grammar and vocabulary are of course the same, but the sound of a word is divorced from how it is written. So there is no "phonics" reading program for learning to read Chinese characters.

So whereas in a phonetic language reading and speaking reinforce each other in building vocabulary, this is not the case with Chinese. A literate person must learn to recognize and write each of the 20,000 characters in a good Chinese dictionary, one at a time. That's tough. And so you have a range of literacy depending on how many characters a person can recognize. I am one step above being illiterate.

Computers to the Rescue

Before the digital age, learning Chinese must have been the most tedious job on Earth. You have to be impressed by students who stuck with it and wonder about the people who did master the Chinese written language. On the other hand, there weren't as many distractions as today, so maybe people had the time it required.

The tedium of looking up characters and memorizing them is just what computers do best. Nowadays a character can be found with just a few keystrokes of its romanization. Once the user enters how the word is pronounced, the computer, cell phone or PDA presents a list of possible characters. Frequency tables allow the device to present the list with the most frequently used characters first. For non-literary tasks like texting, this is all that is needed.
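The lookup described above can be sketched in a few lines of Python. This is only a toy illustration, not how Wenlin or any phone actually works: the entries, the usage counts, and the names `ENTRIES`, `build_index` and `lookup` are all invented for the example, and the romanizations ignore tone.

```python
from collections import defaultdict

# Hypothetical toy data: (romanization, character, made-up usage count).
# The counts are invented for illustration, not real frequency statistics.
ENTRIES = [
    ("ma", "馬", 120),
    ("ma", "媽", 300),
    ("ma", "麻", 80),
    ("ma", "罵", 60),
    ("shu", "書", 250),
    ("shu", "樹", 150),
]


def build_index(entries):
    """Group characters by romanization, most frequently used first."""
    index = defaultdict(list)
    for roman, char, freq in entries:
        index[roman].append((char, freq))
    for candidates in index.values():
        # Sort each candidate list by usage count, highest first.
        candidates.sort(key=lambda pair: pair[1], reverse=True)
    return dict(index)


def lookup(index, roman):
    """Return the candidate characters for a typed romanization."""
    return [char for char, _ in index.get(roman, [])]


index = build_index(ENTRIES)
print(lookup(index, "ma"))  # ['媽', '馬', '麻', '罵']
```

A real input method would also use tones and the surrounding words to narrow the list, but the basic idea is the same: type the sound, pick the character.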

I have such a dictionary program, called Wenlin, on my Linux box. It's more than enough for me. I also have a similar dictionary on my iTouch.

Homonyms

So why not just chuck the whole character world and go with the romanizations? The problem is homonyms: different words that sound exactly alike. For example, in English, "buy", "bi", "bye" and "by" are all homonyms.

We can get a rough estimate of the number of homonyms in Chinese by doing a "back of the envelope" calculation.

Number of distinct syllables in Chinese = number of initials (21) x number of finals (16) x number of tones (4) = 1344

Some of these 1344 potential syllables are not used. So there are too few syllables to specify the 20,000 distinct characters in the Chinese vocabulary.
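The same estimate can be written out as a short Python sketch, using only the figures quoted in this post (21 initials, 16 finals, 4 tones, 20,000 characters). The function names are just made up for the example, and since not every combination is actually used, the real average is higher than what this prints.

```python
# Back-of-the-envelope estimate using the figures from this post.
INITIALS = 21        # initial consonants
FINALS = 16          # final vowels
TONES = 4            # tones per syllable
CHARACTERS = 20_000  # characters in a good dictionary


def potential_syllables(initials, finals, tones):
    """Upper bound on distinct toned syllables: every combination counted."""
    return initials * finals * tones


def average_characters_per_syllable(characters, syllables):
    """How many characters must share each syllable, on average."""
    return characters / syllables


syllables = potential_syllables(INITIALS, FINALS, TONES)
average = average_characters_per_syllable(CHARACTERS, syllables)

print(syllables)          # 1344
print(round(average, 1))  # 14.9
```

So even if every one of the 1344 combinations were in use, each syllable would have to carry about 15 characters on average.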

For example, my simple student computer dictionary, Wenlin, has

                        distinct characters
yi (first tone) ................  27
yi (second tone) ...............  70
yi (third tone) ................  36
yi (fourth tone) ............... 157

"yi" might be a worst-case scenario, but it illustrates the problem.

So no writing system as simple as a romanization will ever capture the complexity of the Chinese vocabulary. Chinese characters will be with us for as long as there are Chinese, so that means forever.
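To put a number on how crowded "yi" is, here is a small Python tally of the Wenlin counts quoted above. The variable names are just for the example; the counts are the ones from my dictionary.

```python
# Characters pronounced "yi" in Wenlin, keyed by tone number.
yi_counts = {1: 27, 2: 70, 3: 36, 4: 157}

# Total characters sharing the toneless romanization "yi".
total = sum(yi_counts.values())

# The tone with the most characters, and its share of the total.
most_crowded_tone = max(yi_counts, key=yi_counts.get)
share = round(100 * yi_counts[most_crowded_tone] / total)

print(total)              # 290
print(most_crowded_tone)  # 4
print(share)              # 54
```

Written without tone marks, "yi" would stand for 290 different characters, and even with the tone marked, the fourth tone alone still covers 157 of them.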

Of course, the spoken language also has this problem of distinguishing homonyms, but it is usually settled by context and the situation. It makes you realize just how much of communication is NOT verbal but situational. (Many jokes in Chinese are about misunderstandings over which homonym the speaker means.)

Advantages

Chinese characters have one obvious advantage: they take very little space to say a lot. Before the invention of paper this was a big deal, but in the digital age it is less and less of an advantage. If you look at the two tourist signs below, they are both in English and Chinese. The Chinese was written first and the translation probably says less than the Chinese. But the Chinese text takes up less surface space. Each character packs a punch. Once the initial investment is made to learn Chinese, the precision and conciseness are its benefits.



In this one the Chinese and Japanese explanations take half the space of the English explanation.


4 comments:

Paula said...

You've made your case well on the complexity of learning Chinese. It sounds very difficult. Would it be as hard if you just tried to learn the spoken word without the written?

Florence said...

You know Chinese more Than I know. I know speaking and writing by growing up there. But I have no idea of the theory behind. You know the theory much better than any of the people I know in Taiwan.

Pinfan said...

Chinese is hard, so, it will be so cool if you can learn well. :>

Unknown said...

An accurate count of Chinese sounds is 389 and with the 4 tones we have 1476 homonyms to be distributed among the approximate 50,000 characters