The Chinese language is a member of the Sino-Tibetan family of languages. It is related to Tibetan and Burmese. It is not related at all to Korean, Vietnamese, Thai, or Japanese, though these languages (like other Asian languages) were strongly influenced by Chinese in the course of history. Korean and Japanese both have writing systems which contain Chinese characters. Like those two languages, Vietnamese contains many Chinese loanwords.

The notion of a "Chinese language" may seem at first to be a fiction. The term "Chinese" is applied to the classical written language known as "wen yan", which was used by Confucius, as well as to the modern written standard known as "bai hua". It also covers many different spoken variations which may be mutually unintelligible. The spoken language of Beijing, for example, is very different from the conversational language of Hong Kong.

Nevertheless, there are good reasons for using a collective name. The most important one is that the Chinese themselves consider the language to be a unified entity, and there are good reasons for treating it as such. Chief among them is that the boundaries between the different variations of Chinese are not sharp. For example, in writing an informal love letter, one may use informal "bai hua." In writing a newspaper article, the language used is different and starts to include aspects of "wen yan." In writing a ceremonial document, one would use even more "wen yan." The language used in the ceremonial document may be completely different from that of the love letter, but there is a socially accepted continuum between the two.

There are similar continuums in the spoken language. A person living in Taiwan, for example, would commonly mix pronunciations, phrases, and words from Mandarin and Min-nan, and these mixtures would be considered socially appropriate under many circumstances. A person living in Hong Kong would use different combinations of Mandarin, colloquial Cantonese, and written Cantonese depending on the social situation.

Another distinctive aspect of the Chinese language is the complex relationship between the various spoken varieties and the various written varieties. Chinese is written using a "logographic" script in which one character represents one word element. It is generally the case that a Chinese text written in "bai hua" would be readable by most educated Chinese, but again the relationship between written and spoken Chinese is complicated. For example, an educated person in Hong Kong would be able to write a text in formal written Cantonese which is readable by a Mandarin speaker. However, that formal written Cantonese, while similar to formal written Mandarin, would be very different from a word-for-word transcription of what the Cantonese speaker would say, and would also be different from written colloquial Cantonese. One might ask where, if formal written Cantonese differs from spoken Cantonese, the reader learns it; the answer is that it is learned in school.

Hence one could say that the characters are what makes the Chinese language an entity. If some day an alphabetic system should supplant them, "the Chinese language" would cease to exist.

Spoken variations of modern Chinese

Linguists classify the variations in spoken Chinese into seven groups. Within these groups there are many subgroups, many of which are mutually unintelligible. The amount of "linguistic consciousness" also varies between the groups. For example, a speaker of Cantonese living in Hong Kong tends to feel a great deal of common identity with a speaker of Cantonese living in Taishan, even though these two varieties of Cantonese may be almost mutually unintelligible. By contrast, a Wu speaker in Hangzhou generally does not think of himself or herself as belonging to the same group as a Shanghainese speaker in Shanghai, even though the two varieties are linguistically similar. One can see this even in the naming. The Hong Kong and Taishan speakers would both claim to be speaking Cantonese, while in the second case only the person from Shanghai would claim to be speaking Shanghainese.

There are also great differences in how intelligibility varies with geography. Mandarin dialects are remarkably uniform, with people living hundreds of kilometers from each other able to communicate intelligibly. In Fujian, people living ten kilometers away from each other may speak mutually unintelligible variations of Min.

One distinctive feature of Mandarin is the loss of tones in comparison to Middle Chinese and the other dialects. The result of this is that many words which are monosyllabic in other dialects are expressed as combinations of syllables in Mandarin.

The Chinese Written Language

The Chinese writing system is logographic, i.e. each character expresses a word part. Originally, the characters were little pictures depicting what was meant. This, however, proved inconvenient (as you can imagine - try to depict "philosophy"!). There are still a number of characters which can be traced back to such pictorial characters, but many characters used today are compositions of other, simpler characters. Chinese scholars identify several types of compounds, including "meaning-meaning" compounds, in which each element of the character contributes to the meaning, and "sound-meaning" compounds, in which one component indicates the kind of concept the character describes and the other hints at the pronunciation (though, as the spoken language has evolved since the characters were standardized, these hints are often quite useless and sometimes directly misleading). For example, the character for "mother" ('ma', first tone, in Mandarin) consists of one component meaning "female" and another one meaning "horse" - this doesn't mean the Chinese view mothers as female horses! The first component (or "radical") simply tells the reader that the character denotes a female entity, whereas the second acts as a pronunciation guide by referring to the word for "horse", which is also pronounced 'ma', though in a different tone.

Every character has a "radical", or most fundamental component, and this design principle is exploited by Chinese dictionaries: full characters are ordered according to their initial radical (for which there are only about 200 possibilities) and the number of strokes they consist of (a more detailed discussion of this can be found in the entry on ideographic writing systems).
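The ordering principle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the six entries are a hand-entered sample rather than data from any dictionary file, the names Entry, ENTRIES and dictionary_order are hypothetical, and the radical numbers follow the common 214-radical numbering. Characters are sorted first by radical, then by the number of strokes remaining once the radical is set aside.

```python
from typing import List, NamedTuple


class Entry(NamedTuple):
    """One dictionary entry (illustrative structure, not a real dictionary format)."""
    char: str            # the full character
    radical: str         # its radical, in dictionary form
    radical_number: int  # position of the radical in the 214-radical list
    extra_strokes: int   # strokes in the character beyond the radical


# Hand-entered sample data, deliberately listed out of order.
ENTRIES = [
    Entry("森", "木", 75, 8),
    Entry("媽", "女", 38, 10),   # "mother": radical "female" plus "horse"
    Entry("河", "水", 85, 5),
    Entry("好", "女", 38, 3),
    Entry("他", "人", 9, 3),
    Entry("林", "木", 75, 4),
]


def dictionary_order(entries: List[Entry]) -> List[Entry]:
    """Sort by radical first, then by the remaining stroke count."""
    return sorted(entries, key=lambda e: (e.radical_number, e.extra_strokes))


if __name__ == "__main__":
    for e in dictionary_order(ENTRIES):
        print(f"radical {e.radical_number:3d} {e.radical}  +{e.extra_strokes:2d} strokes  {e.char}")
```

A reader looking up an unfamiliar character follows the same two steps by hand: identify the radical, count the remaining strokes, and scan the short list that results.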

This principle is also exploited by everybody learning to write Chinese: the vast number of Chinese characters can be memorized much more easily if they are mentally decomposed into their constituent radicals. The question of how many characters there are is the subject of heated discussion. In the 18th century, European scholars claimed the total to be about 80,000. This number, however, is exaggerated: one of the most comprehensive traditional dictionaries (the Kangxi Dictionary) lists more than 40,000 characters. One reason for the large number of characters is that they include all of the different characters in the different variations of Chinese. Popular estimates say that about 3,000 characters are needed to read a Chinese newspaper, and that 4,000 to 5,000 constitute a decent education.

Classification of characters

One can classify characters into character sets of which the following are in common use:

  • "bai hua"
  • "wen yan"
  • "written colloquial Cantonese" - Cantonese is unique in that it has a commonly used written character system which is different from "bai hua" or "wen yan"
  • "dialectal characters"

Character forms

There are currently two standards for printed Chinese characters. One is the Traditional Writing System, used in Hong Kong, Taiwan and by Overseas Chinese. The People's Republic of China (and also Singapore) uses the Simplified Writing System, which uses simplified forms for some of the more complicated characters. In addition, most Chinese writing letters by hand will use some personal simplifications in cursive.

The Chinese characters are also used to write the Chinese numerals.

Chinese grammar

All dialects share a similar grammatical system, which is different from the one employed by European languages. All words have only one grammatical form: there is no conjugation, no declension, and no tense system. Concepts like "plural" or "past tense" have to be expressed syntactically:

Time reference is indicated by adverbs of time ("yesterday", "later") and by a number of particles indicating, e.g., completion of an action. Particles are also used to form questions: the syntax of a question is exactly the same as that of a declarative statement (basically, SUBJECT - VERB - OBJECT), and only the appended particle makes it a question. Plural meaning most of the time has to be inferred from context, since the Chinese language doesn't provide any lexical means of expressing this concept for most nouns (apart from giving exact numbers, which is, of course, possible).

Thus Chinese grammar is generally quite simple compared to that of the Romance languages. However, some subtle grammatical features which are characteristic of Chinese serve to enrich the grammar; for example, the notion of a "perfective", which signifies the degree to which an action was completed.

Computer processing of Chinese

The computerized processing of Chinese characters raises special issues in both input methods and character encoding schemes.

Chinese encoding systems
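
The encoding issue can be made concrete with a small sketch in Python. It rests on assumptions not stated in this article: it uses Python's built-in "big5", "gb2312" and "utf-8" codecs (Big5 being commonly used for traditional characters, GB2312 for simplified characters, and UTF-8 for Unicode text), and the helper names encode_all and misread are hypothetical. It shows that the same two characters become different byte sequences under each scheme, and that reading bytes with the wrong scheme does not give back the original text.

```python
# Two characters that exist in all three encodings used below.
TEXT = "中文"


def encode_all(text: str) -> dict:
    """Encode the same text under three common encodings."""
    return {name: text.encode(name) for name in ("big5", "gb2312", "utf-8")}


def misread(text: str, written_as: str, read_as: str) -> str:
    """Encode with one scheme, then decode with another.

    Undecodable bytes are replaced rather than raising an error, so the
    result shows what a reader with the wrong settings would see.
    """
    raw = text.encode(written_as)
    return raw.decode(read_as, errors="replace")


if __name__ == "__main__":
    for name, raw in encode_all(TEXT).items():
        print(f"{name:7s} {len(raw)} bytes  {raw.hex()}")
    # Big5 bytes interpreted as GB2312 do not yield the original text.
    print(misread(TEXT, "big5", "gb2312"))
```

This is why a page of Chinese text must state its encoding, or be stored in a Unicode encoding, before software can display it reliably.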

History of Chinese

Deciphering the history of Chinese poses an interesting problem: how does one determine the pronunciation of a language which is not written phonetically? The effort that has been devoted to solving this problem is a testimony to the ingenuity of linguists.

Archaic Chinese

Much of the work in reconstructing Archaic Chinese comes from Bernhard Karlgren, whose reconstruction is based on the forms of the characters.

Middle Chinese

Linguists are confident that they have a good reconstruction of what Middle Chinese sounded like. The evidence for the pronunciation of Middle Chinese comes from two sources: modern dialect variations and rhyming dictionaries.

Modern Chinese

The transition from "wen yan" to "bai hua"

The creation of a "national language"

Teaching Mandarin

Character simplification

The Future of Chinese

Weblinks

Chinese to English dictionary and other resources presented in English: searchable by English meanings; Chinese text displayed as graphics (i.e. does not require any Chinese font).
Chinese to English Dictionary: searchable by English meanings; Chinese text in Big5 code (i.e. requires a Chinese font).
Chinese Linguistics: sites on Chinese linguistics (in English).
Chinese Characters Dictionary: supports Japanese, Korean, Cantonese, Hakka etc.
Cantonese Talking Syllabary: in Chinese; requires a Big5 font.