Today, speakers of Chinese use three numeral systems: the ubiquitous Arabic digits and two ancient Chinese numeral systems. The "Hua1 Ma3" (花碼 U+82B1, U+78BC, meaning flowery or fancy numbers) system and the character writing system are, however, gradually being supplanted by the Arabic system. The "Hua1 Ma3" system is still in use in Chinese markets (e.g. in Hong Kong). The character writing system is still used when writing numbers in long form, such as on checks.
Individual Chinese characters mentioned in this article can be looked up graphically in the Unihan database by their Unicode code points (the U+xxxx values given with each character).
The Chinese character numeral system is not a positional system. Instead, it is based on decimal bundling. The rules for forming numbers are quite simple:
- The numeral characters are tightly integrated into the language: Each numeral character has a phonetic value and a number is read by pronouncing each individual character it consists of, unlike e.g. English, where the numeral '2' has to be pronounced 'two' or 'twenty' depending on position.
- There are ten 'basic' numeral characters representing the numbers zero through nine. And there are other characters representing big numbers such as tens, hundreds, thousands etc. There are two sets of characters for Chinese numerals, one in formal writing and one in simple daily use writing. The formal version is much more complex to prevent alteration in legal documents such as promissory notes.
Their phonetic values in Mandarin are:
(with Unicode notation U+xxxx for formal and simple writing respectively)
- 'ling2' ('0') (零 U+96F6)
- 'yi1' ('1') (壹 U+58F9), (一 U+4E00)
- 'er4' ('2') (貳 U+8CB3), (二 U+4E8C)
- 'san1' ('3') (參 U+53C3), (三 U+4E09)
- 'si4' ('4') (肆 U+8086), (四 U+56DB)
- 'wu3' ('5') (伍 U+4F0D), (五 U+4E94)
- 'liu4' ('6') (陸 U+9678), (六 U+516D)
- 'qi1' ('7') (柒 U+67D2), (七 U+4E03)
- 'ba1' ('8') (捌 U+634C), (八 U+516B)
- 'jiu3' ('9') (玖 U+7396), (九 U+4E5D)
- 'shi2' ('10' or ten) (拾 U+62FE), (十 U+5341)
- 'bai3' ('100' or hundred) (佰 U+4F70), (百 U+767E)
- 'qian1' ('1000' or thousand) (仟 U+4EDF), (千 U+5343)
- 'wan4' ('1 0000' or myriad) (萬 U+842C), (万 U+4E07) *
- 'jing1' ('1000 0000' or 10 million) (京 U+4EAC) Ancient Chinese *
- 'yi4' ('1 0000 0000' or hundred million) (億 U+5104) *
- 'gai1' ('1 0000 0000' or hundred million) (垓 U+5793) Ancient Chinese *
- 'zi3' ('10 0000 0000' or thousand million or American billion) (秭 U+79ED) Ancient Chinese *
- 'zhao4' ('1 0000 0000 0000' or British billion or American trillion) (兆 U+5146) *
- 'fen1' (tenth) (分 U+5206)
- 'hao2' (hundredth) (毫 U+6BEB)
- 'li2' (thousandth) (釐 U+91D0)
- A leading '1' can sometimes be omitted when it is understood. The numbers 11 - 19 are usually written using two characters, where the first one is the numeral '10' and the second one is one of the basic numerals '1' to '9' (i.e. 14 is written as '10' '4', an abbreviation of '1' '10' '4'). A leading '1' in other positions can be omitted only in conversation (common in Cantonese). For example, 17000 can be read as '10000' '7', but is written as '1' '10000' '7' '1000'. However, when more than two digits are involved, the abbreviation usually does not take place. For example, 114 is read as '1' '100' '1' '10' '4', and definitely not as '100' '10' '4'. Although '1' '100' '10' '4' is marginally acceptable, it is not common.
- The numbers 20, 30, 40 ... 90 are constructed using a multiplicative principle, where, e.g., 60 is represented as '6' '10'; the numbers in between are formed like 11-19, so that, e.g., 42 is written as '4' '10' '2'.
- There are also numeral characters for hundred (bai3), thousand (qian1), myriad (wan4), hundred million (yi4) and trillion (zhao4).
The above principles extend to larger numbers, except that a new grouping character is introduced for each further factor of a myriad (wan4).
For example, one yi4 = 10000 wan4; one zhao4 = 10000 yi4.
Hence a number is more convenient to read if its digits are separated into groups of four.
For example, 12,345,678,901,203 is regrouped as 12,3456,7890,1203 and read or written as
shi2 er4 zhao4 san1 qian1 si4 bai3 wu3 shi2 liu4 yi4 qi1 qian1 ba1 bai3 jiu3 shi2 wan4 yi1 qian1 er4 bai3 ling2 san1
(十二兆三千四百五十六億七千八百九十萬一千二百零三), which is equivalent to saying (*) ten 2 trillion 3 thousand 4 hundred 5 ten 6 yi4 7 thousand 8 hundred 9 ten (*) myriad 1 thousand 2 hundred 0 3, where (*) denotes a character that is understood and omitted. This may seem very complicated, but it is actually very similar to reading an English number. The only differences are that the myriad is used as the grouping unit instead of the usual thousand, and that ten is written explicitly instead of appending the suffix ty or teen to the number. Compare the English system, which groups digits in threes: 12,345,678,901,203 is read as 12 trillion 3 hundred 4ty 5 billion 6 hundred 7ty 8 million 9 hundred 'and' 1 thousand 2 hundred 'oh' 3.
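The regrouping step can be sketched in a few lines of Python. This is only an illustration: the function names `group_by_myriad` and `format_myriad` are made up for this example and do not come from any standard library.

```python
def group_by_myriad(n):
    """Split a non-negative integer into groups of four decimal digits,
    most significant group first."""
    groups = []
    while True:
        groups.append(n % 10000)   # lowest four digits
        n //= 10000                # drop them and continue
        if n == 0:
            break
    groups.reverse()
    return groups


def format_myriad(n):
    """Write n with a comma after every myriad instead of every thousand."""
    groups = group_by_myriad(n)
    # The leading group is printed as-is; the others are padded to four digits.
    return ",".join([str(groups[0])] + [f"{g:04d}" for g in groups[1:]])


print(format_myriad(12345678901203))   # 12,3456,7890,1203
```

Each group is then read as an ordinary number below ten thousand, followed by its grouping character (zhao4, yi4, wan4, or nothing for the last group).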
- 'Interior zeroes' before the unit position (as in 10002) have to be spelt out explicitly, so 10002 becomes '1' '10000' '0' '2'; the reason for this is that '1' '10000' '2' is used as a shorthand for '1' '10000' '2' '1000'. One '0' is sufficient to resolve the ambiguity. The same rule applies to the unit position before each grouping character. For example, 10050000 is read '1' '1000' '0' '5' '10000'. However, 1032 can be read as '1' '1000' '0' '3' '10' '2'. In this case, the '0' is preferred but optional because '3' '10' '2' is not ambiguous. Also, try to avoid the use of '2' '100' '5' (i.e. 250) in conversational language; it is normally used to mean stupid.
- That's it! Easy, isn't it? Compare the Chinese way of saying '94' to the French one...
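As a rough illustration, the rules above can be combined into a small Python routine that spells an integer with the simple characters. It is a sketch under some assumptions, not a reference implementation: the names `group_to_chinese` and `to_chinese` are invented here, only wan4, yi4 and zhao4 are used as grouping characters, the leading '1' is dropped only when the whole number starts with '1' '10', and the optional interior zero is always written.

```python
DIGITS = "零一二三四五六七八九"
SMALL_UNITS = ["", "十", "百", "千"]   # units inside one four-digit group
GROUP_UNITS = ["", "萬", "億", "兆"]   # one new character per factor of 10000


def group_to_chinese(group):
    """Spell a value from 1 to 9999; a run of interior zeros becomes one 零."""
    out = []
    pending_zero = False
    for power in (3, 2, 1, 0):
        digit = (group // 10 ** power) % 10
        if digit == 0:
            # A zero matters only after a digit has already been written.
            pending_zero = bool(out)
        else:
            if pending_zero:
                out.append("零")
            out.append(DIGITS[digit] + SMALL_UNITS[power])
            pending_zero = False
    return "".join(out)


def to_chinese(n):
    """Spell a non-negative integer below 10**16 in simple characters."""
    if n < 0 or n >= 10000 ** len(GROUP_UNITS):
        raise ValueError("number out of range for this sketch")
    if n == 0:
        return "零"
    groups = []                      # least significant group first
    while n > 0:
        groups.append(n % 10000)
        n //= 10000
    out = []
    pending_zero = False
    for index in range(len(groups) - 1, -1, -1):
        group = groups[index]
        if group == 0:
            pending_zero = bool(out)
            continue
        # A skipped group or a missing thousands digit needs one explicit 零.
        if out and (pending_zero or group < 1000):
            out.append("零")
        out.append(group_to_chinese(group) + GROUP_UNITS[index])
        pending_zero = False
    text = "".join(out)
    if text.startswith("一十"):      # 14 is written '10' '4', not '1' '10' '4'
        text = text[1:]
    return text


print(to_chinese(10002))             # 一萬零二
print(to_chinese(12345678901203))    # 十二兆三千四百五十六億七千八百九十萬一千二百零三
```

The sketch keeps the full form '1' '100' '1' '10' '4' for 114 and does not attempt the conversational shorthand such as '1' '10000' '2' for 12000.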
Strictly speaking, the Chinese written numbers should not be considered a numeral system. As an analogy, when the value 3000 is written as two English words "Three Thousand", the English words are not part of the number system. (or are they?)
China is a civilization more than 4000 years old, and it is unreasonable to assume that the ancient Chinese used only this lengthy format to express numbers before Arabic numerals were introduced to China. The ancient Chinese mathematicians did some great work with the abacus, e.g. the computation of the value of Pi and the prediction of comets for the emperors. They would have had a hard time if they had only had the lengthy character numeral system. The long written form was, and is, almost unusable for doing mathematics or commerce. These are the fields where the "Hua1 Ma3" system became useful. In fact, this number system shows a very strong tie with the use of the abacus. For instance, the numeric symbols for 1, 2, 3, 6, 7 and 8 are formed in a similar way to their representation on the abacus.
Nowadays, the "Hua1 Ma3" system is only used for displaying prices in Chinese markets or on traditional handwritten invoices. According to the Unicode standard version 3.0, these characters are called Hangzhou style numerals. This indicates that it is not used only by Cantonese in Hong Kong.
In the "Hua1 Ma3" system, special symbols are used for digits instead of the Chinese characters. The digits are positional. The numerical value is written in two rows. The top row contains the numeric symbols; for example, XO||= or XO=|| stands for 4022. The bottom row consists of one or more Chinese characters. The first indicates the order of magnitude of the first digit in the top row, e.g. qian1 for thousand, bai3 for hundred, shi2 for ten, blank for one, etc. The second character denotes the unit, such as yuan2 (元 U+5143 for dollar), mao2 (毛 U+6BDB for 10 cents), xian1 (仙 U+4ED9 for 1 cent) or li3 (for Chinese mile). If the characters 'shi2' 'yuan2' (拾元) are below the digits XO||=, the value is read as forty dollars and twenty-two cents. Notice that the decimal point is implicit once the first digit '4' is set at the 'ten' position.
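The way the bottom-row character fixes the implicit decimal point can be sketched as follows. The function name `huama_value` and both lookup tables are assumptions made for this example; the top row is given as Unicode characters (U+3007 for zero, U+3021 to U+3029 or the horizontal strokes 一 二 三 for the digits), and the unit character such as 元 is left out.

```python
from decimal import Decimal

# Digit symbols: circle for zero, U+3021..U+3029 for 1..9,
# plus the horizontal variants of 1, 2 and 3.
HUAMA_DIGITS = {chr(0x3021 + i): i + 1 for i in range(9)}
HUAMA_DIGITS.update({"\u3007": 0, "一": 1, "二": 2, "三": 3})

# Power of ten given to the FIRST digit by the bottom-row character
# (formal and simple forms; blank means the unit position).
MAGNITUDE = {"": 0, "拾": 1, "十": 1, "佰": 2, "百": 2,
             "仟": 3, "千": 3, "萬": 4, "万": 4}


def huama_value(top_row, magnitude=""):
    """Return the value of the digits in top_row when the first digit
    sits at the position named by the magnitude character."""
    digits = [HUAMA_DIGITS[ch] for ch in top_row]
    number = int("".join(str(d) for d in digits))
    # Shift the decimal point so that the first digit lands on the
    # requested power of ten.
    shift = MAGNITUDE[magnitude] - (len(digits) - 1)
    return Decimal(number).scaleb(shift)


# The digits for 4022 above the character 拾 (ten):
print(huama_value("\u3024\u3007\u3022\u4e8c", "拾"))   # 40.22
```

With the unit yuan2 this is the forty dollars and twenty-two cents of the example; the same digits above qian1 would be 4022.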
The "Hua1 Ma3" system in Hong Kong definitely uses the same symbols as the Hangzhou numerals. However, it is unclear whether the stacked arrangement is also the same in the Hangzhou system. Readers from other parts of China, please confirm whether the "Hua1 Ma3" system is the same as the Hangzhou system.
The digits of the Hangzhou numerals are encoded from U+3021 to U+3029 in Unicode. Zero is represented by a circle; the numeral '0', the letter 'O' or the character U+3007 will probably all work well. Leading and trailing zeros are unnecessary in this system. Additional characters representing 10, 20 and 30 are encoded as U+3038, U+3039 and U+303A respectively.
For those who cannot see the Unicode glyphs, here are the descriptions of the appearance of these digits:
- 0 is a circle (exact Unicode unknown, perhaps 〇 U+3007)
- 1 is one horizontal (一 U+4E00) or vertical (〡 U+3021) stroke
- 2 is two horizontal (二 U+4E8C) or vertical (〢 U+3022) strokes
- 3 is three horizontal (三 U+4E09) or vertical (〣 U+3023) strokes
- 4 is a cross that looks like X (〤 U+3024)
- 5 is a loop (〥 U+3025)
- 6 is a dot (signifying 5 the same way as on an abacus) on top of one horizontal stroke (〦 U+3026)
- 7 is a dot on top of two horizontal strokes (〧 U+3027)
- 8 is a dot on top of three horizontal strokes (〨 U+3028)
- 9 is a symbol (〩 U+3029) that looks like the Chinese character "jiu3 (久 U+4E45)"; compare the formal character for '9', "jiu3 (玖 U+7396)". (Some web browsers, e.g. IE 5.5, display this character incorrectly.)
The digits 1 to 3 come in a vertical and a horizontal version so that they can alternate when these digits are next to each other. The first digit usually uses the vertical version, e.g. 21 is written as ||- instead of || |, which could be confused with 3.
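The alternation can be sketched in Python as follows. The function name `to_hangzhou` is invented for this example, and the exact alternation policy is an assumption: the first digit of every run of 1s, 2s and 3s is written vertically and the following ones alternate, which yields the XO||= variant of 4022 rather than XO=||.

```python
VERTICAL = {1: "\u3021", 2: "\u3022", 3: "\u3023"}     # 〡 〢 〣
HORIZONTAL = {1: "\u4e00", 2: "\u4e8c", 3: "\u4e09"}   # 一 二 三
OTHER = {0: "\u3007", 4: "\u3024", 5: "\u3025", 6: "\u3026",
         7: "\u3027", 8: "\u3028", 9: "\u3029"}


def to_hangzhou(n):
    """Write the decimal digits of a non-negative integer n as symbols,
    alternating vertical and horizontal strokes for adjacent 1, 2 and 3."""
    out = []
    previous_vertical = False   # was the previous symbol a vertical 1-3?
    for ch in str(n):
        digit = int(ch)
        if digit in VERTICAL:
            if previous_vertical:
                out.append(HORIZONTAL[digit])
                previous_vertical = False
            else:
                out.append(VERTICAL[digit])
                previous_vertical = True
        else:
            out.append(OTHER[digit])
            previous_vertical = False
    return "".join(out)


print(to_hangzhou(21))     # 〢一
print(to_hangzhou(4022))   # 〤〇〢二
```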