More Than You Want To Know
About Simplified Characters



Everyone has heard of "simplified characters," the Chinese characters used to write Chinese today, as contrasted with the "traditional characters" used in earlier periods.

Beginning early in the XXth century, Chinese linguists developed schemes for the simplification of the writing system so as to promote popular literacy. Simplified characters are the fruit of these efforts.

The system of "simplified characters," that is to say, the corpus of both modified and unmodified characters that makes up the standard writing system of China today, was first promulgated in a tentative draft in 1956. For the next decade or so, much publishing made use of mixed simplified and traditional type fonts that conformed to neither the new standard nor traditional usage. That transitional stage did not outlive the destruction of the publishing industry during the Cultural Revolution.

Today virtually all publishing conforms to the new standard except when the intended audience is Chinese outside of China.

Even though many characters were not changed, the total official writing system is usually called "simplified characters" (jiǎntǐzì 簡體字 / 简体字). The contrasting name for the traditional character set is usually "traditional" in English, or in Chinese "complex" (fántǐzì 繁體字 / 繁体字). (In Taiwan the traditional set is sometimes called "proper characters" —zhèngtǐzì 正體字 / 正体字— as a slap at the mainland standard.)

Outside of the People's Republic, simplified characters were initially embraced only in Singapore. Elsewhere Chinese ignored them as unnecessary or resisted them as objectionable. In Taiwan they were directly prohibited. Today they are rapidly moving into use throughout the world, except in Taiwan. In Taiwan occasional mainland books in simplified characters have been imported since the end of martial law in 1987, and such Taiwanese products as computerized dictionaries must include simplified characters to be salable.

Given that nearly all Chinese speakers in the modern world now use simplified characters in nearly all written communications, it is extremely unlikely that any area, including Taiwan, will be using only traditional characters a century from now.

In this article I will try to give you some idea of the nature of the simplification scheme and of the principles by which decisions seem to have been made about the official forms of characters. I will also try to point to some of the problems that the reformers faced, some of their approaches and compromises, and some of the challenges that the simplification did not confront.

In this text, simplified characters are printed in red. Traditional characters that have been replaced by simplifications are printed in blue. Characters that are shared by both simplified and traditional orthographies are printed in black. (Many characters are traditional but also serve as simplifications of others, so they change color depending on the context.)

An important take-away message is that there is not a one-to-one correspondence between simplified and traditional characters, and that any procedure (or computer program) that "converts" between the two systems is destined to make mistakes if it does not take account of context. For example, hòu 後 "after" and hòu 后 "queen" are both now written 后. Converting from the simplified back to the traditional form requires knowing which traditional form (後 or 后) is called for.
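The asymmetry can be made concrete with a minimal Python sketch. The table below is a tiny hand-picked sample, not a real conversion table: it uses the standard characters for three mergers discussed in this essay (後/后 for hòu, 穀/谷 for gǔ, 髮/發 for the hair/make pair), and the names `TRAD_TO_SIMP`, `invert`, `to_simplified`, and `traditional_candidates` are invented for illustration.

```python
from collections import defaultdict

# Traditional -> simplified: every traditional form has exactly one target.
# A tiny illustrative sample, not a complete table.
TRAD_TO_SIMP = {
    "後": "后",  # hòu "after"
    "后": "后",  # hòu "queen" (form unchanged)
    "穀": "谷",  # gǔ "grain"
    "谷": "谷",  # gǔ "valley" (form unchanged)
    "髮": "发",  # "hair"
    "發": "发",  # fā "to make"
}


def invert(mapping):
    """Group the traditional forms that share each simplified form."""
    inverse = defaultdict(list)
    for trad, simp in mapping.items():
        inverse[simp].append(trad)
    return dict(inverse)


SIMP_TO_TRAD = invert(TRAD_TO_SIMP)


def to_simplified(text):
    """Character-by-character conversion; safe because the map is a function."""
    return "".join(TRAD_TO_SIMP.get(ch, ch) for ch in text)


def traditional_candidates(ch):
    """All traditional forms a simplified character might stand for."""
    return SIMP_TO_TRAD.get(ch, [ch])


print(to_simplified("後"))          # 后
print(traditional_candidates("后"))  # ['後', '后']
```

Going from traditional to simplified, a plain lookup is enough. Going the other way, the lookup returns a list, and nothing in the single character says which member of the list is wanted.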

For purposes of the present discussion, "China" obviously excludes Taiwan in most cases because of the difference in language policy.

This essay is based entirely on my own observations. If you want to know more, the best published work I have seen on this subject, taking account of unofficial simplifications and including distribution maps for different character variants, is:

Long story of short forms: the evolution of simplified Chinese characters. Stockholm: Department of Oriental Languages, Stockholm University. ISBN: 91-628-6832-2.

Click here for a general essay about Chinese languages in general, including an introduction to how Chinese is written.

Principle 1: Simplifying Commonly Occurring Parts of Characters

Because Chinese characters are often made up of parts, one important principle of simplification is to simplify frequently occurring parts in all contexts where they occur. For example, yán 言 refers to speech. Although it occurs as an independent word, it also is a graphic element appearing in a large number of other characters, in nearly all of which the simplified variants reduce it to two strokes (as has been done for hundreds of years in various styles of calligraphy):

zhào 詔 = 诏 = edict
píng 評 = 评 = comment
shī 詩 = 诗 = poem
jiè 誡 = 诫 = admonish
= = speech
shuō 說 = 说 = say
huà 話 = 话 = speech
dìng 訂 = 订 = to agree, order

Similarly the element wéi 韋 occurs in many words of similar sound, and is consistently simplified to 韦:

huì 諱 = 讳 = to taboo
wéi 違 = 违 = disobey
wéi 圍 = 围 = enclose
wěi 緯 = 纬 = weft

Some characters undergo simplification of more than one element by the application of such rules. Thus huì 諱, "to taboo," which is made up of exactly the two elements we have been discussing (yán 言 and wéi 韋), simplifies both of them: 讳.

(There remain occasional exceptions. For example, the simplification of 言 to two strokes does not occur in kuā 誇, "boast," because the character has always had an alternative writing, 夸, which is simpler yet, and therefore was simply officialized as the simplified form. Similarly 言 is not simplified in shì 誓, "oath," which remains unchanged, presumably because the two-stroke simplification struck planners as counter-aesthetic when it was on the bottom of the character.)

The consistent simplification of elements that frequently occur as parts of characters immediately generated thousands of simplifications, constituting the vast majority of all simplified characters.
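As a rough illustration of why a handful of part-level rules yields so many simplified characters, here is a toy Python sketch. Everything in it is an assumption made for the example: the decomposition table `PARTS`, the recomposition table `COMPOSE`, and the function `simplify` are invented, and real characters cannot be assembled this mechanically. The rules and exceptions themselves (言 to 讠, 韋 to 韦, 誇 replaced by 夸, 誓 left alone) are the ones described above.

```python
# Part-level rules: one entry here affects every character containing the part.
PART_RULES = {"言": "讠", "韋": "韦"}

# Whole-character exceptions take priority over the part rules.
WHOLE_CHAR_EXCEPTIONS = {
    "誇": "夸",  # kuā "boast": an older, simpler variant was officialized
    "誓": "誓",  # shì "oath": 言 at the bottom is left unsimplified
}

# Hypothetical toy decomposition of a few traditional characters.
PARTS = {
    "詩": ("言", "寺"),
    "話": ("言", "舌"),
    "違": ("辶", "韋"),
    "諱": ("言", "韋"),
}

# Hypothetical lookup from simplified parts back to a whole character.
COMPOSE = {
    ("讠", "寺"): "诗",
    ("讠", "舌"): "话",
    ("辶", "韦"): "违",
    ("讠", "韦"): "讳",
}


def simplify(ch):
    """Apply exceptions first, then simplify each part and recompose."""
    if ch in WHOLE_CHAR_EXCEPTIONS:
        return WHOLE_CHAR_EXCEPTIONS[ch]
    parts = PARTS.get(ch)
    if parts is None:
        return ch  # no decomposition known: leave the character unchanged
    new_parts = tuple(PART_RULES.get(p, p) for p in parts)
    return COMPOSE.get(new_parts, ch)


print(simplify("諱"))  # 讳 (both parts simplified)
print(simplify("誇"))  # 夸 (exception)
```

Two rules cover all four decomposed characters here, and 諱 picks up both at once. The exceptions table is what keeps the scheme from being fully mechanical.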

Principle 2: Replacing Whole Characters

A second principle involved simply replacing whole characters. In some cases, the changes were relatively minor, often a matter of officializing a common shorthand form or regional writing. For example, tú "graphic" was simplified to , which was formerly regarded as a kind of calligraphic shorthand.

Wèi "guard" was simplified to , quite a different character, but one that had served in south coastal China as a shorthand for for many generations. Notice, by the way, that this character could have been simplified by changing the part to . Probably that was not done because was available and was yet simpler.

(There are actually two theories about the possible origin of this southeast coastal simplification, neither particularly convincing. One holds that it was derived from the top central portion of the full version. The other, even less plausible, is that it is derived from the Japanese WE or U . [李乐毅 1996 简化字源。北京:华语教学出版社。 P. 248.])

Huá , usually translated "flowery," most often refers to China, and occurs in a great many expressions referring to China, including the official name of the country. (The same character is used for the surname Huà.) It is simplified to , probably inspired by the use of huà "transformation" as a sound element in many other words, combined with the cross already on the bottom of the original .

The character nóng has been normal for a long time to refer to agriculture, but it has been only one of a great many competitors in popular usage. The simplest of these was selected as the new official form. To see some of the many other forms that in theory could have been officialized, click here for a web page with forty-four different ways in which  /  has been written over the years. (It is one page from a fascinating dictionary of unofficial alternative writings sponsored by the Mandarin Promotion Council, Ministry of Education, Republic of China. You may need to set the text encoding of your browser manually for "Chinese Traditional Big5-HKSCS" for all characters to display correctly.)

Not all simplifications were made that could have been made. For example the character fó "buddha" is now written in Japan. The Japanese simplification derives from a very old Chinese usage — indeed we have it in the calligraphy of the founder of the Táng dynasty himself. But that abbreviation has been historically rare in China, and "buddha" remains in official Chinese today. (Japan had its own, more modest simplification program earlier in the XXth century, but that is a different story. In most cases, the simplified characters chosen in China and in Japan differ from each other.)

Characters Within Characters.

Although only a small number of simplified parts of characters (like yán ) are consistently carried across the writing system, many of the whole-character simplifications are also picked up in additional simplifications when they occur as parts of other words. For example the following words are all written with huá  / :

huá = = noise
huà = birch
huá = = a red horse (both elements simplified)
huá = = plowshare (both elements simplified)

This kind of across-the-board simplification includes some characters rare enough that, although the rules generate logical simplifications, neither type fonts nor computer codes include them yet. (There is no Unicode computer code for a simplified version of wěi , "flourishing," for example, although it is easy to see what the parts would simplify to, and although the simplified character does occur in Chinese dictionaries.)

However, in other cases, although a complex character was simplified, and although it occurs as a part of other characters, the other characters remain unsimplified. For example, the top portion of xīng , "felicitous" is reduced to three jots in its simplification . However that same element is left unchanged in the character cuàn "to cook separately," which continues to be written , at least officially. (Some people cheat and write three jots there too.)

Similarly, the word qiān is used for signatures and also for bamboo slips used for divination in temples. In the second sense, it is often written . In both senses it has long been written. Today both forms are officially simplified to . However the word chèn, "prophetic texts," was written by adding the "speech radical" to , producing . The simplified version, for obscure reasons, keeps the unsimplified form of the right-hand side and simplifies only the speech radical part: .

In other words, there are some elements that are always simplified, at least in certain positions. But there are also simplifications that are not propagated consistently, even though they could have been. (Some of these unofficialized possibilities occur as quick forms in modern handwriting.)

Removing Elements.

Just as some elements are simplified only in some contexts, some graphic elements are removed when they occur in some characters, but not when they occur in others. An excellent example is the element biào 髟, which refers to long hair and was an element in a great many other characters, most of them relating in some way to hair. (An example is zōng , the mane of a horse.)

In some characters containing the element biào, biào had been added in recent centuries, and earliest Chinese had used the other part of the character standing alone. An example is hú "beard, barbarian." In antiquity one character () served for both meanings (for reasons we can easily imagine). Later "beard" was differentiated by the addition of the biào element to show that hú referred to hair: . "Barbarians" continued to be written with . Today once again both forms are united in the single official character .

The same story applies to xū, meaning both "must" and "moustache" . Both are now written with the character , a simplification of .

At first glance, it is hard to be enthusiastic about such mergers, since they potentially reduce clarity, but the fact is that very little confusion is caused by them. Just as they worked in antiquity, they work today.

Ancient precedent was not necessary to make a change, of course. The biào was sometimes omitted without obvious precedent. This also produced characters with more than one unrelated meaning, and sometimes even different pronunciations.

Full Form            Simplified Form
sōng 鬆 = loose      sōng 松 = loose (髟 removed)
sōng 松 = pine       sōng 松 = pine (now also means loose)

(In the case of the word fǎ "hair," the element remaining when the biào was stripped from it, , was elevated not only to stand for hair, but also to write the unrelated word fā , "to make," so today both fǎ and fā are written . We shall meet more examples of the same process below.)

Finally, some characters containing biào were simply left unmodified, or were modified only because the other element happened to be simplified and that simplification was carried across.

Full Form                     Simplified Form
zōng 鬃 = mane, bristles      zōng 鬃 = mane, bristles (髟 not removed)
zōng 宗 = ancestor            zōng 宗 = ancestor (鬃 not simplified to this)
bìn 鬢 = sideburn             bìn 鬓 = sideburn (髟 not removed)
bīn 賓 = guest                bīn 宾 = guest (鬢 not simplified to this)
máo 髦 = bangs                máo 髦 = bangs (髟 not removed)
máo 毛 = a hair               máo 毛 = a hair (髦 not simplified to this)

Principle 3: Other Reductions

Duplicate Character Consolidation.

Sometimes over the course of history a single word has come to be written in more than one way. We saw above that kuā , "boast" had always had the alternative writing , which is now the only official form. The word for "cup," bēi, was written either or . In the simplification scheme, these were reduced to a single character: . The characters and were used interchangeably for cóng, "crowded," and were both consolidated into . That is not as odd as it seems, since is made simply by drawing a line under , a shorthand form of "from" (also pronounced cóng). It's not new. The simplification has been appearing in shorthand for centuries, beginning in the Hàn dynasty, about 2000 years ago.

How did the original duplicate writings come into existence? One way is the popularization of a "correction." For example, kuā means "boast," which is a kind of speech. Nearly all characters involving speech make use of the element yán , so it is not surprising that somebody sooner or later would write as . But that didn't eliminate the original form. This pressure to logical consistency has been with the system from the beginning, just like the drag of tradition.

Another example is the word wèi  /  that we met earlier. It refers to guards and guarding. A person can be a wèi. But the syllable is also used as a verb, so "sanitation" is literally "guard life" (wèishēng 卫生). Should wèi be written with a single character in recognition of the general semantic similarity, or should it have a different form for the noun and the verb?

In short, over the centuries, countless new characters have been created in recognition of such differences —a qiān is not really identical with a in all instances— and at the same time countless reforms (or shifts of popular usage) have removed characters that seemed to involve unnecessary distinctions —both and are now .

What's a duplicate? For most users, it is not always clear why two characters of very similar meaning are written differently. For example, fù means "to repeat or to return" and was written . On the other hand fù can also mean "double" and was written . Both writings appeared in compounds in the sense of "again":

fù xīn "repeat + new" = to renew
fù shēng "repeat + live" = to live again
fù yuán "repeat + members" = demobilize
fù xìng "double + surname" = two-character surname
fù zá 複雜 = "double + miscellaneous" = complicated
fù yìn = "double + print" = to photocopy

was also used as an alternative writing for "to reply, to cover, to overturn" in the very special case when it was used for "to reply" as in fù xìn 覆信 = "reply to a letter." All three characters had slightly different pronunciations in antiquity (and in some southern Chinese languages like Cantonese). Nevertheless the reformers decided that there was sufficient similarity between and that life would be made easier by merging them into a unified character: . (Fù was retained as a separate character except in those phrases for which people had been tending to write anyway, where of course it became .)

Merging Unrelated Characters.

We have already seen some examples of unrelated characters being lumped together (as in fǎ "hair" and fā "make" both becoming .)

Merging was a fairly general process, and throughout the simplification one finds a single simplification doing service for two different semantic fields, two different "words." Most often it occurs when they have the same sound. We saw that hòu meaning "behind" was traditionally written , but in the simplification scheme it was merged with hòu meaning "empress," so both were written . This works when there are no contexts in which confusion is likely (and indeed this example is found in the Confucian canon itself). Similarly gǔ meaning "grain" was lumped with , "valley," and both are now written . (Gǔ also meant "mulberry," and is retained for that specific meaning.)

(Unfortunately all four of these original characters — hòu , hòu , gǔ , and gǔ — also serve as family names, and lumping four characters into two has the effect of implying that four ancient descent lines are actually two ancient descent lines. In most parts of China marrying someone of the same surname is considered immoral. and , although pronounced identically in Mandarin, used to be two different family names, but now they are all written . Can they no longer marry each other? In other words, can the character reform have resulted in some marriages being popularly prohibited now that were allowed in the past? Possibly so, but I know of no evidence of this so far.)

Popular Shorthand.

Sometimes a popular simplification grew up for a character and became quite general over a limited region, or as a popular shorthand. The term for "shrimp" (xiā) was usually written , but I have seen Chinese waiters scribble down the much simpler character xià , meaning "beneath," which was enough, given the similarity of sound, for the kitchen staff to figure that shrimp must be involved.

"Beneath" is not a very appropriate substitute for "shrimp" in all contexts, of course. The compound "sea shrimp" is confused with "beneath the sea" that way. But some popular shorthand forms did have potential for general use. Therefore sometimes a character was replaced by a popular simplification that had been used more or less universally for many centuries. The simplified form for shrimp is xiā , midway between the former official version and the waiter's shorthand , and it is a very old simplification.

Another example of the officialization of a popular shorthand character is wéi , which has been written since at least the Sòng dynasty (period 15).

Principle 4: Homographs Are Left Unresolved

Chinese has always had a few characters that did double service, standing for related words that varied in meaning and pronunciation (typically tone). In the simplest case, these are at least related. For example wáng means king, but wàng means "to reign over." Both have always been written with the same character . (It looks as though very ancient Chinese may have had a tendency for related nouns and verbs to contrast in tone, a process that is only faintly visible in later periods.)

However some words of similar pronunciation have shared a character even when they have been utterly unrelated in meaning. For example guān means "to see," and guàn means a hermitage, but they have always been written with the same character ().

Finally, a few words related neither in meaning nor in sound have been written with identical characters: For example, nǚ means female, but the same character was often used to write the pronoun rǔ, meaning "you" (and usually referring to a male). Although rǔ eventually developed the form , accurate reproductions of old texts often print , and any modern dictionary, even in simplified characters, still gives rǔ as one of the pronunciations of , and lists as one of the ways to write rǔ.

Similarly, shí means "stone" and a dàn is an old-fashioned measure for grain, but both have always been written . (Well, to be honest, dàn was once pronounced shí.)

Different words sharing the same written character are referred to as "homographs," and non-Chinese studying the language tend to think of them as "one character with two pronunciations" rather than (correctly) as "two words written with one character."

The simplification scheme did not try to address the problem of homographs for the most part. Both guān and guàn, the words formerly written , have been simplified to . No attempt was made to provide separate characters for them. The simplification project was an historical moment when this could have been undertaken, but it was decided not to do so.

In a few cases the simplification scheme actually increased the number of homographs. For example huá "to row" and huà "to plan" were both changed to . That is one character less to learn, and a lot fewer strokes to write, but it is one character more with two different spoken syllables associated with it. Similarly shè 舍 "house" and shě "abandon" are now both written , which now has two pronunciations where before it had but one.

In practice there is little probability of confusion, but each of these "simplified" characters, 划 and 舍, while reducing two characters to one (the one with fewer strokes), also adds to the complexity of written Chinese by increasing the number of homographs, obliging the reader to analyze the context in order to choose which referent is intended.

It is not quite universally true that different semantic fields attached to the same character were left together after simplification. For example, the character traditionally represented two different words: qián "male, father" and gān "clean, dry, exhausted."

The sense of gān "dry" was consolidated with , also pronounced gān, which already had two unrelated meanings "to be concerned with" and "heavenly stem" (part of a system of counting). The character is therefore now used only in the reading qián, and has an expanded range of meanings.

Unfortunately, an unrelated character gàn (sometimes written ) "to manage" was also assimilated to . Therefore now has not only its original pronunciation and meaning, but also a second meaning inherited from plus yet another pronunciation and meaning inherited from .

Loose Ends.

It should be clear by now that some decisions could reasonably have been made differently. A couple of cases slipped through that seem simply to have been bad judgment or infelicitous committee compromise. For example, the word lá, "to slit," was written before simplification, and it still can be. However it was also assimilated to the symbol , which is normally used for lā, meaning "to lead or drag." Thus the character was both simplified and not, with no difference in meaning, and was also confounded with a character with a different pronunciation.

Simplified characters occasionally are not the shortest form among the variants that were consolidated. For example, bī "to force" was traditionally written or . The slightly longer first form was officialized: . The same thing happened to kuǎn "stipulation": of the two traditional forms and , it is the more complex (by one stroke) that was officialized: .

In the same way, zhài "stockade" was traditionally written or (and some people considered them different words). They are both now officially written with the more complex character . (However the Chinese computer standard still retains for those who think they are different.)

Similarly, diāo "to carve" was written both and , which were interchangeable. Only the more complex survives (and has absorbed the homonymous word "vulture" as well — perhaps because the right-hand elements niǎo and zhuī both originally represented birds anyway).

Popular Reaction 1: Enthusiasm

Popular reaction to character simplification was mixed. Perhaps the widest reaction was general enthusiasm for changes that in fact made the learning of characters easier, their writing faster, and their printed forms more legible.

Indeed, one reaction was to join the game and create additional simplifications, not part of the official scheme. In a short time this process got quite out of hand, and innovative and ingenious, if not always intelligible, simplifications were turning up all across China. But free innovation was ultimately limited by the absence of such forms in standard type fonts, so that innovative vernacular characters could be used in handwriting, but could not easily be printed. And of course they couldn't be found in dictionaries.

Adoption of the official scheme did not end the existence of the committee charged to create it, and while the public was getting used to what was and was not official, the committee continued its work. In the early 1980s it proposed a list of further simplifications. For example, the shorthand character , borrowing its approximate sound from the simple character dāo "knife" was proposed as a replacement for dào "way" (as in Daoism). The widespread reaction to the new list of changes was revulsion at the thought of having to go through wrenching changes of script all over again, and the entire new list was rejected. (Dào is still written , at least officially.)

Not surprisingly, a few unorthodox additions have actually achieved general currency. One of the most common occurs in the word for restaurant. The traditional term was cāntíng 餐廳, in which both characters were quite complex. In simplified characters this is 餐厅, with only the second of the characters (tíng) changed. In popular usage, however, the cān tends to be written using only the upper left-hand corner 歺, which many people feel provides better visual balance with the 厅, and hand-painted signs are very frequently written 歺厅. The character 歺 is a very ancient variant of dǎi 歹, "evil," but it is not in active use except as a part in other characters. For this reason, its popular recycling to stand for cān creates no confusion and is probably destined to be officialized some day. (Meanwhile, it sustains a threadbare but educated foreigner's joke to refer to a greasy spoon as a "dǎitíng" 歹厅.)

Another example of "recycling" of rare but simple characters as informal simplifications of standard ones is dīng , used only in a couple of rare fixed expressions. Today is a common "wrong" simplified writing of tíng , to "stop" or to "park" a car (tíngchē 停车 = 仃车).

Popular Reaction 2: Resistance

Another reaction to the simplification reform was resistance. Originally, that was probably almost as common a reaction as acceptance. Not everybody named Liú or Yè was very happy about being told to write his name or . People with the name Hòu had no particular desire to be lumped together with people named Hòu . Who could love the conversion of the beautiful character huá , which referred both to a flower and to China, into the prosaic character , made up of a phonetic huà () with a cross under it? And what is one to make of reformers taking the heart () out of the middle of the word for love, ài  / ?! The change, as expected, was jarring.

Further the objection could easily be made that severe changes in the writing system would cut China off from the immense treasure of its ancient literature. Although simplifications like huá  / , cóng  / , wèi  /  and xīng  /  represent an enormous improvement in ease of writing, ease of learning, and clarity of printing, it is also clear that very many such wrenching changes could potentially render modern Chinese so different from earlier Chinese that it would become, for practical purposes, a different written language.

Fortunately, there are comparatively few such complete transformations. Far more changes are of the kind that lump together different words under a single writing (like  /  both becoming shè), or that owe their simplification to the removal of part of a character (like tīng becoming ), or where the reduced element is (nearly) always simplified when it occurs as part of something else (like the left side of shuō "speak"  /  and huà "speech"  / ).

Thus the person who learns to read simplified characters is not IN FACT confronted with a totally foreign system when reading older Chinese. For someone raised on simplified characters, reading traditional ones is a bit of a strain at first, and a minor annoyance for a long time, but it is NOT impossible — hú may be a mess, but it is easy to see the in it, and it is hard to misinterpret given adequate context.

If there is a comparison to be made with English, perhaps someone who is used to simplified Chinese can read traditional Chinese with about the same comfort as a person used to modern English can read Shakespeare, or at worst, Chaucer. It's a bit of a stretch, but it is by no means impossible.

The various inconsistencies (and many more seeming inconsistencies) in the system were easily seized upon as the basis for heated objections to the whole project. As one elderly Chinese friend of mine commented, "nearly every character in the scheme has something the matter with it." (We have seen some of these in our exploration of the inconsistent removal of biào .)

However the scheme was enforced by a government that controlled all printing and publishing, as well as by the school system. Those who objected to the forms taken by individual characters were listened to when the scheme was being developed and during a period of public discussion afterward. But once it was finalized, the story came to an end. Those who prefer traditional characters have remained free to write them if they please, but find little audience, and no official sympathy.

The system works. Simplified characters have been in official use for over half a century, everyone is accustomed to them, and all but the oldest people in China find it easier to read and write simplified characters than traditional ones. Simplified characters — THESE simplified characters — are surely here to stay.

Just as in Europe there is still a residual use of Roman numerals and Latin gravestones, so in China there is considered to be a certain elegance about the traditional characters. They are associated with tradition, with the status of the educated elite, and with internationalism, but few would care to read a whole book in traditional characters any more, and fewer still feel competent to write continuous text in them, any more than most Americans can correctly differentiate "thee" from "thou" or easily add XLIII to LXXVII to get CXX.

Because of the lingering sense of elegance, there is extensive use of traditional characters — sometimes incongruously mixed with simplified ones — for book titles, on shop signs, and in other places where they are thought to be more decorative. As time passes we can probably expect this nostalgia to fade. Just as Shakespeare's spelling is retained only in antiquarian editions of his works, traditional characters will be retained only in antiquarian reprints of old Chinese texts.

In Taiwan and Hong Kong, traditional characters are still the universal norm, although the retrocession of Hong Kong to China has increased the probability of simplified characters soon becoming more widespread. The vast majority of Chinese web sites are in simplified characters. Overseas Chinese communities vary, with Singapore fully committed to simplified characters, and most other regions still using traditional characters. (My local utility company in California finally changed from traditional to simplified characters only in 2014.) The continued use of traditional characters outside of China means that some publishing intended for consumption in those areas (and virtually all of the incredibly prolific publishing that takes place in Taiwan) is still done in traditional characters.

Meanwhile anyone using both systems is in the position of having to make conversions. As we have seen, simplified and traditional characters are not in a perfect one-to-one relationship to each other, so conversion, especially computer-assisted conversion, and especially conversion from simplified to traditional characters, introduces mistakes that must be removed by a human eye sensitive to the larger context of the text. Overlooked mechanical errors in conversion are the depressingly frequent "typographical" errors of modern International Chinese.
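One common way to bring context into a mechanical conversion is to look up whole words before falling back to single characters. The Python sketch below shows the idea under loud assumptions: the tables `PHRASES` and `CHAR_DEFAULTS` are tiny invented samples, the function name `to_traditional` is hypothetical, and greedy longest-match is only one possible strategy, not a description of any particular conversion program.

```python
# Word-level entries: the surrounding characters settle the ambiguous one.
PHRASES = {
    "以后": "以後",  # "afterwards": 后 stands for 後
    "皇后": "皇后",  # "empress": 后 stays 后
    "头发": "頭髮",  # "hair": 发 stands for 髮
    "发展": "發展",  # "develop": 发 stands for 發
}

# Single-character fallbacks: a guess at the likelier reading, sometimes wrong.
CHAR_DEFAULTS = {"后": "後", "发": "發", "头": "頭"}

MAX_PHRASE_LEN = max(len(p) for p in PHRASES)


def to_traditional(text):
    """Greedy longest-match conversion from simplified to traditional."""
    out = []
    i = 0
    while i < len(text):
        # Try the longest known phrase starting at position i first.
        for n in range(min(MAX_PHRASE_LEN, len(text) - i), 0, -1):
            chunk = text[i:i + n]
            if chunk in PHRASES:
                out.append(PHRASES[chunk])
                i += n
                break
        else:
            # No phrase matched: fall back to the per-character default.
            ch = text[i]
            out.append(CHAR_DEFAULTS.get(ch, ch))
            i += 1
    return "".join(out)


print(to_traditional("皇后"))  # 皇后
print(to_traditional("以后"))  # 以後
print(to_traditional("后"))    # 後 (a bare guess; wrong if "queen" was meant)
```

The last line is the kind of mistake described above: with no phrase to lean on, the program guesses, and a human reader has to catch the cases where the guess is wrong.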

