ISO 8859

From Wikipedia

ISO 8859 is an ISO standard that defines several 8-bit character encodings for use by computers.

The objective of ISO 8859 was to remedy the lack in ASCII of the characters needed to write languages other than American English. More characters were needed for this than could fit in a single 8-bit character encoding, so several encodings were specified. All of them encode the first 128 positions (0 to 127) identically to one another and to ASCII. The upper 128 positions (128 to 255) of each ISO 8859 encoding hold characters not present in ASCII, and these differ from one encoding to the next.
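The shared lower half can be checked with a short Python sketch. It assumes Python's built-in codecs named iso8859_1, iso8859_8 and iso8859_15 faithfully implement the corresponding parts of the standard; the variable names are illustrative only.

```python
# Bytes 0..127: the range every ISO 8859 encoding shares with ASCII.
lower_half = bytes(range(128))
reference = lower_half.decode("ascii")

# Python codec names for three parts of the standard (assumed faithful).
parts = ["iso8859_1", "iso8859_8", "iso8859_15"]

# Each part should decode the lower half exactly as ASCII does.
lower_half_matches = {name: lower_half.decode(name) == reference for name in parts}

# One byte from the upper half, decoded under each part.
upper_byte = bytes([0xE0])
upper_decoded = {name: upper_byte.decode(name) for name in parts}

print(lower_half_matches)
# ascii() escapes non-ASCII characters so the output is safe on any console.
print(ascii(upper_decoded))
```

In this sketch the byte 0xE0 decodes to a Latin letter under ISO 8859-1 and ISO 8859-15 but to a Hebrew letter under ISO 8859-8, which is why text in these encodings must be labelled with the encoding used.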

The ISO 8859 encodings provide the accented letters required for various European languages. They also provide non-Roman alphabets: Greek, Cyrillic (used for Russian, among other languages), Hebrew and Arabic. However, the standard makes no provision for East Asian languages such as Chinese or Japanese, as their writing systems require many thousands of code points, far more than the 256 positions available in a single 8-bit encoding.

The encodings defined by ISO 8859 include:

  • ISO 8859-1 (aka Latin-1) -- most Western European languages
  • ISO 8859-8 -- Hebrew
  • ISO 8859-15 (aka Latin-9) -- updated version of 8859-1, adds the euro sign and a few letters needed for French and Finnish, replacing some rarely used symbols to make room for them
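The relationship between ISO 8859-1 and ISO 8859-15 can be made concrete by comparing the two encodings byte by byte. The following is a minimal Python sketch, assuming that Python's iso8859_1 and iso8859_15 codecs match the published tables; the variable names are hypothetical.

```python
# Find every byte value that the two encodings decode differently.
differing = [
    b for b in range(256)
    if bytes([b]).decode("iso8859_1") != bytes([b]).decode("iso8859_15")
]

# Show each difference as Unicode code points (old -> new).
for b in differing:
    old = bytes([b]).decode("iso8859_1")
    new = bytes([b]).decode("iso8859_15")
    print(f"0x{b:02X}: U+{ord(old):04X} -> U+{ord(new):04X}")

# The euro sign (U+20AC) is encodable in ISO 8859-15 ...
euro_in_8859_15 = "\u20ac".encode("iso8859_15")

# ... but not in ISO 8859-1, where encoding it raises an error.
try:
    "\u20ac".encode("iso8859_1")
    euro_fits_8859_1 = True
except UnicodeEncodeError:
    euro_fits_8859_1 = False
```

Only eight positions differ, all in the upper half, so most text written in one encoding reads correctly when interpreted as the other.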

Unicode provides more than a million code points in a single character set, commonly encoded as variable-length sequences of 8-bit units (UTF-8) or 16-bit units (UTF-16); thus Unicode is often preferred for new applications. However, ISO 8859 has the advantage of being well-established, and because every character occupies exactly one byte, simpler software is needed to manipulate it.
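The trade-off between fixed and variable width can be illustrated by measuring encoded lengths. This is a rough Python sketch; the sample characters and the helper name encoded_lengths are chosen here for illustration and do not come from the standard.

```python
# Sample characters: ASCII letter, accented Latin letter, euro sign,
# and a character outside Unicode's Basic Multilingual Plane.
samples = {
    "A": "A",
    "e-acute": "\u00e9",
    "euro": "\u20ac",
    "emoji": "\U0001F600",
}

def encoded_lengths(ch):
    """Return the byte length of ch in UTF-8, UTF-16 and ISO 8859-1.

    The ISO 8859-1 entry is None when the character cannot be encoded.
    """
    lengths = {
        "utf-8": len(ch.encode("utf-8")),
        # "utf-16-le" avoids counting the 2-byte byte-order mark.
        "utf-16": len(ch.encode("utf-16-le")),
    }
    try:
        lengths["iso8859_1"] = len(ch.encode("iso8859_1"))
    except UnicodeEncodeError:
        lengths["iso8859_1"] = None
    return lengths

table = {label: encoded_lengths(ch) for label, ch in samples.items()}
for label, lengths in table.items():
    print(label, lengths)
```

Every character that ISO 8859-1 can represent takes exactly one byte, so the n-th character of a string is simply its n-th byte; with UTF-8 or UTF-16 the software must decode variable-length sequences to find character boundaries.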