Markup language

A markup language is a text encoding that represents both the text of a document and details about its structure and appearance. The name is derived from the traditional publishing practice of "marking up" a manuscript, that is, adding printer's instructions in the margins of a paper manuscript. Markup languages are used, for example, by the publishing industry to exchange printed works among authors, editors, and printers.

The most common markup languages in use today are SGML and its derivatives, such as HTML and DocBook. These languages intermix the text of a document with markup instructions in the same data stream or file. Here, for example, is a small section of text marked up in HTML:

   <h1> Anatidae </h1>
   <p> The family <i>Anatidae</i> includes ducks, geese, and swans,
   but <em>not</em> the closely-related screamers.

The codes enclosed in angle brackets <like this> are markup instructions, while the text between these instructions is the actual text of the document. The codes "h1" and "em" are examples of structural markup, in that they describe the intended purpose or meaning of the text they enclose (specifically, "h1" means "this is a first-level heading" and "em" means "this is an emphasized word"). A device reading such structural markup may apply its own rules or styles for presenting it, using larger type, boldface, indentation, or whatever style it prefers. The "i" instruction is an example of presentational markup: it specifies the exact appearance of the text (in this case, an italic typeface) without specifying its purpose.
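
The separation between markup instructions and document text can be illustrated with a short program. The following is a minimal sketch in Python, using the standard library's html.parser module, that splits the HTML sample above into its tags and its text. The names MarkupSplitter, split_markup, STRUCTURAL, and PRESENTATIONAL are made up for this illustration, and treating "p" as structural is an assumption, since the passage above classifies only "h1", "em", and "i".

```python
from html.parser import HTMLParser

# Classification used for this illustration only. "h1", "em", and "i" follow
# the passage above; listing "p" as structural is an assumption.
STRUCTURAL = {"h1", "p", "em"}
PRESENTATIONAL = {"i"}

SAMPLE = """<h1> Anatidae </h1>
<p> The family <i>Anatidae</i> includes ducks, geese, and swans,
but <em>not</em> the closely-related screamers."""


class MarkupSplitter(HTMLParser):
    """Collects opening tags and document text separately."""

    def __init__(self):
        super().__init__()
        self.tags = []   # (tag name, kind) pairs, in document order
        self.text = []   # pieces of text found between the tags

    def handle_starttag(self, tag, attrs):
        if tag in STRUCTURAL:
            kind = "structural"
        elif tag in PRESENTATIONAL:
            kind = "presentational"
        else:
            kind = "unclassified"
        self.tags.append((tag, kind))

    def handle_data(self, data):
        self.text.append(data)


def split_markup(source):
    """Return (tags, text) for a fragment of HTML."""
    parser = MarkupSplitter()
    parser.feed(source)
    parser.close()  # flush any text still buffered by the parser
    # Collapse runs of whitespace so the text reads as one line.
    text = " ".join("".join(parser.text).split())
    return parser.tags, text


if __name__ == "__main__":
    tags, text = split_markup(SAMPLE)
    for name, kind in tags:
        print(name, "is", kind, "markup")
    print(text)
```

A program that presents the document would then be free to choose its own style for each structural tag, while a presentational tag such as "i" leaves it no such choice.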

See also: SMIL