

In computing, a string (or string of characters) is a data type used in most programming languages to represent text, and is the focus of this article.

The computing term string is also used in a broader sense for any sequence of entities; for example, tokens in a language grammar, or a sequence of states in an automaton. See the theory of computation.

= Representations

A common representation is an array of characters. The length can be stored implicitly, by using a special terminating character (often NUL, the convention used in the programming language C), or explicitly (for example, by treating the first byte of the string as its length, a convention used in Pascal).

Here is a NUL-terminated string stored in a 10-byte buffer. NUL is the name of the ASCII character with numeric value zero.

+---+---+---+---+---+---+---+---+---+---+
| F | R | A | N | K | 0 | k | f | f | w |
+---+---+---+---+---+---+---+---+---+---+

The above example shows how "FRANK" would be stored in a 10-byte buffer as a NUL-terminated string. The bytes after the 0 (here k, f, f, w) are leftover data and do not form part of the string.

Of course, other representations are possible. Using a tree or a linked list makes it easier to insert characters in the middle of the string.

= String Processing

Strings are such a useful data type that several languages have been designed to make string-processing applications easy to write. Examples include SNOBOL, awk, and Perl.

Many UNIX utilities perform simple string manipulations and can be combined to program powerful string-processing algorithms. Files and finite streams may be viewed as strings.