Mathematics of the Western music scale

Theory of Musical Scales

Variations in air pressure at the ear give rise to the experience we call "sound". Most sound that people recognize as "musical" is dominated by periodic variations rather than random ones, and we call the transmission mechanism a "sound wave". In the simplest case, a "sine wave", the air pressure rises and falls in a regular fashion, and we hear a very "pure" tone. Tuning forks produce nearly pure tones. The rate at which the air pressure varies governs the "pitch" of the tone and is measured in oscillations per second, or hertz (Hz).

Whenever two different pitches are played at the same time, their sound waves interfere with each other: the highs and lows in the air pressure mix together to produce a different sound wave. As a result, any given sound wave can contain many different oscillation frequencies; the human hearing apparatus (composed of the ears and brain) can isolate these frequencies and hear them distinctly. When two notes are played, a single variation of air pressure at your ear "contains" the pitches of both voices, and your ear and brain separate them into two distinct notes.
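The mixing and separating described above can be imitated numerically. The sketch below adds two sampled sine waves and then recovers their frequencies with a naive discrete Fourier transform. The Fourier transform is only a stand-in here, not a model of how hearing works, and the sample rate, the two test frequencies, and all function names are assumptions chosen for illustration.

```python
import cmath
import math

SAMPLE_RATE = 400   # samples per second (assumed for illustration)
N = SAMPLE_RATE     # one second of signal, so DFT bin k corresponds to k Hz

def mixed_wave(freqs, n_samples=N, rate=SAMPLE_RATE):
    """Sum of unit-amplitude sine waves at the given frequencies (Hz)."""
    return [sum(math.sin(2 * math.pi * f * t / rate) for f in freqs)
            for t in range(n_samples)]

def dft_magnitudes(samples):
    """Magnitude of each DFT bin below half the sample rate (naive O(N^2))."""
    n = len(samples)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(samples)))
            for k in range(n // 2)]

def strongest_frequencies(samples, count):
    """Return the `count` strongest bins, in ascending order of frequency."""
    mags = dft_magnitudes(samples)
    top = sorted(range(len(mags)), key=lambda k: mags[k], reverse=True)[:count]
    return sorted(top)

wave = mixed_wave([100, 150])          # one pressure curve, two pitches
print(strongest_frequencies(wave, 2))  # [100, 150]
```

The single list `wave` plays the role of the one pressure variation at the ear; the two frequencies come back out of it unchanged.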

When the original sound sources are periodic, as most musical instruments tend to be, the interference between any two pitches may cause the listener to hear additional pitches which don't necessarily have a "musical" relationship to the originals. However, whenever one pitch is a simple integer multiple of the other (1, 2, 3, 4, ... times the oscillation frequency), the interference does not generate any new pitches. Thus, any two pitches related like this sound perfectly "in tune" in that you hear those pitches, and nothing else.

Musicians call the trivial case of a 1:1 ratio a "unison". More interesting is the 2:1 ratio. Any two pitches with a 2:1 ratio between them define a frequency ratio (or "interval") that we call an "octave". This is the smallest interval at which two different pitches will be perceived by the listener as being "the same note", precisely because when played together, they sound perfectly "in tune". The average human ear can perceive tones from about 20 Hz at the low end to around 20,000 Hz at the high end. Starting at 20 and doubling up to 20,000 shows that the human ear has a range of about ten octaves.
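The "about ten octaves" figure can be checked with a short calculation. This is a minimal sketch; the function names are made up for illustration, and the 20 Hz and 20,000 Hz limits are the approximate figures quoted above.

```python
import math

def octaves_between(low_hz, high_hz):
    """Number of octaves (doublings) separating two frequencies."""
    return math.log2(high_hz / low_hz)

def doublings_within(low_hz, high_hz):
    """Count how many times low_hz can be doubled without passing high_hz."""
    count = 0
    pitch = low_hz
    while pitch * 2 <= high_hz:
        pitch *= 2
        count += 1
    return count

print(round(octaves_between(20, 20_000), 2))  # 9.97
print(doublings_within(20, 20_000))           # 9
```

Nine whole doublings fit inside the range (the tenth lands at 20,480 Hz, just past the upper limit), which is why the range comes out at just under ten octaves.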

There are clearly many other integer ratios, and even though they do not all avoid the generation of additional pitches, the hearing apparatus perceives any two notes with such a ratio (or close to it) to be "in tune".

We can now define a scale as the set of intervals between the lowest note of a group of notes and each other note in the group, where the whole group fits within a given distance. The distance and the number of notes vary, but in most of the Western classical and popular tradition, twelve notes span a single octave. Expressed as frequency ratios to the lowest note, the intervals are:

1:1	unison 
21:20	half step / minor second
9:8	whole step / major second
6:5	minor third
5:4	major third
4:3	perfect fourth
7:5	tritone / augmented fourth / diminished fifth
3:2	perfect fifth
8:5	minor sixth
5:3	major sixth
9:5	minor seventh / flat seventh
17:9	major seventh / natural seventh
2:1	octave

For purposes of tuning we need a reference pitch, something all the instruments can agree on. Usually a 440 Hz sine wave is used as the reference pitch, as an A natural. Now, according to our table above, we can calculate the pitch of any other note by setting up a simple ratio relationship. For example, if I wanted to calculate the pitch a perfect fifth above A440, I would write:

   (X / 440) = (3 / 2)

and solve for X. Simple algebra, right? In the above, X comes to 660. Let's calculate two more:

   (X / 440) = (9 / 8); major second = 495
   (X / 440) = (5 / 4); major third = 550
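The same algebra can be written as a one-line function. This is only a sketch: `pitch_from_ratio` is a made-up name, and exact fractions are used so the results match the hand calculation with no rounding.

```python
from fractions import Fraction

def pitch_from_ratio(reference_hz, ratio):
    """Solve X / reference_hz = ratio for X."""
    return reference_hz * ratio

A440 = 440  # reference pitch in Hz

print(pitch_from_ratio(A440, Fraction(3, 2)))  # 660 (perfect fifth)
print(pitch_from_ratio(A440, Fraction(9, 8)))  # 495 (major second)
print(pitch_from_ratio(A440, Fraction(5, 4)))  # 550 (major third)
```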

The note that a scale centers around is called the tonic. We often use the term "key" for a scale, so the key of A is just a scale with A as the tonic.

Now, what actual pitches do we end up with? If we pick A natural (440Hz) as the tonic, we have a scale containing the following frequencies/pitches:

 440.000  A
 462.000  A#
 495.000  B
 528.000  C
 550.000  C#
 586.667  D
 616.000  D#
 660.000  E
 704.000  F
 733.333  F#
 792.000  G
 831.111  G#
 880.000  A
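The whole list can be regenerated from the interval ratios. The sketch below assumes the ratios given in the interval table and A as the tonic; the names `JUST_RATIOS` and `just_scale` are illustrative only.

```python
from fractions import Fraction

# Ratios copied from the interval table; note names assume A as the tonic.
JUST_RATIOS = [
    ("A",  Fraction(1, 1)),
    ("A#", Fraction(21, 20)),
    ("B",  Fraction(9, 8)),
    ("C",  Fraction(6, 5)),
    ("C#", Fraction(5, 4)),
    ("D",  Fraction(4, 3)),
    ("D#", Fraction(7, 5)),
    ("E",  Fraction(3, 2)),
    ("F",  Fraction(8, 5)),
    ("F#", Fraction(5, 3)),
    ("G",  Fraction(9, 5)),
    ("G#", Fraction(17, 9)),
    ("A",  Fraction(2, 1)),
]

def just_scale(tonic_hz):
    """Return (note name, pitch in Hz) pairs for the scale built on tonic_hz."""
    return [(name, float(tonic_hz * ratio)) for name, ratio in JUST_RATIOS]

for name, hz in just_scale(440):
    print(f"{hz:8.3f}  {name}")
```

Passing a different tonic frequency gives the corresponding pitches for another key, although the note names in this sketch stay fixed to the key of A.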

Any scale in which the ratio of any note to the tonic is an integer ratio is called a scale of just intonation. These scales have a very natural-sounding quality to them.

This is the common western scale of just intonation; other scales of just intonation exist, such as Indian raga scales.

The problem with just intonation is that it is very difficult to achieve in any stopped or fretted instrument. The difficulty is subtle, but it means big headaches. For example, the interval of a major second is the "whole step" so common in the western tradition. It defines the distance between A and B, or C and D, among others. The interval of a major third defines the distance between two notes with two "whole steps" between them; for example, the distance between C and E or F and A.

If this is true, then the major second of a major second (that is, two whole steps from a given note) should equal the major third (two whole steps from a given note), or:

   (X / 495) = (9 / 8)

X should equal 550. But instead, X is 556.875. What has happened here? Well, A(440) was the initial tonic of the scale, and all the intervals we defined above meet the integer ratio condition. But then we took a different note (495) as the tonic, and computed the major second of that. So in effect, we used two different scales and found that after a whole step from the tonic in each case, we end up with a pitch that isn't in the other scale.
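The mismatch is easy to reproduce with exact fractions. In this sketch the variable names are made up for illustration; the ratios are the 9:8 and 5:4 from the interval table.

```python
from fractions import Fraction

WHOLE_STEP = Fraction(9, 8)
MAJOR_THIRD = Fraction(5, 4)

two_whole_steps = WHOLE_STEP ** 2          # 81/64, not 5/4
mismatch = two_whole_steps / MAJOR_THIRD   # 81/80

print(float(440 * two_whole_steps))  # 556.875
print(float(440 * MAJOR_THIRD))      # 550.0
print(mismatch)                      # 81/80
```

Stacking two 9:8 whole steps overshoots the 5:4 major third by a factor of 81:80 (known in tuning theory as the syntonic comma), whatever the starting pitch.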

This is the problem. Any given scale of just intonation must be tuned to a tonic, which is fine if you only want to play in one scale or "key". However, you have to retune the instrument every time you modulate keys. As many classical composers (and pop ballad writers) will tell you, this has a way of limiting your expressive power.

So what's a keyboard manufacturer to do? The answer is simple: make one note in tune, and space all the other notes equally (logarithmically equally, anyway). This is what happens on most fretted instruments and keyboard instruments. Now, instead of calculating pitch with integer ratios, we just plug an interval into the following equation:

   P = 440 * 2 ^ (n / 12)

where n is the number of half steps sharp you want to go (and hey, guess what, negative numbers work as expected; (n == -3) finds the F# a minor third below A440, which is the major sixth of A dropped an octave). We call this approximation a scale of even (or equal) temperament, since the distance to any other note is independent of (and consistent across) key centers. This throws everything very slightly out of tune. Observe:
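The formula translates directly into code. A minimal sketch follows; the function name and the default reference of 440 Hz are assumptions for illustration.

```python
def equal_tempered_pitch(half_steps, reference_hz=440.0):
    """P = reference * 2 ^ (n / 12), with n in half steps (negative = flat)."""
    return reference_hz * 2 ** (half_steps / 12)

print(equal_tempered_pitch(12))             # 880.0 (one octave up)
print(round(equal_tempered_pitch(7), 2))    # 659.26 (E above A440)
print(round(equal_tempered_pitch(-3), 2))   # 369.99 (F# below A440)
```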

Note    Just Pitch (Hz)    E.T. Pitch (Hz, approx.)    Error (%)
A       440                440                         0.0
A#      462                466.16                      +0.9
B       495                493.88                      -0.2
C       528                523.25                      -0.9
C#      550                554.37                      +0.8
D       586.667            587.33                      +0.1
D#      616                622.25                      +1.0
E       660                659.26                      -0.1
F       704                698.46                      -0.8
F#      733.333            739.99                      +0.9
G       792                783.99                      -1.0
G#      831.111            830.61                      -0.1
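The error column can be recomputed rather than taken on trust. This sketch assumes the just ratios from the interval table and defines the error as the equal-tempered pitch relative to the just pitch; because both pitches scale with the tonic, the tonic frequency cancels out. The names are illustrative.

```python
from fractions import Fraction

# Just ratios from the interval table; the list index is the number of
# half steps above the tonic.
JUST_RATIOS = [
    Fraction(1, 1), Fraction(21, 20), Fraction(9, 8), Fraction(6, 5),
    Fraction(5, 4), Fraction(4, 3), Fraction(7, 5), Fraction(3, 2),
    Fraction(8, 5), Fraction(5, 3), Fraction(9, 5), Fraction(17, 9),
]

def tuning_error_percent(half_steps):
    """Percent by which the equal-tempered pitch differs from the just pitch."""
    just = float(JUST_RATIOS[half_steps])
    tempered = 2 ** (half_steps / 12)
    return (tempered / just - 1) * 100

errors = [tuning_error_percent(n) for n in range(12)]
for n, err in enumerate(errors):
    print(f"{n:2d} half steps: {err:+.1f}%")
```

A positive value means the equal-tempered note is sharp of the just note, a negative value means it is flat.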

As you can see, we're never more than about 1% out of pitch, which most people can't hear. However, you *are* out of tune inherently, so when your instrument then goes further out of tune you sound *really* bad. The advantage, though, is that stopped instruments can play in any key without retuning, which makes composition and playing much easier.

In this system the ratio of the fifth is about 1.4983 instead of 1.5, and the half-step ratio is about 1.059463 instead of 1.05. Only the octave is still exactly 2:1. When musicians first learned to tune the "well tempered clavier", interpolating between the tunings suggested by different keys was not easy. Tuning done by ear cannot achieve a semitone ratio that matches the twelfth root of two to six or seven digits.
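These ratios follow from the twelfth root of two, as a quick check shows. A minimal sketch; the variable names are assumptions for illustration.

```python
import math

SEMITONE = 2 ** (1 / 12)   # equal-tempered half-step ratio
fifth = SEMITONE ** 7      # seven half steps make a fifth
octave = SEMITONE ** 12    # twelve half steps make an octave

print(round(SEMITONE, 6))  # 1.059463
print(round(fifth, 4))     # 1.4983
print(math.isclose(octave, 2.0))  # True
```

Twelve equal half steps compound to the 2:1 octave (up to floating-point rounding), while seven of them fall just short of the 3:2 fifth.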

Many classical composers wrote compositions for instruments tuned in just intonation (wind instruments in particular). However, since these instruments couldn't re-tune to a new tonic, modulating the key of the piece created a tension; it sounded like you were still playing in the original key and wanted to return to it. Some people insist that playing the piece on a just-intonation instrument is the only way to truly hear what the composer intended. Other music fans disagree.

This article (or a previous version thereof) was based on, used by permission of the original author (ajax).

See also Joseph Schillinger