The Bohlen-Pierce Site
Interval gestalt and harmonic scales

Last updated: January 22, 2010


The text below does not claim to be scientific in any way. It is rather the result of a layman's attempt to come to terms with the question of what constitutes the essence of a musical scale and, more specifically, the essence of a scale that permits the creation and performance of harmonic and ultimately tonal music. In trying to find a model that explains harmony in the face of the mysterious ways of the human brain, the author decided to stick with the obvious, as far as the obvious suffices to support an explanation. It is like staying with Bohr's model of the atom rather than wrestling with quarks and superstring theory when all you want is to understand basic chemistry. If any justification for this simplistic approach is required: it paved the way to the discovery of what is now called the Bohlen-Pierce scale.

1. Exponential sequence of tone pitches ("Principle of equidistance")
2. Gestalt compatibility between intervals ("Principle of consonance")
3. Sensory dissonance

4. Consequence

The author's views on the essential properties of a harmonic scale can be found by combing through the early papers [1,2,3] on what was later dubbed the Bohlen-Pierce scale. In condensed form, these views reduce to the necessity of conforming to two independent aesthetic principles, both based on the physiology of the middle and inner ear:

Exponential sequence of tone pitches ("Principle of equidistance")
Gestalt compatibility between intervals ("Principle of consonance")

A third criterion, rather than a principle, is to take sensory dissonance into account. The numerically simplest scales abiding by these axioms are the Western 12-tone scale, whose just version is a reasonable approximation to

f_n/f_0 = 2^{n/12}

and the Bohlen-Pierce scale (BP), whose just form is a fairly close approximation to

f_n/f_0 = 3^{n/13}.

1. Exponential sequence of tone pitches ("Principle of equidistance")

Fig.1 Schematic of a rolled-out cochlea and basilar membrane,
depicting a resonant traveling wave.
© University of Heidelberg

This highly simplified schematic of an unrolled cochlea and the basilar membrane it contains nevertheless gives a useful visual impression of the resonant traveling wave generated by a given tone. High frequencies cause resonances near the entrance of the cochlea, where the basilar membrane is narrow and under high tension, while low-frequency resonances are located near the wide, floppy end of the membrane. In humans, the basilar membrane is typically about 35 mm (roughly 1.4 inches) long.

Fig. 2 Cut through cochlea, basilar membrane and organ of Corti (top left)
© University of Heidelberg

Stimulated by these frequency-dependent resonances are the neurons of the organ of Corti, which runs along the center line of the entire basilar membrane. Donald D. Greenwood [4,5,6], building on research conducted by Georg von Békésy, found experimentally that within a critical bandwidth the ear cannot distinguish between two frequencies that are very close to each other. Then, integrating across the critical bandwidth distribution along the organ of Corti, he determined the following equation for the organ's frequency sensitivity:

f = 165.4 Hz · (10^{2.1·x/L} - 0.88).

Hz (hertz) is the unit for one period per second, L is the length of the human basilar membrane, and x is the distance of the sensitive point from the wide (floppy) end of the membrane. Solving the equation for x, we can calculate that middle A (440 Hz) excites a point about 9.2 mm from that end of the membrane, its octave (880 Hz) a point at 13.2 mm, the next octave (1760 Hz) one at 17.7 mm, and the following octave (3520 Hz) one at 22.4 mm. The stimulation points of the low octaves thus sit slightly closer together than those of the high octaves. If we ignore this trick of nature to pack slightly more frequency range into the low end, that is, if we drop the constant 0.88, we can use Greenwood's equation to relate the distance x_n - x_0 between the stimulation points of two tones f_n and f_0, located on steps n and 0 of a scale, to their frequency ratio:

f_n/f_0 = 10^{2.1(x_n - x_0)/L}.
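The stimulation points quoted above can be checked by inverting the full form of Greenwood's equation (with the constant 0.88 kept). The following is a minimal sketch, assuming L = 35 mm as stated in the text; the function and constant names are illustrative and do not come from Greenwood's papers.

```python
import math

A_HZ = 165.4    # scaling constant of Greenwood's equation as given in the text
SLOPE = 2.1     # factor applied to the relative position x/L in the exponent
OFFSET = 0.88   # constant term of the equation
L_MM = 35.0     # basilar membrane length quoted in the text


def greenwood_frequency(x_mm, length_mm=L_MM):
    """Frequency (Hz) that excites the point x_mm from the wide end."""
    return A_HZ * (10 ** (SLOPE * x_mm / length_mm) - OFFSET)


def greenwood_position(f_hz, length_mm=L_MM):
    """Distance (mm) from the wide end of the point excited by f_hz."""
    return length_mm / SLOPE * math.log10(f_hz / A_HZ + OFFSET)


octaves_hz = (440, 880, 1760, 3520)
positions = [greenwood_position(f) for f in octaves_hz]
for f, x in zip(octaves_hz, positions):
    print(f"{f:5d} Hz -> {x:5.2f} mm")

# Spacing between neighbouring octaves; it grows slightly towards high pitches.
spacings = [b - a for a, b in zip(positions, positions[1:])]
```

Printing `spacings` shows the octave distances increasing from about 4 mm towards 5 mm, which is the slight compression of the low end mentioned in the text.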

Comparing the simplified form of Greenwood's empirically found, entirely physiology-based equation with the frequency relation of the same tones expressed in Western 12-tone equal temperament,

f_n/f_0 = 2^{n/12},

we find that the two expressions become identical if we substitute basilar membrane distances by tone steps (or vice versa):

2.1(x_n - x_0)/L = (log 2/log 10)(n/12) = 0.301 n/12.
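Solved for the membrane distance, this substitution gives a fixed distance per scale step. The sketch below evaluates it under the simplified model (constant 0.88 ignored) and assumes L = 35 mm from the text; the function name and its parameters are illustrative only.

```python
import math

L_MM = 35.0   # basilar membrane length quoted in the text
SLOPE = 2.1   # exponent factor of the simplified Greenwood relation


def step_distance_mm(n, frame=2.0, steps=12, length_mm=L_MM):
    """Membrane distance spanned by n steps of a scale frame**(n/steps)."""
    return length_mm * math.log10(frame) * n / (steps * SLOPE)


semitone_mm = step_distance_mm(1)
octave_mm = step_distance_mm(12)
print(f"semitone: {semitone_mm:.3f} mm, octave: {octave_mm:.2f} mm")
```

In this simplified model every octave spans about 5.0 mm, somewhat more than the 4.0 to 4.7 mm obtained from the full equation for the octaves above 440 Hz, because the constant term has been dropped.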

From these observations it is not far to the conclusion that our sense of hearing judges equal distances between excitation points of the organ of Corti as practically equal steps of pitch, while actually their frequency relation is exponential. Thus in the attempt to find musical scales consisting of what appears to us as approximately equal steps between tones, we in reality create scales with an exponential sequence of pitches. Most known musical scales share this phenomenon, so that it seems justified to formulate a "principle of equidistance" (exponential sequence of tone pitches):

Based on the exponential frequency sensitivity characteristic of the organ of Corti, a musical scale is perceived as aesthetically pleasing if its tone steps follow or at least closely approximate the equation

f_n/f_0 = K^{n/N}

with f_n meaning the pitch of step n of the scale, f_0 the fundamental tone, K the frame interval and N the total number of steps.

In scales for monodic music (one voice only at any time), K and N can assume quite a range of values.

However, this condition is necessary but not sufficient for the design of scales for polyphonic music, which must also abide by the following principle.
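As an illustration of the principle of equidistance, the following sketch generates the pitches of a scale from a frame interval K and a step count N. The 440 Hz starting pitch and the function name are assumptions made for the example.

```python
def equal_step_scale(f0, frame, steps):
    """Pitches f_n = f0 * frame**(n/steps) for n = 0 .. steps."""
    return [f0 * frame ** (n / steps) for n in range(steps + 1)]


western = equal_step_scale(440.0, 2, 12)  # octave (2:1) divided into 12 steps
bp = equal_step_scale(440.0, 3, 13)       # tritave (3:1) divided into 13 steps

# Neighbouring pitches always differ by the same ratio, which is what makes
# the steps appear equidistant on a logarithmic frequency axis.
western_ratios = [b / a for a, b in zip(western, western[1:])]
bp_ratios = [b / a for a, b in zip(bp, bp[1:])]

for n, f in enumerate(bp):
    print(f"BP step {n:2d}: {f:8.2f} Hz")
```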

2. Gestalt compatibility between intervals ("Principle of consonance")

Any non-linearity in the path of a composite acoustical or electrical signal changes the amplitudes of the signal's frequency components, be they fundamental frequencies or harmonics, and it generates intermodulation products, called combination tones in acoustical signals. In the human auditory system non-linearities abound; otherwise the system would not be able to cover the enormous dynamic range of our hearing and at the same time prevent damage to its elements. The elements that are easiest to understand as acting in a non-linear way are the system of tympanic membrane and ossicles in the middle ear (Fig. 3) and the basilar membrane in the inner ear (see Fig. 1 above).

Fig. 3
Middle ear system
(Clockwise from bottom left: Tympanic membrane, Malleus, Incus, Stapes)
© University of Heidelberg

The artist's rendition in Fig. 3 makes the inevitability of non-linearity in the middle ear system almost painfully obvious.

The most commonly experienced combination tones, produced by two tones of the frequencies f1 and f2, appear at the frequencies f2 - f1 ("difference tone") and 2f1 - f2. The latter is sometimes described as being even louder than the difference tone. In any case, the existence of these tones shows that our auditory system "suffers" from both quadratic (f2 - f1) and cubic (2f1 - f2) non-linearity. Combination tones of comparable amplitudes should theoretically also appear at f2 + f1 and at 2f2 - f1; however, since these two frequencies are higher than the original tones, they tend to be obscured by the original tones and their harmonics.

We usually do not perceive any combination tones consciously, but they are present, as even a person not trained in analyzing acoustical signals can find out in two simple experiments. (It's a little like dealing with a problem of astronomy: If you can't see a planet, just observe the irregularities in the paths of the others.) Using a multi-tone generator (software versions are inexpensive), setting the first oscillator to a sine tone at a fixed frequency f1 of 200 Hz and letting the second oscillator produce a slowly swept sine signal f2 between 370 Hz and 430 Hz, reveals the passing of the difference tone f2 - f1 (quadratic non-linearity) past f1 by a clearly audible sequence of beating - smooth signal - beating. Then fixing frequency f2 at 400 Hz and adding a third oscillator with a sine tone of the frequency f3, slowly swept between 570 and 630 Hz, causes the same sequence again, at a lower loudness this time, but unambiguously discernible. It betrays the passing of the cubic non-linear product f1 - f2 + f3 past f2. Since the sine tones possess no overtones, the observed beats can only be interpreted in this way. Simple experiments like these prove that combination tones not only exist, but that their amplitude is large enough to cause beats with the original tones.
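The frequency arithmetic behind the two experiments can be sketched as follows. The code only computes where the non-linear products fall and how far they lie from the fixed tone at each sweep position; it does not model the ear. The function names are illustrative.

```python
def beat_rate(f_a, f_b):
    """Beat frequency (Hz) between two tones that lie close together."""
    return abs(f_a - f_b)


def experiment_one(f2, f1=200.0):
    """Beat rate between f1 and the difference tone f2 - f1."""
    return beat_rate(f2 - f1, f1)


def experiment_two(f3, f1=200.0, f2=400.0):
    """Beat rate between f2 and the cubic product f1 - f2 + f3."""
    return beat_rate(f1 - f2 + f3, f2)


for f2 in (370, 385, 400, 415, 430):
    print(f"f2 = {f2} Hz: difference tone beats with f1 at {experiment_one(f2):.0f} Hz")
for f3 in (570, 585, 600, 615, 630):
    print(f"f3 = {f3} Hz: cubic product beats with f2 at {experiment_two(f3):.0f} Hz")
```

In both sweeps the beat rate falls to zero at the midpoint (f2 = 400 Hz and f3 = 600 Hz respectively), which corresponds to the smooth signal heard between the two stretches of beating.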

In the view of this author, the intervals which the combination tones form with the original tones and among each other are detected by the auditory nervous system. If the two original tones happen to form exactly or at least approximately an interval which can be described as the ratio of products of small primes, e.g. 2:3, 3:4, 3:5 etc., the combination tones contribute to the gestalt impression of the interval. The following diagram (Fig. 4) shows an example.

Fig. 4 The just fourth (3:4) and its closest combination tones
Vertical axis: Normalized amplitudes (in dB)
Horizontal axis: Normalized frequency (linear)

Depicted are the two tones of a just fourth 3:4 (at frequencies f1 = 3 and f2 = 4 of a normalized frequency scale) after passing through an invented transfer system with a relatively modest quadratic as well as cubic non-linear amplitude characteristic, which reduces their original amplitude from 60 dB to about 59 dB. For greater clarity, the original two tones are sine tones, i.e. they possess no overtones, and the transfer system is predominantly linear with just one quadratic and one cubic element added.

We observe that this transfer creates several previously not present tones at different frequencies. The "tone" at frequency 0 is naturally inaudible; it represents a slight change in air pressure for the duration of the interval. At frequency 1 we notice the difference tone f2 - f1 of the original interval. Its amplitude in this example is about 32 dB lower than that of the interval tones and certainly not easy to detect. At frequency 2, however, we find, as a result of the non-linear characteristic's cubic element, 2f1 - f2 with an amplitude only 15 dB below that of the interval tones, and that should be fairly audible. At frequency 5 we meet 2f2 - f1 at a similar amplitude, but difficult to detect for reasons already explained. Finally at frequency 6 there appears 2f1 as a previously not existing harmonic.
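The appearance of these additional tones can be reproduced numerically with a small sketch. It passes two unit-amplitude sine tones at normalized frequencies 3 and 4 through a transfer function with one quadratic and one cubic term and reads the resulting amplitudes from a single-bin discrete Fourier transform. The coefficients A2 and A3 are hypothetical and are not those behind Fig. 4, so the levels in dB will differ from the figure; only the positions of the new components are meant to match.

```python
import cmath
import math

A2 = 0.05    # hypothetical quadratic coefficient (not taken from the text)
A3 = -0.05   # hypothetical cubic coefficient (not taken from the text)
N = 64       # samples over one period of normalized frequency 1


def transfer(x):
    """Predominantly linear system with one quadratic and one cubic term."""
    return x + A2 * x ** 2 + A3 * x ** 3


def amplitude(samples, m):
    """Amplitude of the component at integer frequency m (single-bin DFT)."""
    n = len(samples)
    s = sum(y * cmath.exp(-2j * math.pi * m * k / n) for k, y in enumerate(samples))
    return abs(s) / n if m == 0 else 2 * abs(s) / n


f1, f2 = 3, 4  # just fourth on the normalized frequency axis
signal = [math.cos(2 * math.pi * f1 * k / N) + math.cos(2 * math.pi * f2 * k / N)
          for k in range(N)]
output = [transfer(x) for x in signal]
spectrum = {m: amplitude(output, m) for m in range(13)}

for m, a in spectrum.items():
    if a > 1e-9:
        level_db = 20 * math.log10(a / spectrum[f1])
        print(f"frequency {m:2d}: {level_db:6.1f} dB relative to f1")
```

With these assumed coefficients, components appear at 0, 1, 2, 5 and 6 as described above, and further products show up at 7 and higher, beyond the tones closest to the original interval.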

We recognize the face of a person not because it displays obvious elements like mouth, nose, eyes and ears. There are much subtler features that escape our consciousness but add gestalt attributes to a face, and that our brain uses for identification. In a similar way we do not identify intervals because of their timbre. We recognize them independently from the instrument that produces them; we even recognize them when they consist of mere sine tones. But there are the combination tones that provide gestalt elements to the interval, and the author holds that the brain uses this gestalt enhancement in a twofold manner:

First, the gestalt of the interval has changed from two single tones into kind of a harmonic series which is based on frequency 1. The author believes that the human brain is able to detect this feature and that an interval appears the more consonant, the more this harmonic series nears completion. (Not surprisingly, the most complete harmonic series among the intervals forming the Western 12-tone scale is generated by the octave, followed in this order by the fifth, the fourth, the sixth, the major third, the minor third, and so on, and that is exactly how we rate their consonance.)

Second, and for the development of musical scales perhaps even more important, this series of tones contains several other intervals, in the above example for instance three octaves (1:2, 2:4, 3:6), two fifths (2:3, 4:6), a sixth (3:5), a major third (4:5) and a minor third (5:6). Consciously, we cannot filter these intervals out of the sound cluster that we hear, but subconsciously our brain recognizes them and is therefore prepared for them if, in the context of a musical composition, they appear on their own as centerpieces of similar clusters. The brain considers these intervals compatible (or related), and it is therefore important for a musical scale to contain them as building blocks. (It is, by the way, not decisive whether we start with the fourth, as shown here, or with the fifth, the major third or the minor third; the resulting harmonic series contain predominantly the intervals that form the Western 12-tone scale. That changes, however, if we start with the sixth (3:5); in that case the harmonic series, now consisting mainly of odd harmonics, produces predominantly the intervals of the Bohlen-Pierce scale.) Intervals that are not compatible with these harmonic series are judged by the brain as out of place.
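The interval count given for the just fourth can be verified with a short sketch. It assumes that the relevant tone cluster consists of the two original tones plus the products named in the discussion of Fig. 4 (f2 - f1, 2f1 - f2, 2f2 - f1 and 2f1); that selection and the function names are assumptions made for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def closest_tones(p, q):
    """Original tones p < q plus the nearby products discussed for Fig. 4."""
    candidates = {q - p, 2 * p - q, p, q, 2 * q - p, 2 * p}
    return sorted(t for t in candidates if t > 0)


def interval_counts(tones):
    """Count every pairwise interval, reduced to lowest terms (upper:lower)."""
    return Counter(Fraction(hi, lo) for lo, hi in combinations(tones, 2))


fourth = closest_tones(3, 4)   # cluster generated by the just fourth
counts = interval_counts(fourth)
print("tones:", fourth)
for ratio, count in sorted(counts.items()):
    print(f"{ratio.denominator}:{ratio.numerator} occurs {count} time(s)")

sixth = closest_tones(3, 5)    # cluster generated by the sixth 3:5
print("tones from 3:5:", sixth)
```

Under the same assumption, the cluster generated by the sixth 3:5 contains the odd harmonics 1, 3, 5 and 7, in line with the remark about the Bohlen-Pierce intervals.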

This altogether results in a "principle of consonance" (gestalt compatibility between intervals) for scales suitable for polyphonic, harmonic music:

The non-linearity of the ear, mainly located in middle ear and basilar membrane, changes the gestalt impression of intervals and chords by causing combination tones and altering harmonics. These thus modified composite sounds reduce the choice of tones, which can form a usable scale for harmonic polyphonic music, to the members of a small range of intervals with compatible gestalt features.

Under realistic conditions, the original interval tones would not be free of overtones. If these partials are harmonics, the basic effect does not change, because the harmonics simply reinforce the harmonic series caused by non-linearity. The tone 2f1 - f2, for example, is now also supported as the difference tone between the second harmonic of f1 and f2. Likewise, all harmonics of both tones interact, strengthening the harmonic series, provided that the original interval can be described as a ratio of products of small primes.

But this example also demonstrates where the "principle of consonance" is no longer applicable. If the partials are not harmonic, as in the case of gamelan instruments for example, they and the combination tones caused by them destroy the harmonic series generated by the two original frequencies. An entirely different aesthetic approach is required to cope with this situation, culminating in non-harmonic scales like pelog and slendro.

3. Sensory dissonance

The two principles described above are not the whole story. Depending on the timbre of the available voices, sensory dissonance sets limits to the choice of intervals that can be used in a scale for polyphonic music, as described by William A. Sethares [7], building mainly on the work of Hermann von Helmholtz [8] and of R. Plomp and W. J. M. Levelt [9]. Partials, original tones and combination tones that lie less than a critical bandwidth apart can cause unpleasant beating sensations or roughness. Sensory dissonance calculations show which intervals are relatively free of these sensations, and thus can provide good guidance for choosing the members of a scale fit for a specific set of instruments. This is the third independent criterion regarding material for harmonic scales.

However, consonance and dissonance each have a significance of their own; they are not each other's opposites. Just as a lack of beauty does not necessarily mean ugliness, fading consonance does not simply morph into dissonance. Conversely, the absence of sensory dissonance does not necessarily convey the impression of consonance. A consonant interval can retain much of its character even when affected by a degree of beating and even roughness, and merely dodging sensory dissonance does not by itself create a useful musical scale.
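A sensory dissonance calculation of the kind referred to in this section might be sketched as follows. The pairwise roughness curve and its constants follow the parametrization commonly quoted from Sethares's fit to the Plomp-Levelt data; they are reproduced from memory and should be treated as assumptions to be checked against [7]. The six-partial timbre with geometric roll-off is hypothetical.

```python
import math

# Constants as commonly quoted for Sethares's fit to the Plomp-Levelt curves;
# treat them as assumptions and check them against [7] before relying on them.
D_STAR, S1, S2 = 0.24, 0.021, 19.0
A_DECAY, B_DECAY = 3.5, 5.75


def pair_roughness(f_a, f_b, amp_a=1.0, amp_b=1.0):
    """Roughness contribution of two sine partials (arbitrary units)."""
    f_low, f_high = min(f_a, f_b), max(f_a, f_b)
    s = D_STAR / (S1 * f_low + S2)   # rescales the curve with register
    d = f_high - f_low
    return amp_a * amp_b * (math.exp(-A_DECAY * s * d) - math.exp(-B_DECAY * s * d))


def total_roughness(partials):
    """Sum of pairwise roughness over a list of (frequency, amplitude) partials."""
    return sum(pair_roughness(fa, fb, aa, ab)
               for i, (fa, aa) in enumerate(partials)
               for fb, ab in partials[i + 1:])


def harmonic_tone(f0, n_partials=6, rolloff=0.88):
    """Hypothetical timbre: harmonic partials with geometrically falling amplitude."""
    return [(f0 * k, rolloff ** (k - 1)) for k in range(1, n_partials + 1)]


fifth = total_roughness(harmonic_tone(220.0) + harmonic_tone(330.0))
semitone = total_roughness(harmonic_tone(220.0) + harmonic_tone(220.0 * 2 ** (1 / 12)))
print(f"just fifth: {fifth:.3f}, tempered semitone: {semitone:.3f}")
```

A low value in such a calculation only indicates little roughness for the chosen timbre; in keeping with the paragraph above, it says nothing by itself about consonance.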

4. Consequence

"Complex models are rarely useful, except for those writing their dissertations."
Vladimir Igorevich Arnold

Thus a simplistic model of the process that generates scales, which can be used to create harmonic and tonal music, may be sketched like this:

Organizing principle | Principle of consonance | Principle of equidistance
Location of origin | Middle ear & inner ear | Inner ear
Responsible organs | Tympanic membrane, middle ear ossicles & basilar membrane | Organ of Corti
Deciding physiological organ property | Amplitude non-linearity | Exponential frequency sensitivity
Resulting psycho-acoustical sensation | Gestalt enhancement through combination tones results in sensation of consonance & compatibility of intervals and chords | Exponential sequence of pitches appears to be equidistant
Consequent demand on scale structure | Accommodation of a maximum of compatible intervals | Approximately exponential sequence of pitches, f_n/f_0 ~ K^{n/N}
Result of combined principles | Scales usable for harmonic and tonal music (if limitations posed by sensory dissonance are respected)


Scales for this purpose can be developed by a variety of means but will fail to be attractive if the result violates either of the two principles described above. Yet abiding by both of them simultaneously limits the number of such scales to a small pool, even when allowing for some tolerance. The numerically simplest of these scales are the Western 12-tone scale, whose just version is a reasonable approximation to

f_n/f_0 = 2^{n/12}

and the Bohlen-Pierce scale (BP), whose just form is a fairly close approximation to

f_n/f_0 = 3^{n/13}.
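How closely the two exponential sequences approximate just intervals can be estimated with a short sketch. The Western ratios are the intervals named in the text; the odd-number ratios used for the Bohlen-Pierce scale (7:5, 7:3 and 9:7 alongside 5:3) are chosen here for illustration and are not listed in the text above. Function names are likewise illustrative.

```python
import math


def nearest_step(ratio, frame, steps):
    """Scale step whose pitch frame**(n/steps) lies closest to a just ratio."""
    return round(steps * math.log(ratio) / math.log(frame))


def deviation_cents(ratio, frame, steps):
    """Tempered pitch minus just pitch, in cents (1200 cents per octave)."""
    n = nearest_step(ratio, frame, steps)
    tempered = frame ** (n / steps)
    return 1200 * math.log2(tempered / ratio)


western_just = {"fifth 3:2": 3 / 2, "fourth 4:3": 4 / 3, "sixth 5:3": 5 / 3,
                "major third 5:4": 5 / 4, "minor third 6:5": 6 / 5}
bp_just = {"5:3": 5 / 3, "7:5": 7 / 5, "7:3": 7 / 3, "9:7": 9 / 7}

for name, ratio in western_just.items():
    print(f"12-tone {name}: step {nearest_step(ratio, 2, 12)}, "
          f"{deviation_cents(ratio, 2, 12):+.1f} cents")
for name, ratio in bp_just.items():
    print(f"BP {name}: step {nearest_step(ratio, 3, 13)}, "
          f"{deviation_cents(ratio, 3, 13):+.1f} cents")
```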


[1] Bohlen, Heinz: Manuscript, untitled, undated, pencil, 24 pages (in German). Hamburg, early 1972.
Original archived at Huygens-Fokker Foundation (Stichting Huygens-Fokker), Amsterdam.
The paper describes the derivation of the 13-step scale (later BP) in both a just and equal-tempered form, in conformance with two basic, independent principles: consonance with combination tones, and approximate equidistance of scale steps. The presentation comprises a 13-step chromatic and four 9-step diatonic versions.

[2] Bohlen, Heinz: Die Bildungsgesetze des 12-stufigen Tonsystems und ihre Anwendung auf einen Sonderfall.
Manuscript, ink, 50 pages, Hamburg, July 1972.
Original archived at Huygens-Fokker Foundation (Stichting Huygens-Fokker), Amsterdam.
Mainly an expanded and refined version of [1], containing also first considerations of realization essentials (tone names, notation, plans for an electronic organ).

[3] Bohlen, Heinz: Versuch über den Aufbau eines tonalen Systems auf der Basis einer 13-stufigen Skala.
Manuscript, first version, typed, 7 pages, Hamburg, Dec. 1972.
Manuscript, second version, typed, 9 pages, Hamburg, July 1974.
Originals archived at Huygens-Fokker Foundation (Stichting Huygens-Fokker), Amsterdam.
These are mainly abstracts of [2], intended to inform a selected readership about the 13-step scale.

[4] Greenwood, D.D.: Auditory Masking and the Critical Band. Journal of the Acoustical Society of America, vol. 33, pp. 484–502, 1961a.

[5] Greenwood, D.D.: Critical Bandwidth and the Frequency Coordinates of the Basilar Membrane. Journal of the Acoustical Society of America, vol. 33, pp. 1344–1356, 1961b.

[6] Greenwood, D.D.: A cochlear frequency-position function for several species - 29 years later. Journal of the Acoustical Society of America, vol. 87, pp. 2592–2605, 1990.

[7] Sethares, William A.: Tuning, Timbre, Spectrum, Scale. Springer-Verlag London Limited, 1998, pp. 165–188.

[8] Helmholtz, Hermann von: On the Sensations of Tone (Die Lehre von den Tonempfindungen). Dover Publications, New York 1954 (1885), pp. 152–233.

[9] Plomp, R. and W. J. M. Levelt: Tonal Consonance and Critical Bandwidth. Journal of the Acoustical Society of America, vol. 38, pp. 548–560, 1965.

