source file: mills2.txt
Date: Sat, 14 Oct 1995 07:39:20 -0700
From: "John H. Chalmers"
From: mclaren
Subject: Tuning & psychoacoustics - post 20 of 25
---
As is now evident, the Fourier transform is at best poorly suited to the analysis of real-world sounds. Noise, inharmonic partials and radical phase changes ensure that many real-world instrument tones are poorly modelled as a sum of pure sinusoids near-constant in magnitude and phase. This is complicated by the fact that "...much of the characteristic sound of an instrument is in its transient regions, such as the attack portion of its tone." [Moorer, J. A., "Signal Processing Aspects of Computer Music - A Survey," Computer Music Journal, Vol. 2, No. 2, pg. 7]

To date, the aperiodic and chaotic attack portions of instrumental notes have consistently resisted analysis, as have the residual stochastic portions of the sound, which cannot be analyzed successfully by current techniques: a variety of work-arounds are generally used to "smooth over" the radical phase and magnitude discontinuities generated by Fourier analysis of the attack transients, or to parametrize the chaotic-attractor "residual" and mimic it with some kind of bandlimited noise generator added to the resynthesized signal.

"Each...spectrogram tells how amplitude and phase vary as a function of frequency. This process of analysis and resynthesis has been called a phase vocoder. Serra made use of the process to his ends, taking successive spectra at intervals of around 10 milliseconds.

"Such successive spectra do not in themselves give a deep insight into musical sounds. Serra's innovation was to use successive spectra in dividing the signal into two parts--a deterministic, or predictable, part and a stochastic, or unpredictable, noisy part. The deterministic part Serra took to be clear peaks which in several successive spectra change just a little in amplitude and phase. This part of the spectrum Serra resynthesized by generating the individual sinusoidal components whose amplitudes, frequencies, and phases changed with time in the fashion indicated by the successive spectra. (...) [Serra] replaced this [non-deterministic] part of the spectrum with a noise that had roughly the same overall spectrum as his stochastic part but that didn't match it in waveform.

"Serra tested this division of the signal into a deterministic and stochastic part...by listening separately to the deterministic and stochastic parts, and then adding them and listening to their sum. A piano sound reconstructed from the deterministic spectra alone didn't sound like a piano. With the stochastic or noise portions added, it sounded just like a piano. The same was true of a guitar, a flute, a drum, even the human voice." [Pierce, J.R., "The Science of Musical Sound," 2nd ed., 1992, pg. 106]

Moreover, Fourier techniques work even to a rough approximation only with a small class of harmonic-series instruments (Western brass instruments, double reeds and strings, the harp). "Most percussion instruments (drums, bells) are inharmonic. This means that to describe them with sinusoids often requires a large number of such. There are some synthesis techniques for creating what often turn out to be quite convincing drum-like or gong-like sounds, but to date there are few analysis techniques that can be used with inharmonic sounds." [Moorer, J. A., "Signal Processing Aspects of Computer Music - A Survey," Computer Music Journal, Vol. 2, No. 2, pg. 7]
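The deterministic-plus-stochastic split that Pierce describes can be made concrete in a few lines of Python. What follows is a deliberately crude sketch of the idea, not Serra's actual spectral-modeling code: real SMS tracks partials from frame to frame, whereas this toy version simply keeps the strongest bins of each STFT frame. The frame size, the "keep the 40 strongest bins" rule, and the name crude_sms_split are arbitrary assumptions of mine.

    import numpy as np
    from scipy.signal import stft, istft

    def crude_sms_split(x, fs, nperseg=1024, n_peaks=40):
        """Very rough split of x into a 'deterministic' part (strong spectral
        peaks kept with their measured magnitudes and phases) and a 'stochastic'
        part (everything else, replaced by noise with roughly the same
        frame-by-frame magnitude spectrum but an unrelated waveform)."""
        _, _, X = stft(x, fs, nperseg=nperseg)          # successive spectra
        det = np.zeros_like(X)
        for j in range(X.shape[1]):                     # one frame at a time
            frame = X[:, j]
            strongest = np.argsort(np.abs(frame))[-n_peaks:]
            det[strongest, j] = frame[strongest]        # keep magnitude AND phase
        resid = X - det                                 # what the peaks miss
        # Replace the residual with noise shaped by the residual's magnitude
        # spectrum, in the spirit of what Pierce reports Serra doing.
        random_phase = np.exp(2j * np.pi * np.random.rand(*resid.shape))
        stoch = np.abs(resid) * random_phase
        _, x_det = istft(det, fs, nperseg=nperseg)
        _, x_stoch = istft(stoch, fs, nperseg=nperseg)
        return x_det, x_stoch

Listening to x_det by itself, to x_stoch by itself, and then to their sum is the kind of test Pierce reports: the sinusoidal part alone sounds thin and synthetic until the noise part is added back in.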
Bearing in mind that this puts all of Javanese and Balinese and Thai and most African and South American music off-limits to Fourier analysis (because these cultures use mainly inharmonic instruments), the value of Fourier analysis becomes questionable in the context of world music.

The net result is that conventional FFT analysis *sometimes* tells us *something* about what is going on in *portions* of *a few notes* played by *a few instruments.* However, the Fourier transform is far from the universal mathematical Swiss Army Knife it has been touted as being. For example, suppose we try to increase frequency resolution by taking an FFT of tens of thousands of points--we have only 2 ways of doing this. [1] Pad tens of thousands of zeroes onto the end of each wavecycle, which merely refines the accuracy with which each bin's frequency is specified but does not tell us anything about what's going on between the frequency bins (where most of the interesting and complex behavior of real-world instruments takes place); or [2] extend our FFT over multiple wavecycles, which does give us some information about what's going on between frequency bins because the fundamental of the wavecycle is apt to change over the course of several periods--but this dodge lumps all the spectral changes in 2, 3, or more wavecycles into a single analysis frame and thus "smears out" the spectral changes of the sound in time. If this sounds like "Catch-22," guess what? It is. (A numerical sketch of both dodges follows the list below.)

Heisenberg's Uncertainty Principle is actually an outgrowth of the basic characteristic of wave motion: to wit, you can accurately measure *either* the frequency of a changing waveform *or* the moment at which it occurs, but you cannot measure *both* precisely at the *same time.* And when you increase the precision with which you localize the wavetrain in time, you consequently decrease the precision with which you measure the wavetrain's frequency. This has profoundly important consequences for the FFT.

[1] Increasing your time resolution (that is, making the FFT snapshots closer together in time) decreases your frequency resolution (because you must therefore take smaller FFTs over those smaller time-lengths).

[2] Increasing your frequency resolution (that is, taking the FFT snapshot over a bunch of different spectrally-evolving wavecycles) decreases your time resolution (because all the spectral changes in each frequency line are lumped into a single frequency and averaged over the number of wavecycles you're looking at).

[3] Increasing the number of phase points, to give you a more precise measurement of the exact amount by which a partial is detuned from the others, also increases the number of frequency bins--and to do this you must extend the FFT over a longer time-period, which in turn means you're lumping your phase changes together and averaging them out, which entirely defeats your purpose.

[4] Decreasing the number of frequency points, to narrow down the time window over which you take the FFT "snapshot," also decreases the number of phase points--which divides the fundamental frequency into a smaller number of divisions and defeats your purpose by making the measured frequency changes coarser.
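Both dodges are easy to see in a few lines of numpy. The sketch below is my own illustration, not anything from the sources cited above: the 440 Hz and 460 Hz partials, the 1024-sample snapshot and the 32768-point padding are arbitrary assumptions, and it pads a fixed-length snapshot rather than individual wavecycles.

    import numpy as np

    fs = 44100
    t = np.arange(4096) / fs
    x = np.sin(2*np.pi*440*t) + np.sin(2*np.pi*460*t)   # two partials 20 Hz apart

    short = x[:1024]                                    # ~23 ms snapshot

    # Dodge [1]: zero-pad the short snapshot out to 32768 points.  The spectrum
    # is sampled on a much finer frequency grid, but the two partials still fuse
    # into a single lump: padding only interpolates the same smeared spectrum,
    # it adds no information about what happens "between the bins."
    X_pad = np.abs(np.fft.rfft(short, n=32768))
    f_pad = np.fft.rfftfreq(32768, 1/fs)

    # Dodge [2]: analyze a longer stretch of signal (4096 samples, ~93 ms).
    # Now the two partials resolve into separate peaks--but any spectral change
    # inside those 93 ms has been averaged into one analysis frame.
    X_long = np.abs(np.fft.rfft(x))
    f_long = np.fft.rfftfreq(len(x), 1/fs)

    for name, freqs, mags in [("zero-padded 1024-point snapshot", f_pad, X_pad),
                              ("true 4096-point window", f_long, X_long)]:
        band = (freqs > 400) & (freqs < 480)
        top2 = freqs[band][np.argsort(mags[band])[-2:]]
        print(name, "-> two largest bins near", np.round(np.sort(top2), 1), "Hz")

With these numbers the zero-padded analysis still reports one merged lump (its two largest bins sit next to each other somewhere between the true frequencies), while the longer window separates the partials--at the cost of averaging everything that happened across the full 93 milliseconds into a single frame.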
Because the discrete Fourier transform is cyclic and imposes an infinite periodicity (both supersonic and subsonic) on your spectrum, you must use a limited fixed sampling rate and band-limit your input samples to avoid aliasing (that is, to keep the lowest supersonic and the highest subsonic frequency components from bleeding into your audible frequency bins and contaminating them with inharmonic-sounding spurious garbage). But once you fix your sampling rate, you've thrown out all information between samples. This means that there's no way to reconstruct the detail in the waveform "between" the samples, because it's gone. You've dumped it out. You've thrown out the baby with the bathwater. The price you pay for perfect reconstruction of a signal is that the signal that spews out of your mathematical analysis algorithm is sometimes quite different from the real analog signal that came in. "Sometimes" because if there's very little noise and the partials are almost perfectly harmonic, you get good results with the Fourier transform even in its discrete version. Alas, most sounds are *not* noise-free and perfectly harmonic. (A few lines of code below show this fold-over in action.)

"If you have a hammer, everything in the world looks like a nail." This is nowhere more true than of the Fourier transform. Many writers have begun articles on tuning or acoustics with statements along the lines of: "Musical sounds are made up of sinusoidal frequency and phase components..." No! Wrong! Completely false! Classic error. These people have mistaken the *map* for the *territory.* They have confused the *mathematical model* of the physical phenomenon with the *physical phenomenon* itself. *Sounds* are displacements of air molecules. *Sinusoids* are ideal mathematical entities, infinite in temporal extent and perfectly periodic. Sometimes one or another mathematical model works well in analyzing acoustical phenomena; other times they all work well; sometimes *none* yield useful results. Different mathematical techniques and different conceptual models are required for different acoustical phenomena, as Risset points out. There is no "one size fits all." Yet this is exactly what the Fourier tykes would have us believe.

The universe exhibits what the mathematicians call "the inexhaustibility of the real." Goedel proved this in 1930: the universe is ultimately more complex than any statements we can make about it mathematically. Or, to put it another way, there are an infinite number of true but unprovable propositions. Clearly proponents of this or that tuning system who write circularly-reasoned arguments a la "that's the beauty of mathematics--it has an inescapable logic" haven't collided with the real world in the form of a sound that turns to junk when you Fourier analyze it. For example, breathy vocal sounds. Or flute multiphonics, or a cymbal clash, or a gamelan bar note.

It's worth remembering that Fourier never came up with his transform to solve the problem of frequency analysis. He used it as a clever dodge to solve the problem of heat conduction in a metal bar. It worked well on that problem, but it has since been greatly extended--in some cases, overextended. The wavelet transform, a probably superior mathematical method for analyzing acoustic phenomena, dates from only 1987. Since then we've had very few useful & powerful transforms. Mathematical progress has been slow.
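For what it's worth, the fold-over mentioned above is trivial to demonstrate. The sketch below is my own illustration with arbitrary numbers: an 8000 Hz sampling rate and a 5000 Hz test tone, i.e. a tone above the 4000 Hz Nyquist limit.

    import numpy as np

    fs = 8000                          # deliberately low sampling rate
    n = np.arange(fs)                  # one second of samples
    f_in = 5000                        # above the 4000 Hz Nyquist frequency

    x = np.sin(2 * np.pi * f_in * n / fs)

    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), 1 / fs)
    print("input tone:            ", f_in, "Hz")
    print("strongest analyzed bin:", freqs[np.argmax(spectrum)], "Hz")
    # The 5000 Hz tone shows up at 3000 Hz (fs - f_in): the information needed
    # to tell the two apart lived "between the samples," and once the sampling
    # rate is fixed it is gone, so the DFT's built-in periodicity folds the tone
    # down into the audible bins as a spurious inharmonic component.

Band-limiting the input before sampling is the only defense; after the fact, the 3000 Hz alias is indistinguishable from a real 3000 Hz partial.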
The Walsh Transform is no big help--it is acoustically "brittle," and if too few Walsh components are used in reconstructing a sound, the output is intolerably buzzy and distorted-sounding. Bart Kosko's fuzzy Kalman filter promises some insight into acoustic transformations, but the amount of ground gained has been small...and the rate of progress slow.

In the end, the real world is very *very* VERY complex. Our linear parametric mathematical models apply with accuracy only to a tiny class of physical phenomena. It has become increasingly (distressingly!) clear over the last few years that sounds are not "infinite harmonic wavetrains containing perfectly harmonic overtones with a few stochastic ergodic noise components mixed in." Rather, real computer analysis of actual instrument timbres shows with brutal clarity that real-world sounds run the entire gamut from chaotic strange attractors to pure noise to semi-noise/semi-pitched to strongly pitched narrowband noise to mostly harmonic sounds with a rock-steady fundamental. No one mathematical analysis technique is adequate to model all these ranges of behavior. And when you realize that something as simple as the flute can exhibit the entire range (overblown multiphonics, semi-pitched "breathy" whispery notes, flutter-tongued notes, strongly pitched notes in the lower registers with a steady fundamental), you start to realize that the FFT is useful for only a very limited class of musical sounds.

Thus it is hardly surprising that the to-date largely FFT-obsessed model of the ear as frequency analyzer has had extremely limited success in explaining real-world musical preferences, the effects of real-world intervals on listeners, and the real-world propensity of composers, performers and audiences to prefer intervals not well defined by the integer eigenvalue vocabulary of the Fourier transform.

--mclaren