source file: mills2.txt
Date: Sat, 23 Sep 1995 08:39:53 -0700
From: "John H. Chalmers"
From: mclaren
Subject: Misinformation about psychoacoustics & an upcoming series of 25 posts
---
The psychoacoustic information posted on this forum seems to stop around 1940. This would be of little concern if psychoacoustics had but a tangential effect on microtonality; alas, such is not the case. The workings of the ear/brain system are crucial to real-world music. Thus *someone* must point out the veritable deluge of incomplete information, inaccuracies and outright canards that has characterized the psychoacoustics-related posts on this forum. And since no one else has bothered to do it...

This is no one's fault. As Enrique Moreno has pointed out in his 1992 book "Expanded Tunings in Contemporary Music" (The Edwin Mellen Press, pg. 107), during this century knowledge about the ear/brain system has grown *much* faster than music professors' awareness of it, or musicians' understanding of it. Thus in pointing out the pervasive inaccuracies in the psychoacoustic (mis)information posted on this forum, my intent is to engender a quest for truth, rather than to impugn anyone's credibility or good intentions. It's important, however, that the *facts* (as opposed to the myths) about the ear/brain system come out. Differences of opinion are one thing; overt misinformation is another.

The subject of psychoacoustics, as Larry Polansky pointed out in his 1985 guest editorial in 1/1, is one of the most fruitful areas of musical exploration for the late 20th century. Anyone who doubts this need only listen to the music of the best computer composers around--almost without exception Risset, Wessel, Chowning, et alii are deeply involved in psychoacoustic research. Harry Partch made the same point about the importance of psychoacoustics in "Genesis of a Music."
Alas, as previous posts have shown, Partch quotes no psychoacoustics references dated later than 1947--and most of what Partch said about the ear/brain system turns out to be poppycock. Subsequent psychoacoustic research has largely invalidated his claims. While my earlier posts to this effect 8 months ago spurred a brief frenzy of name-calling and denial-for-the-sake-of-denial, the frenzy now seems to have abated. And most of the people who initially reacted like vampires in a crucifix factory to my posts about Partch's errors on psychoacoustics now tacitly (sometimes overtly) recognize that, yes, Partch's statements on the subject are claptrap.

This does *NOT* mean that Partch was wrong in stressing the importance of psychoacoustics in microtonality. Harry Partch was a genius, far ahead of his time. He simply had the misfortune to study a psychoacoustic literature which Max Mathews' acoustic compiler of 1959 was to render obsolete virtually overnight. Partch's essential emphasis on psychoacoustics was and is crucial to an understanding of the musical uses of microtonality. Good music *can* be composed even if one doesn't know how a given tuning interacts with the human auditory system...but common sense tells us it's an awful lot *easier* to compose microtonal music that does what you want it to do if you know something about what's going on in the ear/brain system. In short, the limits we put on the musics we make are a result of what we understand about how the ear hears. If we wish to remove the limits on the microtonal music we can imagine, we must first remove the limits on our understanding of the ear.

Thus for the next month or so John Chalmers will be uploading a series of 25 posts on tuning and psychoacoustics by Your Humble E-Mail Correspondent. This series will bear on real music and real tunings as much as on the ear/brain system, and should prove interesting to at least some of you.
Even if it doesn't, this upcoming series of posts is sure to provoke enough discussion to clarify the underlying psychoacoustic facts. Before starting that series, permit me to give a few examples of the kind of misinformation that's been drifting around on this forum since day 1.

In Topic 1 of Tuning Digest 427 Johnny Reinhard wrote: "Since Julian Carrillo recognized the physical structure of the ear as having `hairs' that corresponded - like a harp - to specific pitch frequencies, with corresponding octaves, I have wondered about these hairs or `cilia.' The New York Times wrote about them a few weeks ago and I have yet to hear about an exact number of hairs-per-octave." [Johnny Reinhard, Topic 1, Digest 427]

Compounding the confusion, David Doty, David Worrall and William Alves chimed in to "correct" Johnny's post. Doty claims: "I think there's a misunderstanding here. As I understand the matter, a given frequency excites a fairly wide region on the basilar membrane, and hence, a considerable number of hair cells. Maximum excitation occurs at the center of the band, with excitation diminishing toward the ends." [David Doty, Topic 1, Digest 429]

David Worrall claims: "Furthermore, when two tones are close together these bands of cells overlap, causing an inability to distinguish the two frequencies: hence the difference between frequency and pitch. Related to the Critical Band..." [David Worrall, Topic 3, Digest 429]

And William Alves claims: "This is true. Apparently along the neural pathways to the brain, the middle frequency is calculated, and that is what we hear." [William Alves, Topic 4, Digest 429]

The problem is that each of these statements, while in general incorrect, contains enough crumbs of fact to bamboozle the unwary reader. Johnny Reinhard's idea that each hair cell on the basilar membrane corresponds to a given frequency isn't correct, but it has an important germ of truth in it.
Why couldn't each hair cell on the basilar membrane correspond to a given musical pitch? First, because there aren't enough hairs to account for human sensitivity to small pitch changes. While there are 15,000 hair cells in the average human ear, only 3000 inner hair cells are responsible for pitch detection; the other 12,000 outer hair cells serve a supporting role. The 10-octave range of human hearing divided by 3000 inner hair cells gives a just-noticeable frequency difference limen of 4 cents, but measured human frequency sensitivity for prolonged test tones is much finer, down to +/- 1 Hz for long-held harmonic test tones.

In fact the theory that individual stereocilia are "tuned" to particular notes is not a new idea--it was first advanced by Herophilus and Erasistratus around 490 B.C. and then by Aristotle in 344 B.C. The notion was imported into Europe in modern times by Johannes Müller in his 1838 text Handbuch der Physiologie, in which he called it the doctrine of "specific nerve energies." While this theory of hearing is now known to be incorrect, at the time it was a significant advance over Empedocles' theory of "implanted air." Thus, while incorrect, "the doctrine of specific nerve energies prepared the way for interpretations of new anatomical discoveries that came with the development of better methods, such as the improved compound microscope that appeared around 1830." ["Hearing: Physiological Acoustics, Neural Coding and Psychoacoustics," W. Lawrence Gulick, George A. Gescheider and Robert D. Frisina, Oxford University Press, 1989, pg. 59]

So while this notion comes from antique sources, there's an important element of truth in Johnny's concept of the ear. As Pierre Buser & Michel Imbert point out in "Audition" (MIT Press, 1995), "some neurons [in the auditory nerve] produce strong signals when presented with tones in a particular range, but do not respond to tones in other frequency ranges.
A small proportion of neurons emit strong signals when two different frequencies are sounded together but respond weakly when either of these frequencies is sounded alone. Some neurons are activated best by sounds at particular amplitudes and less well at lower or higher amplitudes. For yet other nerve cells, the higher the amplitude of the sound, the stronger the signal that is produced, until some saturation point is reached."

Thus Johnny's specific picture is inaccurate, but it's founded on a deep insight: certain nerve cells [NOT hair cells--we're talking about neurons in the auditory nerve and other brain centers] in the ear/brain system *do* react to specific frequencies. Thus, crucial elements of the ear/brain system *are* in fact sensitive to specific frequencies (and amplitudes, and specific frequency differences, etc.), contrary to the implications of Doty, Alves, Worrall, et al.

Johnny's post also raises an important question--one which Doty, Worrall and Alves conveniently ignore. To wit: if the human ear/brain system performs a purely mechanical Fourier analysis of input sounds, what need is there for all these different kinds of neurons in the auditory nerve, the medial geniculate nucleus, the Sylvian fissure, etc.?

It is well known that the description of the ear to which Doty, Worrall and Alves refer (known as the place theory of hearing) is incomplete and conflicts with much of the psychoacoustic evidence. "A second difficulty with the place theory lies in the fact that, in complex sounds, components are often heard that are not present in the Fourier analysis, or loudness judgments of components may be made which do not agree with the amplitudes obtained for Fourier components. It is certainly true that there are phenomena which cannot at the present time be explained by the place theory of hearing." [von Bekesy, Georg, "Hearing Theories and Complex Sounds," Journ. of the Acoust. Soc. Am., 35(4), April 1963, pg.
589] Be it noted that von Bekesy is the researcher most responsible for compiling experimental evidence for the place theory. (In fact he won the Nobel prize for it.)

The place theory describes the ear as a mechanical Fourier analyzer. This model of the ear does not explain many of the observed characteristics of the ear/brain system. For example, given the width of the regions of maximal stimulation along the basilar membrane, human frequency discrimination should be quite coarse--yet tests show that we can easily detect very fine changes in frequency. "For a sinusoidal tone, the locus of maximum stimulation [of the basilar membrane] changes regularly with frequency only from about 50 through 16,000 Hz, so that place cannot account for low pitches from 20 through 50 Hz. Further, the just noticeable difference (jnd) in frequency of sinusoidal tones appears to be too small to be accounted for by spectral resolution along the basilar membrane. Figure 3.5 shows that for 500 Hz the critical band width is about 100 Hz, yet jnd's having values less than 1 Hz have been reported (Moore, 1974; Nordmark, 1968). While there are mechanisms based on place which have been proposed for discriminating tones separated by considerably less than a critical band (Bekesy, 1960; Tonndorf, 1970; Zwicker, 1970), none seems capable of handling jnd's as small as those reported by Nordmark and Moore." ["Auditory Perception: A New Synthesis," Warren, R. M., Pergamon Press, New York, 1985]

Moreover, according to the place theory the ear should be insensitive to phase. But both Newman Guttman's 1959 data and Robert M. Green's 1973 experiments show this is not the case. Most damning of all, if the ear merely performs a mechanical Fourier transform (as Worrall, Doty and Alves claim), how do we account for the ear's ability to extract a missing fundamental?
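To put the numbers from the Warren quotation side by side, here is a back-of-the-envelope sketch in Python. The 10-octave range, 3000 inner hair cells, 100 Hz critical band and 1 Hz jnd are the figures quoted above, not measurements of mine:

```python
import math

def hz_delta_to_cents(f_ref, delta_hz):
    """Size of a frequency increment above f_ref, expressed in cents."""
    return 1200 * math.log2((f_ref + delta_hz) / f_ref)

# One pitch per inner hair cell: 10 octaves spread over 3000 cells.
cents_per_cell = 10 * 1200 / 3000
print(f"one-cell resolution:     {cents_per_cell:.1f} cents")   # 4.0 cents

# Warren's comparison at 500 Hz: ~100 Hz critical band vs. <1 Hz jnd.
critical_band = hz_delta_to_cents(500, 100)   # ~316 cents, over a minor third
jnd = hz_delta_to_cents(500, 1)               # ~3.5 cents
print(f"critical band at 500 Hz: {critical_band:.1f} cents")
print(f"jnd at 500 Hz:           {jnd:.1f} cents")
print(f"jnd is roughly {critical_band / jnd:.0f}x finer than the band width")
```

The roughly hundredfold gap between the critical bandwidth and the measured jnd is the point of the Warren quotation: place alone predicts far coarser discrimination than listeners actually exhibit.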
Schouten's siren experiment, Seebeck's fifth and sixth click series, and Stumpf's experiments all contradict this model of the ear as a simple mechanical Fourier analyzer. In short, "The discovery of second order effects in auditory processing, such as the perception of phase changes, beats of mistuned consonances and fundamental tracking, has had a great impact on the theory of hearing. Indeed, these effects cannot be explained appropriately with the conventional `place' theory." [Roederer, Juan, "The Physics and Psychophysics of Music," 1973, 2nd ed., pg. 41]

Moreover, Doty, Worrall and Alves leave unanswered the question of how the ear conjures up a sensation of pitch from a flat noise spectrum. According to a Fourier model of the ear, a broad-band noise with a flat spectrum *cannot* produce a sensation of pitch. It is by definition UNpitched. Yet, as we all know, white noise generates a weak sensation of pitch. Eberhard Zwicker summarizes this point concisely: "While the pitch of the noises so far can be traced back to a spectral feature, namely a distinct change in the spectrum, broad-band noises with flat spectra also produce pitch sensations." [Zwicker, E. and Fastl, H., "Psychoacoustics: Facts and Models," 1990, pg. 121]

This conclusion is so antithetical to the Fourier description of the ear that it's no surprise that "Auditory physiologists divide into three groups, namely those that think only temporal information [from the firing of neurons] is used, those that think only place information [location on the basilar membrane] is used, and an eclectic group who suppose that temporal information is used at low frequencies and place information at high. (...) There are several lines of evidence for and against these theories, none of which is conclusive." [Pickles, James O., "An Introduction to the Physiology of Hearing," Academic Press, 1988, pg.
271] Doty, Worrall and Alves pointedly ignore this controversy between psychoacoustic researchers--charitably, we'll assume there was no duplicitous intent. Regardless, the net effect of their posts is to create a false impression. Moreover, the arguments put forward in their posts hide a multitude of contradictions and inconsistencies by giving only a few scraps of fact and suppressing the rest.

Alves claims that the centers of the basilar membrane excitatory regions are "calculated"--implying the involvement of higher brain regions. This is one of the three competing hypotheses of the ear, today known as the "pattern recognition" model. What Alves does NOT mention is that "calculation" by neurons cannot be the whole story. Recent experiments with cochlear implants in the deaf have cast grave doubts on the pattern recognition model of hearing, according to which the ear "calculates" pitches from volleys of neural impulses produced by excitation of the basilar membrane: "If we believe the extreme position that at low frequencies information is carried purely by the temporal pattern of nerve impulses, then periodic electrical stimulation should produce faithful auditory sensations and good discrimination of frequencies. The results of electrical stimulation have on the whole been disappointing for such a prediction. In only a few cases do electrical stimuli seem to produce clear tonal sensations. A typical report is that tones sound like "comb and paper" (e.g., Fourcin et al, 1979)." [Pickles, James O., "An Introduction to the Physiology of Hearing," 1988, Academic Press, pg. 316]

And in any case Alves' notion of "calculation" of center frequencies from basilar membrane maxima opens up a Pandora's box and undermines the very place theory he espouses. If higher brain centers supervene to "calculate" pitch, the Fourier analysis activity of the basilar membrane becomes suspect as an adequate explanation for human hearing.
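To make the periodicity alternative concrete, the sketch below (illustrative numbers of my own choosing, not from any cited experiment) builds a tone out of harmonics 6 through 10 of 200 Hz--the fundamental and the five lowest harmonics are entirely absent from the spectrum--and recovers the missing 200 Hz by simple autocorrelation, i.e. by measuring the waveform's repetition period rather than its Fourier components:

```python
import math

FS = 8000    # sample rate (Hz)
F0 = 200     # fundamental that is NOT present in the waveform
N = 800      # 0.1 s of signal

# Build a tone from harmonics 6..10 of F0 only; the fundamental and the
# lower five harmonics are completely missing from the spectrum.
x = [sum(math.cos(2 * math.pi * k * F0 * n / FS) for k in range(6, 11))
     for n in range(N)]

def autocorr(sig, lag):
    """Unnormalized autocorrelation of the signal at a given lag."""
    return sum(sig[n] * sig[n + lag] for n in range(len(sig) - lag))

# Search plausible pitch periods (50..400 Hz) for the strongest peak.
lags = range(FS // 400, FS // 50 + 1)
best_lag = max(lags, key=lambda L: autocorr(x, L))
pitch = FS / best_lag
print(f"detected pitch: {pitch:.1f} Hz")   # 200.0 Hz, the missing fundamental
```

No filter bank appears anywhere in this code: any mechanism with access to the temporal fine structure of the signal can find the missing fundamental, which is precisely what a purely mechanical Fourier analyzer cannot do.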
And in fact an approximate Fourier analysis is only performed on the lower six harmonics, and then only above 5 kHz...casting strong doubts on the importance of the entire mechanical Fourier analysis activity of the basilar membrane. But if basilar membrane activity *isn't* the whole picture, how to account for the Zwicker tone? How to explain Wessel's "streaming" phenomenon? How can we then explain Risset's auditory illusions? The invocation of higher brain centers leads to endless amounts of trouble for the place theory--yet, as Alves' post shows, it cannot be avoided if we want to explain many auditory phenomena. A classic catch-22.

If indeed higher brain centers *are* involved in hearing, we're now on the slippery slope. Because if the ear "calculates" frequencies rather than perceiving them via von Bekesy's model of mechanical Fourier analysis, why not posit a greater role for "calculation"...to the point where the ear's mechanical Fourier analysis role becomes unimportant? Since it only operates on the lower six harmonics and then only above 5 kHz, its importance is surely questionable to begin with. Recent models of hearing emphasize the operation of higher brain centers and discount the significance of the basilar membrane; optimum-processor models in particular tend to belie the purported importance of the basilar membrane in favor of higher-level processing.

Experimental data also show that higher brain centers are crucial to the perception of pitch: A. J. Dowling's article "The 1215-Cent Octave: Convergence of Western and Non-Western Data on Pitch Scaling," Abstract QQ5, 84th Meeting of the Acoustical Society of America, Friday, December 1, 1971, pg. 101, adduces a wealth of data proving that the preference for stretched octaves is universal. Sundberg and Lindqvist, in "Musical Octaves and Pitch," Journ. Acoust. Soc. Am., Vol. 54, No. 4, 1973, pp.
973-929, and Lichte's, Ward's, Corso's and Burns's research also show the same result: "As a rule, the perceptual octave corresponds to a fundamental frequency ratio exceeding 2:1." [Sundberg, J. and Lindqvist, J., "Musical Octaves and Pitch," JASA, 54(4), 1973, pg. 978]

All of these data broadly contradict the predictions that follow from a model of the ear as Fourier analyzer, and instead tend to support the hypothesis that the ear is a learned-response system. Moreover, the known data showing that Fourier analysis, when it operates along the basilar membrane (much of the time it doesn't; other ear/brain mechanisms operate, depending on the type of auditory input), works only for the lower 6 harmonics raise the question: how can the ear decipher the fundamental pitch of a harmonic sound whose lower 6 harmonics have been completely filtered out? Schouten and Seebeck have shown that this occurs--the Fourier-analysis view of the ear cannot possibly explain it. Again, Doty, Worrall and Alves conveniently slither & slide over this point without so much as a word of explanation.

And there's more: the brain employs feedback networks between auditory neurons in many different brain regions. Some feedback loops run from the Sylvian fissure, still others from the medial geniculate body. And in all cases neural filaments radiate from fourth-order auditory neurons in the cerebral cortex back to the geniculate body, thence to the auditory neurons of the inferior colliculus, back even farther to the primary auditory neurons of the cochlear nucleus. If the ear is a simple mechanical Fourier analyzer, why such a complex software-controlled feedback loop, full of neurons which react in so many different ways to so many different kinds of amplitudes and frequencies and frequency differences? Clearly something more is going on in the ear than the simplistic mechanical Fourier transform which Doty, Worrall and Alves claim.
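The size of the octave "stretch" reported in the studies cited above is easy to quantify. A minimal sketch: 1215 cents is the figure from Dowling's title, and A440 is my own arbitrary reference pitch:

```python
import math

def cents_to_ratio(cents):
    """Convert an interval in cents to a frequency ratio."""
    return 2 ** (cents / 1200)

def ratio_to_cents(ratio):
    """Convert a frequency ratio to an interval in cents."""
    return 1200 * math.log2(ratio)

octave_2_1 = ratio_to_cents(2.0)     # the exact 2:1 octave, 1200 cents
stretched = cents_to_ratio(1215)     # Dowling's 1215-cent perceptual octave
print(f"2:1 octave:         {octave_2_1:.0f} cents")
print(f"1215-cent octave:   {stretched:.4f} : 1")    # ~2.0174, wider than 2:1
print(f"above A440 that is: {440 * stretched:.1f} Hz rather than 880 Hz")
```

A 15-cent stretch is small--under a sixth of a semitone--but if the preference for it is consistent across listeners and cultures, a strict Fourier-analyzer model gives no reason why the preferred octave should deviate from 2:1 at all.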
My point here is not that Reinhard, Doty, Worrall and Alves were right or wrong: the big problem is that these posts (like so many others on this forum) purvey information dredged up from a single antique model of the ear/brain system as though it were *the ONLY* model of the ear/brain system. As though one single 1940-vintage hypothesis were the truth, the whole truth, and nothing but the truth. In addition, the posts by Mssrs. Reinhard, Doty, Worrall and Alves systematically avoid mentioning auditory phenomena which contradict their particular pet theory of hearing.

The place theory of hearing is the *only* theory congenial to just intonation; thus Mssrs. Doty and Alves have a hidden agenda when they purvey this notion and neglect to mention the many psychoacoustic results which minimize the significance of just intonation, or militate against it. For example, Mssrs. Doty and Alves do not mention the results of Plomp & Levelt or Kameoka and Kuriyagawa, which demonstrate that consonance is a matter not of harmonic spectra but of partials separated by more than a critical bandwidth; moreover, K&K, von Bekesy, Johan Sundberg and John Pierce all point out that even in pure harmonic spectra theoretically consonant intervals can sound dissonant depending on the sequence of overtones and the temporal sequence of notes.

"It became clear that the fifth was not always a consonant interval. A chord of two tones that consists of only odd harmonics, for example, shows much worse consonance at the fifth (2:3) than at the major sixth (3:5) or some other frequency ratios. This was proved true by psychological experiments carried out in another institute (Sensory Inspection Committee in the Japan Union of Scientists and Engineers) with a different method of scaling. Thus, the fact warns against making a mistake in applying the conventional theory of harmony to musical tones that can take on a variety in the harmonic structure." [Kameoka, A. and Kuriyagawa, M.,
"Consonance Theory Part II: Consonance of Complex Tones and Its Calculation Method," Journ. Acoust. Soc. Am., 45(6), 1969, pg. 1460] Etc., etc.

While Doty, Alves, Canright and others have puffed up the aging and antiquated place theory as though it were the only one ever advanced to explain the operation of the ear, there are in fact 3 models of human hearing: the place theory; the periodicity theory; and the learned-response (or software) model. Thus the place theory, put forward as THE model of human hearing by Doty, Worrall and Alves, is merely one of several. It has a lot of problems. There's plenty of data contradicting this model of the ear/brain system--none of which has ever been mentioned in this forum by anyone but Your Humble E-Mail Correspondent.

Even more surprising is the fact that *none* of these theories of human hearing is at all new. All 3 models of the ear/brain system were proposed within 5 years of each other--between 1841 and 1845. Moreover, experimental evidence was brought forward by proponents of each of these psychoacoustic models to confute the other paradigms. Thus, while some psychoacoustic experiments strongly support each model of human hearing, other evidence strongly contradicts it.

For example, strong evidence for the place theory (ear as mechanical Fourier analyzer) is provided by the "cocktail party effect," through which the ear easily manages to extract a single conversation from many overlapping sounds. Strong evidence against the place theory, however, comes from the Schouten/Seebeck missing fundamental effect (the ear's uncanny ability to "fill in" a fundamental that's been filtered out of a sound, which can only be explained if the ear is detecting the fundamental periodicity of the sound rather than its Fourier components). Strong evidence *for* the periodicity theory of hearing is provided by the observed neural coding of sounds as pulse-coded action potentials along the auditory nerve, measured via microelectrodes.
However, strong evidence *against* the periodicity theory comes from the calculated width of the pulses on the auditory nerve--a datum leading inevitably to the conclusion that the ear cannot sense frequencies higher than 1600 Hz due to the latency of the auditory nerve and the known limit of propagation of nerve impulses in the human nervous system (circa 240 mph). Since the ear is obviously sensitive to frequencies *higher* than 1600 Hz, the periodicity theory cannot be the whole story either.

Strong evidence for the Fetis/Ward/Burns theory of the ear/brain system as governed by learned response comes from the many musical cultures on this planet...many of which do not use harmonic timbres or just intervals. (Are the Javanese and the Balinese deluded? Or are their inner ears physically different from ours? There is no evidence for this; yet--as Marc Perlman has pointed out--no one has shown how Javanese or Balinese music can be described by small whole-number ratios. Ditto the musics of the Thais, of sub-Saharan Africa, of various Brazilian Indian tribes, of the Mongolian herders and of the indigenous peoples of Nepal and Tibet...ad infinitum.) Strong evidence *against* the Fetis/Ward/Burns model of the ear as molded primarily by learned response comes from the data showing that Fourier analysis of the lower 6 harmonics does occur on the basilar membrane for at least some kinds of tones.

And so the reality of psychoacoustics is a lot more complex than the simplistic claims made on this forum. One reason for this confusion has been the rate at which science has progressed over the last 70 years. Up to the late 1940s, researchers' understanding of the entire ear/brain system stopped at the physical level--the level of the basilar membrane. (These are the results, primarily from von Bekesy, cited by most of the subscribers to this forum. As we have seen, however, von Bekesy himself was well aware of the problems with this view of the ear.)
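As an aside, the arithmetic behind the 1600 Hz ceiling mentioned earlier can be sketched in a few lines. Caveat: the 0.625 ms refractory period below is an assumed round number chosen to reproduce the quoted limit, and real auditory fibers partly evade it by volley coding across many fibers, so this illustrates the argument rather than modeling the physiology:

```python
# Assumed figures, chosen to match the numbers quoted in the text:
REFRACTORY_S = 0.000625      # ~0.6 ms minimum interval between spikes
PROPAGATION_MPH = 240        # quoted nerve-impulse propagation limit

# A single fiber firing once per waveform cycle cannot follow any
# frequency faster than the reciprocal of its refractory period:
max_coded_hz = 1 / REFRACTORY_S
print(f"single-fiber ceiling: {max_coded_hz:.0f} Hz")   # 1600 Hz

# The quoted propagation speed, converted to SI units:
mps = PROPAGATION_MPH * 0.44704   # 1 mph = 0.44704 m/s exactly
print(f"propagation speed: {mps:.1f} m/s")
```

Since we plainly hear sine tones well above 1600 Hz, a pure one-spike-per-cycle periodicity code cannot carry the whole load--which is the point of the paragraph above.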
Then, in the late 1940s, technology for the first time allowed researchers to insert microelectrodes into the auditory nerves of living creatures and study the pattern of nerve impulses generated by sound. Thus, while the view of researchers during the 1920s-1940s was that the ear was a purely physical system that performed a mechanical Fourier analysis of input sound waves, the view changed after the 1940s. Researchers who analyzed the pattern of nerve impulses in the ear/brain system began to emphasize the role of neural coding in human perception of sound. The application of the computer from the 1960s onward led to a third theory of the ear/brain system which emphasized the "software" aspects of the brain and the role of software-controlled feedback paths in the ear/brain system. Roger Shepard, Carol Krumhansl, Burns, Ward, and many others have advanced such models of the ear within the last few years.

From this standpoint it becomes clear why so many older textbooks are filled with so much outdated or simply incorrect information. It is less easy to explain why so many subscribers to this forum continue to blow the dust off moldy texts and cite obsolete hypotheses as though a majority of current researchers still believed them. With any luck, my upcoming series of posts will clear up some of this confusion, and replace citations of obsolete hypotheses & texts dating from the 1940s with modern research and accurate data.

--mclaren