Help for Tune Smithy

Analyse sound

From Tune Smithy

Revision as of 11:31, 27 July 2008 by WikiSysop (Talk | contribs)



This page has help for Tasks | Analyse Midi voice and Tasks | Analyse Recording or Midi Voice .

Analyse Midi voice

Tasks | Analyse Midi Voice

This lets you play a note on your sound card or soft synth, and see its frequency spectrum when you stop play. You need a full duplex soundcard for this - one that can play and record at the same time. Luckily, many cards are full duplex nowadays, so there's a reasonable chance that you can do this - just try the method as described. If it doesn't work, the chances are that you need to get a new sound card for your computer - or perhaps an external sound card if your computer has a USB port.

The strong point of this task in FTS, and the main reason it is included here, is that it uses various techniques to refine the measurement of the frequency of the partial to get particularly accurate pitch measurements. However, it is not always so good at finding the peaks and discriminating them from noise, so it is best to look at the spectrum, and you may need to add / remove peaks by hand. Another reason for including this task is that it can be used to make custom voices from the partials.

Anyway this is what you do:

First click the Standard Settings button for this task.

Now use Record Control... to show the volume controls for recording. What you see here depends on your soundcard. Select Midi, What you hear, or anything else that looks as though it will record the sounds played in FTS - which are played in Midi. Check that it isn't muted and that the volume is set high enough, not at zero.

Choose How much of the sound to analyse .

Select a Midi voice to analyse using the Voices menu, then play it for a few seconds. The sound is automatically analysed when you click stop. Now click Show Freq. to show the frequency spectrum.

The blue dots show the partials. Add or remove dots using Ctrl + click or Ctrl + right click on the spectrum.

You will find that with the standard settings, very small fluctuations get ignored, as these are often the result of noise or short term inharmonicities in the attack, etc. To configure the way these get ignored use: Options | Finding peaks in the spectrum....

There are two ways of showing a spectrum - as linear amplitudes, which gives much sharper looking peaks, or as decibels, which corresponds to the way we hear sound, and makes the peaks look much broader. You can change between these from Freq analysis | Frequency spectrum | Options | What to show | decibels.

Frequency spectra are often used mainly for finding the partials of a single note. However, you can also use them to find the component pitches of a chord. If the notes are played using a harmonic timbre, then the harmonic series analysis may be helpful for this - see below.

To try out your new analysis, click Make Waveform from partials, which makes a new waveform out of pure frequencies (sine waves). Then click Play synthesised wave, and compare it with the original. You may like to click Volume envelope here, which makes the new waveform with the same volume envelope (attack and fade away at the end) as the original. This is particularly useful for comparing percussion and plucked instruments with the original.
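As a rough illustration of what making a waveform from partials involves, here is a minimal Python sketch that adds sine waves together. It is not the FTS code: the function name `synthesise`, the (hertz, decibels) input layout, and the reading of decibels as 20 times the log of the amplitude are all assumptions made for the example, and no volume envelope is applied.

```python
import math

def synthesise(partials, seconds=1.0, rate=44100):
    """Sum sine waves for a list of (hertz, decibels) partials.

    Decibels are treated as relative levels (assumed 20*log10 of amplitude),
    and the result is normalised so the largest sample has magnitude 1.0.
    """
    n = int(round(seconds * rate))
    amps = [(hz, 10 ** (db / 20.0)) for hz, db in partials]
    wave = [
        sum(a * math.sin(2 * math.pi * hz * i / rate) for hz, a in amps)
        for i in range(n)
    ]
    peak = max(abs(s) for s in wave) or 1.0
    return [s / peak for s in wave]

# Example: a fundamental with two weaker harmonics, 0.1 seconds long.
wave = synthesise([(220.0, 100.0), (440.0, 90.0), (660.0, 80.0)], seconds=0.1)
```

The resulting list of samples could then be written to a .WAV file or compared with the original recording.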

To show the values in your text editor, use Show partials as text file .

Now for a really fun part - you can make a custom voice from your analysis. For instance, suppose you analyse the oboe, and want to know what a glockenspiel would sound like playing those oboe partials. Well, you can do exactly this in FTS. Use Make partials into custom voice timbre. When the window pops up with the new custom voice, use Select Voice or non melod. perc., and select the glockenspiel to play all the partials. Then it is ready to use and you will find it in Voices | Custom Voices. You can also select it into the highlighted part in the Parts window from Edit Custom voice as timbre | Select into Highlighted Part.

Auto record and analyse - standard setting. The sound is automatically recorded when you click Play, and analysed when you click stop. This only works when you are showing one of the two tasks for analysing sounds.

Note that if you make a custom voice from your partials, and select it into the highlighted channel, then click the play button, you will of course now find the partials for your new custom voice.

If you want to hear your new custom voice and still keep the results of your previous analysis in the Frequency Spectrum window, unselect Auto record and analyse first.

Standard Settings - This has the same effect as File | New with 0 0 0 for the Seed, to make repeating notes. It also sets the FFT to search the entire waveform (instead of a selected detail), and sets the time for one note to one second. The idea is that rather than play a fractal tune, you want to play a single repeated note to analyse.

Voices - select the voice you want to analyse from the voices menu (or non melodic percussion menu) - this button has the same effect as Voice | Voices...

Note (secs) - If analysing a voice that dies away, like guitar or piano, you will want repeated short notes, otherwise, you can set this to some large value like 10000 secs.

Auto record and analyse - When selected, the sound is recorded whenever you play the sound using the main window play button for one of the sound analysis tasks, and analysed when you click stop.

How much to analyse (secs) - How much of the recording to use for the analysis. Some lengths of time are more convenient for analysis than others, so the actual length used could be smaller or larger than this (up to a factor of two either way). It will use a little more than the amount you enter here if there is more of the recording available, and less if there isn't enough of it to get to the next convenient amount of time. The analysis uses FFT (Fast Fourier Transform) - a method that needs a number of sample points which is a power of two.
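To see which lengths of time count as convenient, this small Python sketch lists the power of two sample counts on either side of a requested duration. The function name `fft_lengths` is just for illustration, and how FTS actually chooses between the two candidates is not covered here.

```python
def fft_lengths(seconds, rate=44100):
    """Return the power-of-two sample counts just below and just above
    the requested duration, with the duration each one covers."""
    wanted = max(1, int(round(seconds * rate)))
    below = 1 << (wanted.bit_length() - 1)   # largest power of two <= wanted
    above = below if below == wanted else below * 2
    return [(n, n / rate) for n in (below, above)]

for n, secs in fft_lengths(1.0):
    print(n, round(secs, 3))
# prints:
# 32768 0.743
# 65536 1.486
```

So a request for one second at 44100 Hz falls between 0.743 and 1.486 seconds, both within a factor of two of the amount asked for.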

Tuner - you can use this to check the frequency found, and compare it with the expected frequency.

Note that if you have a soundcard that uses wavetable sound, the frequency may well be a few cents sharp or flat overall for a particular midi voice, while the relative accuracy may be much better - on my SB live! soundcard the relative frequency varies by +-0.2 cents for many voices in the 8Mb bank, while the absolute pitch varies by up to +- 3 cents depending on the voice chosen, e.g. the ocarina is about two and a half cents flat and the flute is about two and a half cents sharp.

Show partials as text file - Shows a list of all the partials found as frequencies, decibels, and cents values from the lowest freq. A decibel is a relative measure, defined in terms of the volume relative to a typical background sound level, so you could add or subtract a constant to all the volumes corresponding to playing the voice louder or softer (e.g. louder or softer on your speakers, or whatever). The values are scaled so that the maximum amplitude is shown as 100 decibels.
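The scaling described here can be sketched in a few lines of Python. This assumes the partials start out as linear amplitudes and that decibels are 20 times the log of the amplitude ratio; the function name `describe_partials` is made up for the example.

```python
import math

def describe_partials(partials):
    """partials: list of (hertz, linear_amplitude) pairs.

    Returns (hertz, decibels, cents) rows, with decibels shifted so the
    loudest partial reads 100 and cents measured from the lowest frequency.
    """
    lowest = min(hz for hz, _ in partials)
    loudest = max(amp for _, amp in partials)
    rows = []
    for hz, amp in sorted(partials):
        db = 100.0 + 20.0 * math.log10(amp / loudest)
        cents = 1200.0 * math.log2(hz / lowest)
        rows.append((hz, db, cents))
    return rows

rows = describe_partials([(440.0, 1.0), (880.0, 0.1), (1320.0, 0.5)])
```

Here the 880 Hz partial, at a tenth of the maximum amplitude, comes out 20 decibels below the loudest one and 1200 cents (one octave) above the lowest.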

FTS can read this list of partials back in again - it does it by looking for any lines beginning with a numeral (0 to 9, + or -). So you can edit it and add new partials to it.

E.g. to add a partial of 80 decibels at 440 hz., add the line:

440 80

You don't need to give cents values - FTS will only look at the hertz values when it reads the file.
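The rule for reading the list back in is simple enough to sketch in Python. This is a guess at the behaviour from the description above, not the FTS reader itself: it assumes the first field on a line is hertz and the second, if present, is decibels, and it ignores anything after that. The name `read_partials` is hypothetical.

```python
def read_partials(text):
    """Parse lines that begin with a digit, '+' or '-' as partials.

    Assumed layout: first field is hertz, optional second field is decibels;
    anything after that (such as a cents column) is ignored.
    """
    partials = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] not in "0123456789+-":
            continue
        fields = line.split()
        try:
            hz = float(fields[0])
            db = float(fields[1]) if len(fields) > 1 else None
        except ValueError:
            continue  # starts like a number but is not one; skip it
        partials.append((hz, db))
    return partials

sample = "Partials found\n220 100 0.0\n440 80\n; a comment\n"
print(read_partials(sample))   # → [(220.0, 100.0), (440.0, 80.0)]
```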

This button saves the partials as partials.tmp.tbr. This is just a temporary file, and each time you use the button, the old version gets over-written. To keep the partials for further reference, save the file again under some other name. If you use the extension .tbr you will be able to find it using File | Open | Files of type | Timbre partials (*.tbr).

Or, you can save them from the main window using File | Save As | Files of type | Timbre partials (*.tbr) .

Add harmonic series analysis - looks for harmonic timbres in the list of partials. This will be particularly effective for timbres with many partials (such as strings). Uses the higher partials to adjust the frequency of the fundamental. This analysis is for information only; it is ignored by FTS when the file is read in again.
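One simple way to use the higher partials to adjust the fundamental is sketched below in Python. This is only an illustration of the idea, not the method FTS uses: the function `refine_fundamental`, the 50 cent tolerance, and the choice to divide the sum of the matching partials by the sum of their harmonic numbers are all assumptions.

```python
import math

def refine_fundamental(partials_hz, rough_f0, tolerance_cents=50.0):
    """Estimate the fundamental from partials that sit near multiples of rough_f0.

    A partial near n * rough_f0 is taken to be harmonic number n. The estimate
    is sum(hz) / sum(n), so higher harmonics count for more.
    """
    total = weight = 0.0
    for hz in partials_hz:
        n = round(hz / rough_f0)
        if n < 1:
            continue
        cents_off = 1200.0 * math.log2(hz / (n * rough_f0))
        if abs(cents_off) <= tolerance_cents:
            total += hz
            weight += n
    return total / weight if weight else rough_f0

f0 = refine_fundamental([220.1, 440.2, 660.3, 1000.0], rough_f0=220.0)
```

In this example the 1000 Hz partial is more than 50 cents from any multiple of 220 Hz, so it is left out, and the remaining three partials give a fundamental of about 220.1 Hz.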

More - Here you can set limits on the amount of memory that can be used for the recording or for FFT. You can choose to zero pad instead if there isn't enough to go up to the next power of two. This is memory rather than disk space - FTS does all the work in memory - which is probably sufficient for analysing short to medium length recordings to find the FFT.

Find Freq. - Use this to do a new analysis of the same recording, e.g. if you change How much to analyse (secs).

The Frequency spectrum window

Show Freq - This shows the analysis as a frequency graph. The dots show the frequencies found.

The standard setting is to show decibels vertically and the log of the frequency horizontally - this corresponds to the way we hear sound - in terms of decibels for volumes, and in terms of intervals for pitches - e.g. all octaves sound the same size to us.

One may be used to seeing a frequency spectrum with (linear) amplitudes vertically, which is another commonly used format. Showing the amplitudes instead of decibels gives much narrower peaks, almost like vertical lines. To show that kind of frequency plot, unselect FFT | Options | Decibels.

One may also be used to seeing frequencies instead of the logs of the frequencies horizontally - to show it that way, unselect FFT | Options | log plot frequency axis. With this type of plot, a harmonic series will be shown as equally spaced peaks.

To look in at a section in more detail, use click and drag to highlight it and a detail view will pop up. Sometimes FTS will find the peaks fairly easily, but sometimes there may be a fair degree of choice needed about which peaks to count as separate frequencies and which to ignore. In that case one will probably want to be able to select them oneself.

To remove partials, use Ctrl + Rt click on the dots. To add new ones in, use Ctrl + left click above the peaks. The way it works is that if you Ctrl + Rt click anywhere, you remove the nearest dot to the click point, and if you Ctrl + left click, you add a dot to the nearest point on the graph to the click. So, Ctrl + click below the graph will add a point to a valley instead of a peak.

You can use the zoom in and out and left and right menu options to change the detail view. Also, Shift + right click on a point in the detail view to expand it so that point goes to the right margin, and shift + left click to expand the view so that the click point goes to the left margin.

You can tweak the various parameters in FFT | Options | Frequency detection (see the FFT Options menu).

Analyse Recording

Tasks | Analyse Recording or Midi Voice

Choose Standard Settings first. Then, Recording device and vol . See [#analyse_midi_voice Analyse Midi voice] for more about them:

Use Open Audio file. You can open any file in .WAV format directly. If you try one in another audio format such as .mp3, then you will be asked if you want to play it by file association, and record the playback.

To analyse from the beginning of the clip, set the How much to analyse (secs) to the amount you want to analyse, click Find Frequencies , then Show Frequencies to see the frequency spectrum. The blue dots show the peaks you have found automatically.

To edit the partials, use Ctrl + Rt click on any dots you want to remove, and Ctrl + left click above any peaks you want to add in. For more about this window, see the end of the previous section - the [#freq_spectrum_window Frequency spectrum window].

Now you can try it out using Make Waveform from partials and Play synthesized wave , to see if it sounds like the original. Show partials as text file , and Make partials into custom voice are as in the previous section [#analyse_midi_voice Analyse Midi voice] , as is the method of recording automatically when you play a voice.

Show Recording shows your waveform.

To find the frequency analysis for a detail, select a region of the recording using click and drag. Then select Analyse detail , and then Find Frequencies etc. as before.

Note that a recording you open using Open Audio file, or record using the Start Rec. button from this window, is held directly in RAM. To record to a .WAV file in FTS, use Bs | Record To File Options | Record to file, which records directly to a file. Or save your RAM recording to file from Recording | Options | Save Recording As. The RAM recording is of course limited by the amount of memory you have, while recording directly to file is limited only by the free disk space (usually much larger). You set a maximum amount of memory for the RAM recording from Recording Options | Times | Max duration for sound buffer.

Partials and timbres

If you pluck or bow a string, and it isn't too tightly stretched, then the note itself sounds, and also simultaneously with the basic note you will get double the frequency, three times it and so on. These are the partials, or component frequencies that make up the note. They all add together to contribute to the sound.

The way the amplitudes of these partials vary from one instrument to another is one of the things that gives each instrument its own unique timbre.

For instance, the harpsichord has the third harmonic stronger than any of the others, while most instruments have the second harmonic as the strongest. A few have the first harmonic strongest, such as the recorder, at least in its lowest register.

Strings have many high harmonics. A flute has very few and an ocarina pretty closely approximates a pure sine wave. Voice and oboe are in between the two.

The clarinet has strong odd numbered harmonics and weak even numbered ones up to about the seventh harmonic:
1 (2) 3 (4) 5 (6) 7 ...

If the string is very tightly stretched, as in a piano, then the higher partials are sharp, increasing in pitch by so many cents per octave.

For other timbres, sometimes some of the partials may be a little flat compared with the fundamental.

These aren't just theoretical things - if you listen intently to a note played on a 'cello say, you can learn to pick out the component partials. I have a midi clip that one can use to do this as an exercise here - see the [Scales_and_Fractal_Tunes.htm#Newbie_notes Newbie notes]

One may also be interested to read David Canright's article on the harmonic series:

When the component pitches of the timbre form a harmonic series in this way - the original pitch, then double it, three times, and so on - then the timbre is said to be harmonic.

It turns out that many instruments used in music have harmonic timbres, or near harmonic ones. However, bells, gongs and such like instruments have component frequencies that are nowhere near any kind of a multiple of the basic note, and these are called inharmonic timbres.

Makers of church bells actually try to get all the partials as close as possible to a harmonic series, but find they have to have a minor third instead of a major third, which contributes to the characteristic church bell sound. Church bells also often have doublets - two close together partials that beat with each other. One often hears those beats in a bell sound, or probably several beating partials at different speeds simultaneously. This is almost part of the characteristic bell sound too, though apparently the bell makers do their best to remove doublets. For more about this, explore The Sound of Bells site.

A composer may sometimes choose to make a special timbre designed to sound good in exotic scales, with the partials adjusted to be in tune with them, e.g. adjust the third partial so that it is in tune with thirteen equal or eleven equal or whatever scale is of particular interest.
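As a worked example of adjusting a partial to a scale, this Python sketch moves a partial to the nearest step of an equal division of the octave. It assumes that "thirteen equal" means thirteen equal divisions of the octave, and the function name `retune_partial` is made up for the illustration.

```python
import math

def retune_partial(ratio, divisions):
    """Move a partial (given as a frequency ratio to the fundamental) to the
    nearest step of an equal division of the octave. Returns (steps, ratio)."""
    step_cents = 1200.0 / divisions
    cents = 1200.0 * math.log2(ratio)
    steps = round(cents / step_cents)
    return steps, 2.0 ** (steps / divisions)

steps, new_ratio = retune_partial(3.0, 13)
```

The third partial lies about 1902 cents above the fundamental. The nearest step of thirteen equal is step 21, at about 1938 cents, so the partial would need to be raised by roughly 36 cents to be in tune with that scale.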

Another approach is to choose a bell / gong with inharmonic timbres, and then use that as the basis for making a new scale that will sound good when playing tunes with that particular instrument - design the scale for the timbre.

Music with scales based on inharmonic timbres, or vice versa:

Jacky Ligon - Galunlati .

Bill Sethares - Xentonality cd

Frequency spectrum Options

Tasks | Analyse Midi voice | Show Frequencies | Options...

Min Freq. to find and to show - when doing an FFT, sometimes one will get unwanted partials that are too low. Minimum is preset to 15 hertz which is so low that it is most likely felt rather than heard.

Max freq. to show - the maximum is preset to 8800 hertz, which is d'''' flat, and higher than most musical instruments ever go in normal circumstances. This is the highest frequency shown in the window, but not the highest one found - there is no limit there.

Show as bar charts .

Detection of partials in the spectrum

Tasks | Analyse Midi voice | Show Frequencies | Options | Detection of partials...

Windowing - If your wave happens to repeat exactly at the end of the sample, then there is no need for windowing. However, one needs to be pretty lucky for that to happen. Most often, the sample starts and ends in the middle of a wave.

Now, one could just truncate the sample to the start of the wave. However, the Fast Fourier Transform (the method used to find the spectrum) is much faster if the number of samples used is a power of two. So, there are certain lengths of time that are best to use for FFT, such as 1.486 secs at 44100 Hz, for instance. So one usually uses those.

If the sample breaks off in the middle of a wave, then this cut off is detected as a frequency itself. It's treated like a section of a square wave. If the entire sample is only 1.486 seconds, your wave cut off is indistinguishable from a low frequency square wave with a period of 1.486 seconds. The FFT will find not just that square wave, but its overtones too, and they will be added into the results. Probably these partials will be quite low amplitude, but it's enough to make the curve a little less clear.

So, the solution is to use windowing. The idea is that you gradually fade the sound away to zero at either end. Then, the simplest method is triangular - you keep the centre point of the sample at its maximum volume, and then just fade away to either side in a linear fashion. The other windowing methods in the menu are various curves that theorists have produced that work particularly well with FFT. Try them all and see which works best.
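Here is a minimal Python sketch of the fade described above. The triangular window follows the description directly. The Hann window is a widely used smooth alternative, included here as an assumption; it may or may not be one of the entries in the FTS menu.

```python
import math

def triangular_window(n):
    """Linear fade from 0 at both ends to 1 at the centre (needs n >= 3)."""
    mid = (n - 1) / 2.0
    return [1.0 - abs(i - mid) / mid for i in range(n)]

def hann_window(n):
    """A smooth raised-cosine fade, one commonly used curve (needs n >= 3)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def apply_window(samples, window):
    """Multiply each sample by the matching window value."""
    return [s * w for s, w in zip(samples, window)]

print(triangular_window(5))   # → [0.0, 0.5, 1.0, 0.5, 0.0]
```

After windowing, the sample starts and ends at zero, so there is no sudden cut off for the FFT to pick up.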

Peak interpolation method - to see this in action, select FFT Options | Show as bar chart . Then click + drag on the FFT window to show a detail view, and zoom in to one of the peaks. Try choosing different peak interpolation methods, and click Re-find peaks, and see how the dot changes position.

None = places the dot centrally.

Most of the other methods - look at the two points to either side of the maximum value, and use those to estimate whether the peak is to the left or right of the central point. For instance, if you have two points almost exactly at the same height, and then one much lower, then clearly the peak is almost exactly half way between those two points. If you have one point that is much higher than the ones to either side, then the peak is at that point (i.e. in the middle of the bar in the chart), and so on. So the basic principle is pretty straightforward. The drop list just shows some of the best methods found by theorists - ones that work well at estimating the correct value for the peak.

The Quinn's estimators can't be used if you have Real FFT (Oora's implementation) selected. You can still choose them, but they will use Jain's method for as long as you continue to use real FFT.
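To make the three point principle concrete, here is a Python sketch using quadratic (parabolic) interpolation, which fits a parabola through the highest bin and its two neighbours. This is a standard textbook method chosen for the illustration. It is not Jain's or Quinn's method, and the function names are made up.

```python
def interpolate_peak(mags, k):
    """Refine the position of a local maximum at bin k using the bins on
    either side. Returns a fractional bin index between k - 0.5 and k + 0.5."""
    left, mid, right = mags[k - 1], mags[k], mags[k + 1]
    denom = 2.0 * (2.0 * mid - left - right)
    if denom <= 0.0:
        return float(k)          # flat or not a true maximum; leave it centred
    return k + (right - left) / denom

def bin_to_hz(bin_index, rate, fft_size):
    """Convert a (possibly fractional) bin index to a frequency in hertz."""
    return bin_index * rate / fft_size

print(interpolate_peak([0.0, 5.0, 5.0], 1))   # → 1.5
```

With two equal points and one much lower, the estimate lands half way between the two equal points, as described above.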

Also, the Quinn's estimators need some information that FTS discards after the calculation (the phase information). So, you need to use Find Freq instead of Re-find peaks with these.

I find Jain's method is often the best for musical notes.

Web page with more info about these algorithms: How to interpolate the peak location of a DFT or FFT if the frequency of interest is between bins

Mean value - this is a new experimental idea. However, it is now the standard setting as it seems that it gives better pitch detection of partials than even Jain's method, which is three point.

The idea is that you think of the FFT values near a peak (e.g. down to 10 percent of the height of the peak) as a number of votes for that particular frequency. So to find the total, you multiply each frequency bin by the number of votes for that frequency. Add the results for all the bins for the peak. Then at the end, divide by the total number of votes to get the mean value.

You also need to do a bit extra for the left and right ends of the region. One needs to cope with the fact that if you join the points (e.g. using straight lines as 1st approx) then the point at which the curve will cross the 10% value will usually be between two data points.
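The mean value idea can be sketched in Python as follows. This is one possible reading, not the FTS implementation: it joins the points with straight lines, finds where those lines cross the 10 percent level, and takes the centroid of the curve between the two crossings. The function name `mean_value_peak` is hypothetical, and the case of a shallow valley between two close partials is not handled.

```python
def mean_value_peak(mags, k, fraction=0.10):
    """Centroid of the part of a peak that lies above fraction * mags[k].

    The curve is treated as straight lines between bins, so the region ends
    where those lines cross the threshold (usually between two bins).
    """
    if mags[k] <= 0.0:
        return float(k)
    level = fraction * mags[k]

    def crossing(inner, outer):
        # Point on the line inner->outer where the value equals `level`.
        t = (mags[inner] - level) / (mags[inner] - mags[outer])
        return inner + t * (outer - inner)

    lo = k
    while lo > 0 and mags[lo - 1] > level:
        lo -= 1
    hi = k
    while hi < len(mags) - 1 and mags[hi + 1] > level:
        hi += 1

    # Points of the region: threshold crossings (if any) plus the bins between.
    pts = [(float(i), mags[i]) for i in range(lo, hi + 1)]
    if lo > 0:
        pts.insert(0, (crossing(lo, lo - 1), level))
    if hi < len(mags) - 1:
        pts.append((crossing(hi, hi + 1), level))

    area = moment = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        h = x1 - x0
        area += h * (y0 + y1) / 2.0
        moment += h * (x0 * (2 * y0 + y1) + x1 * (y0 + 2 * y1)) / 6.0
    return moment / area if area else float(k)
```

For a symmetrical peak such as `[0, 1, 4, 10, 4, 1, 0]` this returns the centre bin. If the right hand shoulder is raised, the result moves a little to the right of the highest bin, which is the behaviour the votes picture suggests.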

If you look at a peak in detail you often see a number of smaller pinnacles on the top, and often the highest of these is off-centre. Sometimes there may be two of them to either side of the centre. The other methods will just find the highest of these, whatever its position. With the mean value method, if you go down far enough to either side of the peak, you often end up with a value between the two pinnacles, in the position that one would choose by eye for the main peak. We are talking about very small details here - sometimes tiny pinnacles on maybe a quite broad plateau comparatively speaking. So they are not ones that would be perceived as doublet partials, just a broadening of the pitch of the partial.

Mouse pos for freq. and Mouse pos.for freq. and amp. These let you re-position the peak by eye - sometimes useful, especially if the main peak in the spectrum has tiny summit pinnacles with the highest of them off-centre (perhaps the highest two of them to either side of the centre).

What percentage of max value to go down to at either side of peak - standard setting is 10 %. Used with the Mean value option. The lower this is, the more of the peak you take account of; when high, you will find the highest pinnacle in the peak even if it is off-centre. When you select Show peak detection curves for max. amp. and highest freq. partials, you can see how far down the peak the mean value search goes by a horizontal line below the dot for the peak. The region of the peak that lies above that level is the part taken account of to find the Mean value. Sometimes when the peaks for partials are close together, there may be no valley between them deep enough to reach the mark. When this happens, the mean value search will only go down as far as the deepest valley between the two partials - and it will go down the same distance to either side of the peak to keep the symmetry.

Ignore small peaks - unselect this if the signal has clear partials with little in the way of noise, short lived inharmonics, or multiple close together partials. Select it if the timbre has many very small peaks to ignore.

Test with ten random partials - This tests the FFT by making a waveform using randomly generated partials. So the frequencies used are known, and you can compare them with the actual values found. The waveform is made for the time shown in Analyse Midi voice | How much to analyse (secs). So you can try varying the length of the clip, and see how increasing the size of the sample improves the accuracy. To see what values were used to make it, highlight part of the FFT, and look at it in the detail view. This shows hertz, and the values in brackets are the amplitudes in decibels. It also shows the differences in cents between the pitches used to make the clip and the pitches found by FFT.

The original waveform is made especially for the FFT and discarded immediately, but you can listen to the resynthesised version of it using Analyse Midi voice | Make waveform from partials, and Play synthesized voice. Random partials can make a sound like a gong because of the lack of any harmonic series - see Chris Bailey's modelling of the sound of a bicycle spoke using random partials.

Test with sine wave - this tests the FFT using the frequency from the Pitch window to make the waveform.

Find FFT - use this to find the FFT again from scratch.

Re-find peaks - use to search to find the peaks in the FFT curve - e.g. after changing some of the other parameters.

If you look at the FFT for a real timbre in detail you'll see dozens of peaks of varying size. So the question is, which ones are significant, and which are not? If you were to add up all the frequencies found - not just the peaks, but the valleys too, everything in the curve - then you would get the original wave in all its detail, except that the phase information has been lost in the process of doing the FFT. If one kept that too, one would have the waveform exactly, because that is how Fourier transforms work.

However, that's quite impractical - we want to find a small set of partials, maybe a dozen or so; certainly not thousands of them. So the question is, which ones to keep. Obviously the largest one found has to be kept, but which of the smaller ones? Some of the smaller ones may also be FFT ripples - a kind of ripple effect caused by the windowing (they reconstruct the fade out at the start and end of the sample). We'd like to leave those out, and they often appear as a sequence of smaller peaks to either side of a main peak.

Well, one idea is to leave out any that are very close to ones already found and significantly smaller. Then, one could also leave out any very quiet ones that are just above the background sound level. Also, one could leave out any that are sort of small enough so that they get lost in the jumble of peaks and valleys and don't rise much above them.

So, before tweaking the rest of the parameters, let's show what is happening on the graph.

Show peak detection curves for max. amp. and highest freq. partials - This shows them for two of the peaks found - the maximum amplitude one and the highest frequency one. To see the curves for any of the other peaks, highlight it, and look at it in the detail view.

Peak frequency bandwidth - Changes the two curves you see for each peak. So for example, the standard setting of 40 Hz for 20 dB means that the two curves have to go down to 20 dB below the peak at a frequency of +- 40 Hz to either side of the peak.

Maths details: one of the curves is bell shaped - it's a Gaussian - and the other is more parabola like - it's a 1/x type law, or more generally an inverse power law with the power chosen so that it will reach the desired dB level at the desired frequency. The inverse power curve is symmetrical, though it mightn't look it if you use the standard log plot frequency axis.
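The two curve shapes can be sketched in Python like this. The Gaussian follows from the description, since a Gaussian in amplitude is a parabola when plotted in decibels. The exact form of the inverse power law is not given in the text, so the amplitude law 1 / (1 + |offset|) ** p used here, with the offset in hertz, is purely an assumption for illustration, as are the function names.

```python
import math

def gaussian_db(offset_hz, width_hz=40.0, drop_db=20.0):
    """Level in dB (relative to the peak) of a Gaussian amplitude curve that
    is drop_db down at +/- width_hz from the peak."""
    # A Gaussian in amplitude is a parabola in decibels.
    return -drop_db * (offset_hz / width_hz) ** 2

def inverse_power_db(offset_hz, width_hz=40.0, drop_db=20.0):
    """Same target, for an assumed amplitude law 1 / (1 + |offset|) ** p."""
    p = (drop_db / 20.0) * math.log(10.0) / math.log(1.0 + width_hz)
    amplitude = (1.0 + abs(offset_hz)) ** (-p)
    return 20.0 * math.log10(amplitude)

for offset in (0.0, 10.0, 40.0):
    print(offset, round(gaussian_db(offset), 2), round(inverse_power_db(offset), 2))
```

Both curves start at 0 dB at the peak and pass through -20 dB at 40 Hz away. In between, the inverse power curve drops away much faster close to the peak, while the Gaussian stays flatter near the top.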

Use Gaussian of the log of the freq - uses it instead of the Gaussian of the freq for peak detection. Gives a slightly flatter curve around the peak and generally follows the typical shape of the FFT partial peak more closely. Also uses the inverse power of the log of the differences for the inverse power curve.


Tasks | Analyse Midi voice | Oscill...

This shows the wave as it is recorded. Click Start to start the recording. Stop to stop it. Play this will play the oscilloscope waveform in a loop - i.e. the actual wave showing in the window. Play all will play the complete recording.

Oscilloscope Options

Tasks | Analyse Midi voice | Oscill. | Oscill. Options...

Ctrl + Play this will play the oscilloscope waveform as pitch glissandi so that you can hear what the shape is (esp. for blind users). It plays the shape of the waveform that you hear if you click Play this. To set the length of time for the pitch glissandi: Oscilloscope | Options | Length of clip for waveform pitch glides (secs).

Recorded Sound

This window shows your recording in temporary memory. This is the recording that gets used for frequency analysis, or if you want to find notes in a recording.

It is limited by the amount of RAM you have so you can only use this to record short clips. To make longer clips use the main window Rec. button, which records to the file name set using Bs | Record to File .

Click Start Rec. to start the recording, and Stop Rec. to stop it. If you get an error message when you start the recording, check to see if you have any other audio programs open.

Note that this recording is only temporary at this point - it will get lost when you exit from FTS. You save your recording from Options | Save Recording As .

To zoom in on a detail of the recording, click and drag to highlight it. A small detail window will pop up with menu options to zoom in / out or move the highlight left or right. You can also zoom in further within the detail view by using Shift + left click and Shift + right click to either side of the detail of interest.

You can use Ctrl + click to insert a marker into the recording or Ctrl + right click to remove the nearest marker. These markers get used with the Find Seed option.

Recording Options

Here you can choose the recording type and the devices for recording with. This includes the option to record in stereo - however later if you want to analyse the waveform to find the spectrum, you will be asked if it is okay to reduce it to mono as FTS finds the spectrum for a mono waveform.

Fill in waveform - fills in the interior of the waveform - otherwise it gets shown in outline. You can show the waveform with various fill styles - Normal is what one will usually want and the others are just for artistic effect - try them out to see what they do. They usually have a style parameter which varies the amount of the effect. In most cases you need to zoom in on a detail of the recording to see the effect.

Trim region to highlight - to use this, show the Recording (temp), highlight the region you want to trim to, and use Trim region to highlight.

Find region to Trim - this is used to trim silences from the start and end of the recording.

It works by first finding the quietest part of the waveform. This is used as the background sound level, and anything as quiet as that or quieter gets trimmed, or anything just a little louder than it.

The preset setting here is:

Background Sound for Trim 200 % of quietest 0.2 seconds .

This will work fine with these values in most cases. However, to explain it: the 0.2 seconds obviously needs to be shorter than the longest silence that you have at the start or end (or possibly middle) of your sound clip. The 200 percent here means, trim anything that is at most double the background sound level, which is found as the average sound level in the quietest 0.2 seconds.

Then set the trim margin - at the preset values you have a trim margin of 0.2 seconds which means that you get 0.2 seconds of silence left at either end of the recording after the trim (where possible of course).
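The trim search can be sketched in Python roughly as follows. It is a simplified guess at the method, not the FTS code: it measures the level as the mean absolute sample value over consecutive 0.2 second windows, and works in whole windows rather than sample by sample. The function name `find_trim_region` and its parameter names are made up for the example.

```python
def find_trim_region(samples, rate, window_secs=0.2, percent=200.0, margin_secs=0.2):
    """Return (start, end) sample indices of the part to keep.

    The background level is the mean absolute value of the quietest window;
    windows no louder than percent/100 times that level count as silence.
    """
    n = max(1, int(round(window_secs * rate)))
    levels = [
        sum(abs(s) for s in samples[i:i + n]) / n
        for i in range(0, len(samples) - n + 1, n)
    ]
    if not levels:
        return 0, len(samples)
    threshold = min(levels) * percent / 100.0
    loud = [w for w, level in enumerate(levels) if level > threshold]
    if not loud:
        return 0, len(samples)
    margin = int(round(margin_secs * rate))
    start = max(0, loud[0] * n - margin)
    end = min(len(samples), (loud[-1] + 1) * n + margin)
    return start, end

# One second of faint noise, one second of sound, one second of faint noise,
# at a toy sample rate of 100 samples per second.
clip = [0.01] * 100 + [1.0] * 100 + [0.01] * 100
print(find_trim_region(clip, rate=100))   # → (80, 220)
```

The region kept runs from 0.2 seconds before the sound starts to 0.2 seconds after it ends, matching the preset trim margin.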

Now click Find Region to Trim. The trim region will get highlighted in the recording. Review it to check that it looks okay, then click the Trim Recording to current highlight button.

Recalibrate 0 pos if nec. Some soundcards have a slight bias in the zero position of the waveform - this means you get more of the wave to one side of it than to the other. This is often easy to spot when there is no sound playing as you may find that background noise is shown always to one side of the zero line, always above or always below.

You can see the amount of the compensation below this check box. The correction is done on the basis that a physical waveform would be equally balanced with as much of the wave to either side of the zero line. This affects how the waveform is displayed in the oscilloscope etc, the FFT analysis, and the zero crossings wave count.
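A simple way to do this kind of correction is to subtract the average value of the samples, as in the Python sketch below. Whether FTS uses the mean or some other measure of balance is not stated, so treat this as an assumption; the function name `remove_dc_offset` is made up.

```python
def remove_dc_offset(samples):
    """Shift the waveform so its mean value sits on the zero line.

    Returns the offset that was found and the corrected samples.
    """
    offset = sum(samples) / len(samples)
    return offset, [s - offset for s in samples]

offset, fixed = remove_dc_offset([1.5, 0.5, 1.5, 0.5])
print(offset, fixed)   # → 1.0 [0.5, -0.5, 0.5, -0.5]
```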

Counting wave crossings

Tasks | Analyse Midi voice | Oscill..

This is about how the method works. For details of how to use it go on to the Find Seed section.

The Oscilloscope finds the frequency from a single wave as the standard setting. So for instance, if a complete wave lasts for 1/440 of a second, then one can calculate that the frequency is 440 Hz.

So, what level of accuracy can one expect for a measurement of a single wave at 440 Hz?

Suppose first that you measure the duration to the nearest sample point. The wave lasts for about 100 samples, and the start and end of the wave can each be out by up to half a sample, so one would expect to get the frequency accurate to within about ±1 percent. One might measure 444.2 Hz instead of 440 Hz - out by 16 cents. That's not so bad considering it is a single wave.

However, one can do a lot better than this. Though we have quite limited time resolution, we have very good amplitude resolution, especially with 16 bit sound.

So the idea is to look at points to either side of the zero crossing where the wave switches over from the positive to the negative side. If you zoom in on the crossing point, then many waves are pretty close to linear when they cross the zero point. So we just join the sample points before and after the crossing with a sloping line, and look to see where it crosses the zero point. In this way we can interpolate to get the time for the zero crossing to far better precision than the sample rate might suggest was possible.

If the wave crosses the zero at a slope and is reasonably linear there, then typically this gets the pitch accurate to about a cent or two for a single wave, instead of 16 cents.
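The interpolation step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the FTS implementation: the test signal is a synthetic 16 bit style sine wave at 440 Hz, the sample rate is taken to be 44100 Hz, and the sketch looks at crossings from the negative to the positive side (either direction works the same way). The names upward_crossings and cents are made up for the example.

```python
import math

RATE = 44100  # samples per second (assumed for this example)


def upward_crossings(samples):
    """Return (coarse, fine) positions of each negative-to-positive crossing.

    coarse is the index of the first sample at or above zero;
    fine is where the straight line joining the two samples hits zero.
    """
    found = []
    for i in range(len(samples) - 1):
        a, b = samples[i], samples[i + 1]
        if a < 0 <= b:
            frac = -a / (b - a)  # fraction of a sample step, from 0 to 1
            found.append((i + 1, i + frac))
    return found


def cents(ratio):
    return 1200.0 * math.log2(ratio)


true_freq = 440.0
samples = [
    round(10000 * math.sin(2 * math.pi * true_freq * n / RATE + 1.0))
    for n in range(300)
]

# Measure one complete wave between the first two crossings.
(coarse0, fine0), (coarse1, fine1) = upward_crossings(samples)[:2]
coarse_error = cents(RATE / (coarse1 - coarse0) / true_freq)
fine_error = cents(RATE / (fine1 - fine0) / true_freq)
print("nearest sample: %.2f cents, interpolated: %.3f cents" % (coarse_error, fine_error))
```

For a clean sine wave like this one the interpolated figure comes out far closer to the true pitch than the whole-sample figure. Real recordings with noise or less linear crossings will not do as well.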

This is much better than one can do with a spectrum analysis with just 100 sample points as we have here. Indeed, this is more like the pitch accuracy one can achieve with about a second or so of spectrum analysis if one did it without peak interpolation. It is equivalent to spectrum analysis for maybe a tenth of a second with peak interpolation.

I tested the accuracy of the measurements in FTS for short notes using the FM7 to generate test notes. Other measurements done with long clips, which are therefore very accurate, show that the FM7 is extremely accurate pitch-wise.

Here is five equal played with one second notes on the FM7 init edit voice - a pure sine wave - measured in FTS using wave counting.

240.008 cents 480.005 cents 720.003 cents 960.002 cents 1200 cents

As you see the pitch measurement is accurate to less than a hundredth of a cent.

Here is a just intonation twelve tone scale, again played on the FM7 as a sine wave, this time with notes of only 0.2 seconds, and measured in FTS using wave counting.

203.916 cents 386.322 cents 498.039 cents 701.957 cents 884.363 cents 1088.27 cents 1200 cents

The numbers for that one should be

203.91 cents 386.314 cents 498.045 cents 701.955 cents 884.359 cents 1088.27 cents 1200 cents

Still accurate to better than a tenth of a cent (the largest difference between the two lists above is 0.008 cents), which is way beyond what one would normally expect for such short notes. So as you see, wave counting can be very accurate indeed for suitable timbres!
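If you want to check the comparison yourself, here is a short Python sketch that works out the exact cents values and the difference for each note. The ratios 9/8, 5/4, 4/3, 3/2, 5/3, 15/8 and 2/1 are my assumption about which just intonation intervals were played, inferred from the cents values listed above.

```python
import math

# Assumed ratios for the just intonation notes listed above.
ratios = [(9, 8), (5, 4), (4, 3), (3, 2), (5, 3), (15, 8), (2, 1)]

# Values measured in FTS by wave counting, copied from the list above.
measured = [203.916, 386.322, 498.039, 701.957, 884.363, 1088.27, 1200.0]

exact = [1200.0 * math.log2(n / d) for n, d in ratios]
errors = [m - e for m, e in zip(measured, exact)]
max_error = max(abs(e) for e in errors)

for (n, d), m, e in zip(ratios, measured, errors):
    print("%d/%d  measured %.3f  error %+.4f cents" % (n, d, m, e))
print("largest error: %.4f cents" % max_error)
```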

N.B. You sometimes see it said that there is no way to improve on FFT pitch detection because of the classical uncertainty principle of frequency analysis (which is related to the quantum mechanics uncertainty principle). I have sometimes been told that these measurements seem too accurate to be possible given the number of samples measured. I don't know enough about these arguments to be able to give the answer to that.

There is no doubt, however, that the wave count method works. So I leave it to the experts in the field to explore how this method manages to get around what seem to be limitations of pitch detection. Maybe we are making some assumption here, by assuming that the waveforms we are measuring are steady in pitch when they aren't played for long enough to be able to measure the pitch. If so, it is a reasonable assumption if we know that the pitches are played by a musical instrument.

The method accurately predicts the pitch of a longer note when you give it e.g. just the first fifth of a second. It predicts the pitch to within a tenth of a cent at 440 Hz, based on what may seem to be too short a section to be able to do what it does. So it must be getting something right somehow or other. I'm interested to know if anyone can shed more light on this subject.

Incidentally, another approach is to do FFT of a harmonic timbre and take an average of all the frequencies in the spectrum to estimate the base frequency, and this also can improve the accuracy considerably over FFT even with peak interpolation, for suitable timbres. You can experiment with this too in FTS from Tasks | Analyse Midi Voice | Add harmonic series analysis , with a strongly harmonic timbre such as strings.
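One way to read "take an average of all the frequencies" is to divide each partial by its harmonic number and average the results. The Python sketch below does that. It is an assumption about the general idea, not the FTS code: the function name fundamental_from_partials, the rough starting estimate and the equal weighting of all partials are made up for the example, and the peak frequencies are invented test figures.

```python
def fundamental_from_partials(peaks_hz, rough_f0):
    """Estimate the base frequency of a harmonic timbre from its partials.

    Each peak is assigned the nearest harmonic number of rough_f0, then
    divided by it, and the resulting estimates are averaged.
    """
    estimates = []
    for f in peaks_hz:
        n = round(f / rough_f0)
        if n >= 1:
            estimates.append(f / n)
    if not estimates:
        raise ValueError("no partials at or above the rough base frequency")
    return sum(estimates) / len(estimates)


# Invented FFT peak frequencies for a note near 220 Hz.
peaks = [220.4, 439.8, 660.3, 879.5]
print(fundamental_from_partials(peaks, rough_f0=220.0))
```

The errors in the individual peaks tend to average out, which is why this can beat a single FFT peak. It only helps if the partials really are harmonic.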

Generally, the wave counting method gives better pitch accuracy than FFT if one can use it.

However, it has limitations as well - it isn't a method for distinguishing partials in a sound. It only detects the basic frequency of the note. This means that if you have several parts playing at once, it will only detect one of them - usually the loudest, or the one with the most prominent or clearest waveform.

If you have several instruments sounding - if they are very pure in sound, you may be able to remove the unwanted ones using a bandpass in a sound editor such as GoldWave or CoolEdit etc.

You should remove any noise from the recording if you can.

It is particularly a good idea to remove any low frequencies as these can cause the waves to undulate up and down, sometimes even so far that some waves miss the zero line altogether. FTS won't be able to analyse such recordings very well.

It also works best if the wave is reasonably regular in shape - sine waves, saw tooth, or any simple repeating shape is best of all, and especially if it crosses the zero point just twice per wave cycle. The more linear it is as it crosses, the better. If there are secondary peaks there are settings that can deal with that.

That's not so bad as it seems - as harmonic timbres give you fairly regular shaped waves, so musically useful sounds have somewhat of a tendency to be reasonably amenable. But it is best with the purer sounds - the ones most like sine waves.

The ocarina voice on a soundcard is pretty close to sine wave usually, so it's a good choice for the best possible pitch accuracy for the wave counting method.

It also works very well for many types of bird song as they are close to sine waves.

Recorder works pretty well. Flute also works, some voice timbres - experiment and see which works for you. If this method isn't suitable, one can fall back on FFT.

If the waveform is a suitable one, then the pitch accuracy depends on the volume of the sound, as the louder the note, the better the volume resolution in samples. A regular wave is often at its steepest at the crossover point, and in that case, typically the step in amplitude from one sample to the next is going to be about 10 in a reasonably loud note. This means that the time for the zero crossing point will be measured accurately to within a tenth of a time step or so, depending on the zero crossing step size.

Technical note

The result is an extremely accurate pitch measurement. In optimal conditions (a note of at least a fifth of a second, recorded at a sample rate of 44 kHz, and loud enough to have a minimum zero crossing step size of 10) the pitch will be measured with an error of at most ±0.04 cents. This is the cents value for the ratio 88002/88000, which is the maximum possible ratio between the interpolated time and the exact time, assuming maximum error at both ends of the wave train.

Note also that this calculation gives the maximum error that can occur, not an average. If your measurements of a waveform show that it satisfies the conditions given here, then all the pitch measurements you make of notes of steady pitch of duration a fifth of a second or more will be within 0.04 cents of the true value.

If the note is a bit quieter or the waves slope more gently at the crossing point, so that the minimum zero crossing step in amplitude is 4 (say) instead of the more typical 10, then the pitch will be accurate to within ±0.1 cents maximum error, still tiny for a note of a fifth of a second.
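The two figures in this technical note can be reproduced with a short calculation. The Python sketch below assumes, as in the text, that each end of the wave train can be out by at most one part in the zero crossing step size (a tenth of a sample for a step of 10). The function name max_pitch_error_cents is made up for the example.

```python
import math


def max_pitch_error_cents(duration_s, rate, crossing_step):
    """Worst-case pitch error in cents for a steady note.

    Assumes the zero crossing at each end of the wave train is located
    to within 1/crossing_step of a sample, with both errors adding up.
    """
    n = duration_s * rate               # length of the wave train in samples
    timing_error = 2.0 / crossing_step  # worst case over both ends, in samples
    return 1200.0 * math.log2((n + timing_error) / n)


# A fifth of a second at 44 kHz: 8800 samples, so the ratio is 8800.2/8800.
print("step 10: %.3f cents" % max_pitch_error_cents(0.2, 44000, 10))
print("step 4:  %.3f cents" % max_pitch_error_cents(0.2, 44000, 4))
```

Longer notes or steeper crossings reduce the figure further, since the same timing error is spread over more samples.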

Find Seed

Bs | Find Seed from recording .

First, here is a quick summary of how to play a phrase, record it, and make your recording into a seed for FTS.

First check you are set to record from the microphone from the Vol / Select menu option.

Click Start recording . Play the seed, then stop the recording. Finally use the Find & -> Seed button. You will now hear your seed played back, and you can compare it with the original.

To play back your original recording click Play .

You may want to quantise the notes of the seed to a particular arpeggio, which you can do from Options for -> Seed | quantise the pitches to arpeggio . To use this option, you need to set up the arpeggio you want in the main window. Then when you click the -> Seed button, you will get the arpeggio note closest in pitch to each of the notes you played.
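The idea of picking the closest arpeggio note can be sketched as follows in Python. This is only an illustration, not how FTS stores or searches arpeggios: it assumes pitches are given in cents above the 1/1, that the arpeggio is a sorted list of cents values within one octave, and that it repeats at the octave. The name quantise_to_arpeggio is made up for the example.

```python
def quantise_to_arpeggio(note_cents, arpeggio_cents, octave=1200.0):
    """Snap a pitch in cents to the nearest arpeggio note in any octave."""
    octaves, within = divmod(note_cents, octave)
    # Include the neighbouring notes from the octaves below and above,
    # so that pitches near the octave boundary snap the right way.
    candidates = (
        [arpeggio_cents[-1] - octave]
        + list(arpeggio_cents)
        + [arpeggio_cents[0] + octave]
    )
    nearest = min(candidates, key=lambda c: abs(c - within))
    return octaves * octave + nearest


# A just intonation major triad: 1/1, 5/4, 3/2.
arpeggio = [0.0, 386.314, 701.955]
for played in (400.0, 1150.0, 1900.0):
    print(played, "->", quantise_to_arpeggio(played, arpeggio))
```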

Re-find or find marked notes . Sometimes you may get too many notes found. In that case, you can remove the extra notes and refind them. Show the Recording (temp)... Now using Ctrl + left click to add markers, and Ctrl + right click to remove them, you can set the boundaries of the notes where you like (show the help of this window for details).

Then when ready, click this button to re-find the notes.

Search detail - this finds the notes just in the detail window that shows up if you click / drag on the recording to highlight a region of it.

Find as single note - finds a single note for the entire recording based on the average pitch.

Record to Midi on -> Seed - the idea of this is that it can be used to convert your recording to midi format, as a form of Wave to Midi conversion. Select Play on -> seed as well. Then choose a file name to save your midi file to from Bs | Record To File Options . Finally, use Find Seed as before. FTS will make a midi clip from your recording as it plays the seed.
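For readers curious about what a Wave to Midi conversion involves for each note, here is a minimal Python sketch of the standard mapping from a measured frequency to a midi note number plus a remainder in cents. It is not taken from FTS. It assumes the usual convention that midi note 69 is A at 440 Hz in twelve equal, and the name freq_to_midi is made up for the example. How FTS actually retunes the notes in the midi clip is not covered here.

```python
import math


def freq_to_midi(freq_hz, a4_hz=440.0):
    """Return (note, cents) for the nearest twelve equal midi note.

    cents is how far the measured pitch lies above (+) or below (-)
    that midi note, which could then be handled with a pitch bend.
    """
    exact = 69 + 12 * math.log2(freq_hz / a4_hz)
    note = round(exact)
    return note, (exact - note) * 100.0


print(freq_to_midi(440.0))     # A above middle C
print(freq_to_midi(261.6256))  # close to middle C
```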
