Help for Tune Smithy
This page has help forand .
Analyse Midi voice
This lets you play a note on your sound card or soft synth, and see its frequency spectrum when you stop play. You need a full duplex soundcard for this - one that can play and record at the same time. Luckily, many cards are full duplex nowadays, so there's a reasonable chance that you can do this - just try the method as described. If it doesn't work, the chances are that you need to get a new sound card for your computer - or perhaps an external sound card if your computer has a USB port.
The strong point of this task in FTS, and the main reason it is included here, is that it uses various techniques to refine the measurement of the frequency of the partial to get particularly accurate pitch measurements. However, it is not always so good at finding the peaks and discriminating them from noise, so it is best to look at the spectrum, and you may need to add / remove peaks by hand. Another reason for including this task is that it can be used to make custom voices from the partials.
Anyway this is what you do:
First click the button for this task.
Now show the volume controls for recording. What you see here depends on your soundcard. Select Midi, What you hear, or anything else that looks as though it will record the sounds played in FTS - which are played in Midi. Check that it isn't muted and that you have the volume set high enough, not at zero.
Select a Midi voice to analyse using the voices menu, then play it for a few seconds. The sound is automatically analysed when you click stop. You can then show the frequency spectrum.
The blue dots show the partials. Add or remove dots using Ctrl + click or Ctrl + right click on the spectrum.
You will find that with the standard settings, very small fluctuations get ignored, as these are often the result of noise or short term inharmonicities in the attack, etc. To configure the way these get ignored use:.
There are two ways of showing a spectrum - as linear amplitudes, which gives much sharper looking peaks, or as decibels, which corresponds to the way we hear sound, and makes the peaks look much broader. You can change between these from Freq analysis | Frequency spectrum | Options | What to show | decibels.
Frequency spectra are often used mainly for finding the partials of a single note. However, you can also use them to find the component pitches of a chord. If the notes are played using a harmonic timbre, then the harmonic series analysis may be helpful for this - see below.
To try out your new analysis, you can make a new waveform out of pure frequencies (sine waves), and then compare it with the original. You may like to click Volume envelope here, which makes the new waveform with the same volume envelope (attack and fade away at the end) as the original. This is particularly useful for comparing percussion and plucked instruments with the original.
To show the values in your text editor, use.
Now for a really fun part - you can make a custom voice from your analysis. For instance, suppose you analyse the oboe, and want to know what a glockenspiel would sound like playing those oboe partials. Well, you can do exactly this in FTS. Use. When the window pops up with the new custom voice, use , and select the glockenspiel to play all the partials. Then it is ready to use and you will find it in , You can also select it into the highlighted part in the window from .
. - standard setting. The sound is automatically recorded when you click Play, and analysed when you click stop. Only works like this when you show one of the two tasks for analysing the sounds.
Note that if you make a custom voice from your partials, and select it into the highlighted channel, then click the play button, you will of course now find the partials for your new custom voice.
If you want to hear your new custom voice and still keep the results of your previous analysis in thewindow, unselect first.
0 0 0 for the , to make repeating notes, search the entire waveform for the FFT (instead of searching selected detail), and the gets set to one second. The idea is that rather than play a fractal tune, you want to play a single repeated note to analyse.- This has same effect as with
- select the voice you want to analyse from the voices menu (or non melodic percussion menu) - this button has same effect as
- If analysing a voice that dies away, like guitar or piano, you will want repeated short notes, otherwise, you can set this to some large value like 10000 secs.
- When selected, the sound is recorded whenever you play the sound using the main window play button for one of the sound analysis tasks, and analysed when you click stop.
- How much of the recording to use for the analysis. Some lengths of time are more convenient for analysis than others. So, the actual length of the recording could be smaller or larger than this (up to a factor of two either way). It will use a little more than the amount you enter here if there is more of the recording available. Will use less, if there isn't enough of it to get to the next convenient amount of time for analysis. The analysis uses FFT (Fast Fourier Transform) - a method that needs a number of sample points which is a power of two.
- you can use this to check the frequency found, and compare it with the expected frequency.
Note that if you have a soundcard that uses wavetable sound, the frequency may well be a few cents sharp or flat overall for a particular midi voice, while the relative accuracy may be much better - on my SB live! soundcard the relative frequency varies by +-0.2 cents for many voices in the 8Mb bank, while the absolute pitch varies by up to +- 3 cents depending on the voice chosen, e.g. the ocarina is about two and a half cents flat and the flute is about two and a half cents sharp.
- Shows a list of all the partials found as frequencies, decibels, and cents values from the lowest freq. A decibel is a relative measure, defined in terms of the volume relative to a typical background sound level, so you could add or subtract a constant to all the volumes corresponding to playing the voice louder or softer (e.g. louder or softer on your speakers, or whatever). The values are scaled so that the maximum amplitude is shown as 100 decibels.
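As a rough illustration of how such a list could be computed, here is a minimal sketch. It assumes that amplitudes are converted to decibels with the usual 20 * log10 rule and shifted so that the loudest partial reads as 100 dB; the function name `partial_table` and the sample values are made up for the example and are not taken from FTS.

```python
import math

def partial_table(partials):
    """Turn (frequency in Hz, linear amplitude) pairs into rows of
    (Hz, decibels, cents above the lowest frequency)."""
    max_amp = max(amp for _, amp in partials)
    lowest_hz = min(hz for hz, _ in partials)
    rows = []
    for hz, amp in sorted(partials):
        # Assumption: amplitude ratios convert to decibels as 20*log10,
        # shifted so that the loudest partial reads as 100 dB.
        decibels = 100.0 + 20.0 * math.log10(amp / max_amp)
        cents = 1200.0 * math.log2(hz / lowest_hz)
        rows.append((hz, decibels, cents))
    return rows

# Hypothetical partials: a fundamental and two weaker harmonics.
rows = partial_table([(440.0, 1.0), (880.0, 0.5), (1320.0, 0.1)])
for hz, decibels, cents in rows:
    print(f"{hz:10.3f} Hz {decibels:8.2f} dB {cents:10.3f} cents")
```

Because decibels are relative, adding the same constant to every row changes nothing about the timbre, only the overall loudness.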
FTS can read this list of partials back in again - it does it by looking for any lines beginning with a numeral (0 to 9, + or -). So you can edit it and add new partials to it.
E.g. to add a partial of 80 decibels at 440 hz., add the line:
You don't need to give cents values - FTS will only look at the hertz values when it reads the file.
This button saves the partials as partials.tmp.tbr . This is just a temporary file, and each time you use the button, the old version gets over-written. To keep them for further reference, save it again under some other name. If you use the extension .tbr you will be able to find it using Timbre partials (*.tbr) .
Or, you can save them from the main window using Timbre partials (*.tbr) .
- looks for harmonic timbres in the list of partials. This will be particularly effective for timbres with many partials (such as strings). Uses the higher partials to adjust the frequency of the fundamental. This analysis is for information only; it is ignored by FTS when the file is read in again.
- Here you can set limits on the amount of memory that can be used for the recording or for FFT. You can choose to zero pad instead if there isn't enough to go up to the next power of two. This is memory rather than disk space - FTS does all the work in memory - which is probably sufficient for analysing short to medium length recordings to find the FFT.
- Use this to do a new analysis of the same recording, e.g. if you change .
The Frequency spectrum window
- This shows the analysis as a frequency graph. The dots show the frequencies found.
The standard setting is to show decibels vertically and the log of the frequency horizontally - this corresponds to the way we hear sound - in terms of decibels for volumes, and in terms of intervals for pitches - e.g. all octaves sound the same size to us.
One may be used to seeing a frequency spectrum with (linear) amplitudes vertically, which is another commonly used format. Showing the amplitudes instead of decibels gives much narrower peaks, almost like vertical lines. To show that kind of frequency plot, unselect decibels under What to show in the options.
One may also be used to seeing frequencies instead of the logs of the frequencies horizontally - to show it that way, unselect the corresponding option. With this type of plot, a harmonic series will be shown as equally spaced peaks.
To look in at a section in more detail, use click and drag to highlight it and a detail view will pop up. Sometimes FTS will find the peaks fairly easily, but sometimes there may be a fair degree of choice needed about which peaks to count as separate frequencies and which to ignore. In that case one will probably want to be able to select them oneself.
To remove partials, use Ctrl + Rt click on the dots. To add new ones in, use Ctrl + left click above the peaks. The way it works is that if you Ctrl + Rt click anywhere, you remove the nearest dot to the click point, and if you Ctrl + left click, you add a dot to the nearest point on the graph to the click. So, Ctrl + click below the graph will add a point to a valley instead of a peak.
You can use the zoom in and out and left and right menu options to change the detail view. Also, Shift + right click on a point in the detail view to expand it so that point goes to the right margin, and shift + left click to expand the view so that the click point goes to the left margin.
You can tweak the various parameters in(see the menu).
Choosefirst. Then, . See [#analyse_midi_voice Analyse Midi voice] for more about them:
You can open any file in .WAV format directly. If you try one in another audio format such as .mp3, then you will be asked if you want to play it by file association, and record the playback.
To analyse from the beginning of the clip, set theto the amount you want to analyse, click , then to see the frequency spectrum. The blue dots show the peaks you have found automatically.
To edit the partials, use Ctrl + Rt click on any dots you want to remove, and Ctrl + left click above any peaks you want to add in. For more about this window, see the end of the previous section - the [#freq_spectrum_window Frequency spectrum window] .
Now you can try it out usingand , to see if it sounds like the original. , and are as in the previous section [#analyse_midi_voice Analyse Midi voice] , as is the method of recording automatically when you play a voice.
shows your waveform.
To find the frequency analysis for a detail, select a region of the recording using click and drag. Then select, and then etc. as before.
Note that the recording you open usingor record using the button from this window are recorded directly to RAM. To record to a .WAV file in FTS, use which records directly to a file. Or save your RAM recording to file from . The RAM recording is of course limited by the amount of memory you have, while the save directly to file is limited only by the free disk space (usually much larger). You set a maximum amount of memory for the RAM recording from .
Partials and timbres
If you pluck or bow a string, and it isn't too tightly stretched, then the note itself sounds, and also simultaneously with the basic note you will get double the frequency, three times it and so on. These are the partials, or component frequencies that make up the note. They all add together to contribute to the sound.
The way the amplitudes of these partials vary from one instrument to another is one of the things that gives each instrument its own unique timbre.
For instance, the harpsichord has the third harmonic stronger than any of the others, while most instruments have the second harmonic as the strongest one. A few have the first harmonic strongest, such as the recorder, at least in its lowest register.
Strings have many high harmonics. A flute has very few and an ocarina pretty closely approximates a pure sine wave. Voice and oboe are in between the two.
The clarinet has strong odd numbered harmonics and weak even numbered ones up to about the seventh harmonic:
1 (2) 3 (4) 5 (6) 7 ...
If the string is very tightly stretched, as in a piano, then the higher partials are sharp, increasing in pitch by so many cents per octave.
For other timbres, sometimes some of the partials may be a little flat compared with the fundamental.
These aren't just theoretical things - if you listen intently to a note played on a 'cello say, you can learn to pick out the component partials. I have a midi clip that one can use to do this as an exercise here - see the [Scales_and_Fractal_Tunes.htm#Newbie_notes Newbie notes]
One may also be interested to read David Canright's article on the harmonic series:
When the component pitches of the timbre form a harmonic series in this way - the original pitch, then double it, three times, and so on - then the timbre is said to be harmonic.
It turns out that many instruments used in music have harmonic timbres, or near harmonic ones. However, bells, gongs and such like instruments have component frequencies that are nowhere near any kind of a multiple of the basic note, and these are called inharmonic timbres.
Makers of church bells actually try to get all the partials as close as possible to a harmonic series, but find they have to have a minor third instead of a major third, which contributes to the characteristic church bell sound. Church bells also often have doublets - two close together partials that beat with each other - one often hears those beats in a bell sound, or probably several beating partials at different speeds simultaneously. This is almost part of the characteristic bell sound too, though apparently the bell makers do their best to remove doublets. For more about this, explore The Sound of Bells site.
A composer may sometimes choose to make a special timbre designed to sound good in exotic scales, with the partials adjusted to be in tune with them, e.g. adjust the third partial so that it is in tune with thirteen equal or eleven equal or whatever scale is of particular interest.
Another approach is to choose a bell / gong with inharmonic timbres, and then use that as the basis for making a new scale that will sound good when playing tunes with that particular instrument - design the scale for the timbre.
Music with scales based on inharmonic timbres, or vice versa:
Jacky Ligon - Galunlati .
Frequency spectrum Options
- when doing an FFT, sometimes one will get unwanted partials that are too low. The minimum is preset to 15 hertz, which is so low that it is most likely felt rather than heard.
The maximum is preset to 8800 hertz, which is d'''' flat, and higher than most musical instruments ever go in normal circumstances. This is the highest one to show in the window, but not the highest one to find - there is no limit there.
Detection of partials in the spectrum
- If your wave happens to repeat exactly at the end of the sample, then there is no need for windowing. However, one needs to be pretty lucky for that to happen. Most often, the sample starts and ends in the middle of a wave.
Now, one could just truncate the sample to the start of the wave. However, a Fast Fourier Transform (the method used to find the spectrum) is much faster if the number of samples used is a power of two. So, there are certain lengths of time that are best to use for FFT, such as 1.486 secs at 44100 Hz (65536 samples), for instance. So one usually uses those.
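To see where a figure like 1.486 seconds comes from, here is a small sketch that rounds a requested time up to the next power-of-two number of samples. The function names are made up for the example, and rounding up is just one possible choice; FTS may pick its lengths differently.

```python
import math

def next_power_of_two(n):
    """Smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()

def fft_length_for(seconds, sample_rate=44100):
    """Samples and duration of the smallest power-of-two block
    that covers the requested number of seconds."""
    wanted = max(1, math.ceil(seconds * sample_rate))
    samples = next_power_of_two(wanted)
    return samples, samples / sample_rate

samples, duration = fft_length_for(1.0)
print(samples, round(duration, 3))
```

Asking for one second at 44100 Hz needs 44100 samples, and the next power of two above that is 65536 samples, which lasts about 1.486 seconds.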
If the sample breaks off in the middle of a wave, then this cut off is detected as a frequency itself. It's treated like a section of a square wave. If the entire sample is only 1.486 seconds, your wave cut off is indistinguishable from a low frequency square wave with a period of 1.486 seconds. The FFT will find not just that square wave, but its overtones too, and they will be added into the results. Probably these partials will be quite low amplitude, but it's enough to make the curve a little less clear.
So, the solution is to use windowing. The idea is that you gradually fade the sound away to zero at either end. Then, the simplest method is triangular - you keep the centre point of the sample at its maximum volume, and then just fade away to either side in a linear fashion. The other windowing methods in the menu are various curves that theorists have produced that work particularly well with FFT. Try them all and see which works best.
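The triangular fade described above can be sketched in a few lines. This is only an illustration of the idea, with made-up function names, and not the code FTS uses.

```python
def triangular_window(n):
    """Weights that rise in a straight line from 0 at either end
    of the sample to 1 at the centre point."""
    if n < 3:
        return [1.0] * n
    mid = (n - 1) / 2.0
    return [1.0 - abs(i - mid) / mid for i in range(n)]

def apply_window(samples, window):
    """Multiply each sample by its window weight before the FFT."""
    return [s * w for s, w in zip(samples, window)]

window = triangular_window(5)
faded = apply_window([1.0, 1.0, 1.0, 1.0, 1.0], window)
print(faded)
```

After windowing, the sample starts and ends at zero, so there is no sudden cut off left for the FFT to pick up as a spurious low frequency.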
- to see this in action, select . Then click + drag on the FFT window to show a detail view, and zoom in to one of the peaks. Try choosing different peak interpolation methods, and click Re-find peaks, and see how the dot changes position.
None = places the dot centrally.
Most of the other methods look at the two points to either side of the maximum value, and use those to estimate whether the peak is to the left or right of the central point. For instance, if you have two points almost exactly at the same height, and then one much lower, then clearly the peak is almost exactly half way between those two points. If you have one point that is much higher than the ones to either side, then the peak is at that point (i.e. in the middle of the bar in the chart), and so on. So the basic principle is pretty straightforward. The drop list just shows some of the best methods found by theorists - ones that work well at estimating the correct value for the peak.
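To make the principle concrete, here is a sketch using plain parabolic (quadratic) interpolation through the maximum and its two neighbours. This is a generic textbook method chosen for the illustration; it is not claimed to be any of the particular estimators in the FTS drop list, and the function names are made up.

```python
def parabolic_peak(mags, k):
    """Fit a parabola through the FFT magnitudes at bins k-1, k, k+1,
    where k is a local maximum. Returns (fractional bin, peak height)."""
    left, centre, right = mags[k - 1], mags[k], mags[k + 1]
    denom = left - 2.0 * centre + right
    if denom == 0.0:
        return float(k), centre          # flat top: leave the dot central
    offset = 0.5 * (left - right) / denom
    height = centre - 0.25 * (left - right) * offset
    return k + offset, height

def bin_to_hz(bin_position, sample_rate, fft_size):
    """Convert a (possibly fractional) bin position to a frequency."""
    return bin_position * sample_rate / fft_size

# One point much higher than its neighbours: the peak stays on that bin.
centred, _ = parabolic_peak([0.2, 1.0, 0.2], 1)
# Two equal points and one much lower: the peak lands half way between.
halfway, _ = parabolic_peak([0.0, 1.0, 1.0], 1)
print(centred, halfway)
```

The two test cases match the examples in the paragraph above: the first gives bin 1 exactly, and the second gives bin 1.5.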
The Quinns estimators can't be used if you have selected. You can still choose them, but they will use Jain's method for as long as you continue to use real FFT.
Also, the Quinn's estimators need some information that FTS discards after the calculation (the phase information). So, you need to useinstead of with these.
I find Jain's method is often the best for musical notes.
Web page with more info about these algorithms: How to interpolate the peak location of a DFT or FFT if the frequency of interest is between bins
Mean value - this is a new experimental idea. However, it is now the standard setting as it seems that it gives better pitch detection of partials than even Jain's method, which is three point.
The idea is that you think of the FFT values near a peak (e.g. down to 10 percent of the height of the peak) as a number of votes for that particular frequency. So to find the total, you multiply each frequency bin by the number of votes for that frequency. Add the results for all the bins for the peak. Then at the end, divide by the total number of votes to get the mean value.
You also need to do a bit extra for the left and right ends of the region. One needs to cope with the fact that if you join the points (e.g. using straight lines as 1st approx) then the point at which the curve will cross the 10% value will usually be between two data points.
If you look at a peak in detail you often see a number of smaller pinnacles on the top, and often the highest of these is off-centre. Sometimes there may be two of them to either side of the centre. The other methods will just find the highest of these, whatever its position. With the mean value method, if you go down far enough to either side of the peak, you often end up with a value between the two pinnacles, in the position that one would choose by eye for the main peak. We are talking about very small details here - sometimes tiny pinnacles on maybe a quite broad plateau comparatively speaking. So they are not ones that would be perceived as doublet partials, just a broadening of the pitch of the partial.
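The "votes" idea can be sketched as a weighted mean over the bins around a peak. This is a simplified illustration with a made-up function name: it uses whole bins only, so it leaves out the extra interpolation at the left and right ends of the region, and it does not handle neighbouring partials that merge above the cut-off level.

```python
def mean_value_peak(mags, k, fraction=0.10):
    """Weighted mean bin position for the peak whose highest bin is k,
    using every neighbouring bin at or above fraction * peak height."""
    threshold = fraction * mags[k]
    lo = k
    while lo > 0 and mags[lo - 1] >= threshold:
        lo -= 1
    hi = k
    while hi < len(mags) - 1 and mags[hi + 1] >= threshold:
        hi += 1
    votes = sum(mags[lo:hi + 1])
    weighted = sum(i * mags[i] for i in range(lo, hi + 1))
    return weighted / votes

# A plateau with two small pinnacles, the higher one off-centre at bin 4.
spectrum = [0.0, 0.5, 0.9, 0.8, 1.0, 0.5, 0.0]
position = mean_value_peak(spectrum, 4)
print(round(position, 3))
```

In this made-up example the highest bin is 4, but the mean value lands close to bin 3, between the two pinnacles, which is about where one would place the main peak by eye.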
Mouse pos for freq. and Mouse pos.for freq. and amp. These let you re-position the peak by eye - sometimes useful, especially if the main peak in the spectrum has tiny summit pinnacles with the highest of them off-centre (perhaps the highest two of them to either side of the centre).
- standard setting is 10 %. Used with the Mean value option. The lower this is, the more of the peak you take account of; when high, you will find the highest pinnacle in the peak even if it is off-centre. When the option to show it is selected, you can see how far down the peak the mean value search goes by a horizontal line below the dot for the peak. The region of the peak that lies above that level is the part taken account of to find the Mean value. Sometimes when the peaks for partials are close together, there may be no valley between them deep enough to reach the mark. When this happens, then the mean value search will only go down as far as the deepest valley between the two partials - and it will go down the same distance to either side of the peak to keep the symmetry.
- unselect this if the signal has clear partials with little in the way of noise, short lived inharmonics, or multiple close together partials. Select it if the timbre has many very small peaks to ignore.
. This tests the FFT by making a waveform using randomly generated partials. So the frequencies used are known, and you can compare them with the actual values found. The waveform is made for the time shown in . So you can try varying the length of the clip, and see how increasing the size of the sample improves the accuracy. To see what values were used to make it, highlight part of the FFT, and look at in the detail view. Shows hertz, and the values in brackets are the amplitudes in decibels. Also shows the differences in cents between the pitches used to make the clip and the pitches found by FFT.
The original waveform is made especially for the FFT and discarded immediately, but you can listen to the resynthesised version of it. Random partials can make a sound like a gong because of the lack of any harmonic series - see Chris Bailey's modelling of the sound of a bicycle spoke using random partials.
- this tests the FFT using the frequency from the window to make the waveform.
- use this to find the FFT again from scratch.
- use this to search again for the peaks in the FFT curve - e.g. after changing some of the other parameters.
If you look at the FFT for a real timbre in detail you'll see dozens of peaks of varying size. So the question is, which ones are significant, and which are not? If you were to add up all the frequencies found - not just the peaks, but the valleys too, everything in the curve, then you would get the original wave exactly in all its detail - except that the phase information has been lost in the process of doing the FFT - if one kept that too, one would have the waveform exactly, because that is how Fourier transforms work.
However, that's quite impractical - we want to find a small set of partials, maybe a dozen or so; certainly not thousands of them. So the question is, which ones to keep. Obviously the largest one found has to be kept, but which of the smaller ones? Some of the smaller ones may also be FFT ripples - a kind of ripple effect caused by the windowing (they reconstruct the fade out at the start and end of the sample) - we'd like to leave those out, and they often appear as a sequence of smaller peaks to either side of a main peak.
Well, one idea is to leave out any that are very close to ones already found and significantly smaller. Then, one could also leave out any very quiet ones that are just above the background sound level. Also, one could leave out any that are sort of small enough so that they get lost in the jumble of peaks and valleys and don't rise much above them.
So, before tweaking the rest of the parameters, let's show what is happening on the graph.
- This shows them for two of the peaks found - the maximum amplitude one and the highest frequency one. To see the curves for any of the other peaks, highlight it, and look at it in the detail view.
- Changes the two curves you see for each peak. So for example, the standard setting of 40 Hz for 20 dB means that the two curves have to go down to 20 dB below the peak at a frequency of +- 40 Hz to either side of the peak.
Maths details: one of the curves is bell shaped - it's a Gaussian - and the other is more parabola like - it's a 1/x type law; more generally, it's an inverse power law with the power chosen so that it will reach the desired dB level at the desired frequency. The inverse power curve is symmetrical, though it mightn't look it if you use the standard log plot frequency axis.
- uses it instead of the Gaussian of the freq for peak detection. Gives a slightly flatter curve around the peak and generally follows the typical shape of the FFT partial peak more closely. Also uses the inverse power of the log of the differences for the inverse power curve.
This shows the wave as it is recorded. Clickto start the recording. Stop to stop it. will play the oscilloscope waveform in a loop - i.e. the actual wave showing in the window. will play the complete recording.
Ctrl + will play the oscilloscope waveform as pitch glissandi so that you can hear what the shape is (esp. for blind users). Plays the shape of the waveform that you hear if you click. To set the length of time for the pitch glissandi: .
This window shows your recording in temporary memory. This is the recording that gets used for frequency analysis, or if you want to find notes in a recording.
It is limited by the amount of RAM you have so you can only use this to record short clips. To make longer clips use the main windowbutton, which records to the file name set using .
Clickto start the recording, and to stop it. If you get an error message when you start the recording, check to see if you have any other audio programs open.
Note that this recording is only temporary at this point - it will get lost when you exit from FTS. You save your recording from.
To zoom in on a detail of the recording, click and drag to highlight it. A small detail window will pop up with menu options to zoom in / out or move the highlight left or right. You can also zoom in further within the detail view by using Shift + left click and Shift + right click to either side of the detail of interest.
You can use Ctrl + click to insert a marker into the recording or Ctrl + right click to remove the nearest marker. These markers get used with the option.
Here you can choose the recording type and the devices for recording with. This includes the option to record in stereo - however later if you want to analyse the waveform to find the spectrum, you will be asked if it is okay to reduce it to mono as FTS finds the spectrum for a mono waveform.
- fills in the interior of the waveform - otherwise it gets shown in outline. You can show the waveform with various fill styles - Normal is what one will usually want and the others are just for artistic effect - try them out to see what they do. They usually have a parameter which varies the amount of the effect. In most cases you need to zoom in on a detail of the recording to see the effect.
- to use this, show the Recording (temp), highlight the region you want to trim to, and use Trim region to highlight.
- this is used to trim silences from the start and end of the recording.
It works by first finding the quietest part of the waveform. This is used as the background sound level, and anything as quiet as that or quieter gets trimmed, or anything just a little louder than it.
The preset setting here is:
200 % 0.2 .
This will work fine with these values in most cases. However, to explain it: the 0.2 seconds obviously needs to be shorter than the largest silence that you have at the start or end (or possibly middle) of your sound clip. The 200 percent here means, trim anything that is at most double the background sound level as found for the average sound level in the quietest 0.2 seconds.
Then set the trim margin - at the preset values you have a trim margin of 0.2 seconds, which means that you get 0.2 seconds of silence left at either end of the recording after the trim (where possible of course).
Now click. The trim region will get highlighted in the recording. Review it to check that it looks okay, then click the t button.
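The trim method described in this section can be sketched roughly as follows. This is a guess at the general shape of the calculation based only on the description here, with made-up names: it takes the background level as the lowest average absolute level in any 0.2 second stretch (using non-overlapping stretches to keep it short), and FTS may well do the details differently.

```python
def find_trim_region(samples, sample_rate, percent=200.0,
                     quiet_secs=0.2, margin_secs=0.2):
    """Return (start, end) sample indices to keep, with end exclusive."""
    if not samples:
        return 0, 0
    levels = [abs(s) for s in samples]
    window = min(len(levels), max(1, round(quiet_secs * sample_rate)))
    # Background level: lowest average level over any whole window.
    background = min(
        sum(levels[i:i + window]) / window
        for i in range(0, len(levels) - window + 1, window)
    )
    threshold = background * percent / 100.0
    loud = [i for i, level in enumerate(levels) if level > threshold]
    if not loud:
        return 0, len(samples)           # nothing stands out: keep it all
    margin = round(margin_secs * sample_rate)
    start = max(0, loud[0] - margin)
    end = min(len(samples), loud[-1] + 1 + margin)
    return start, end

# Toy clip at 10 samples per second: 1 s of hiss, 1 s of sound, 1 s of hiss.
clip = [0.01] * 10 + [1.0] * 10 + [0.01] * 10
region = find_trim_region(clip, sample_rate=10)
print(region)
```

With the preset values, the toy clip keeps the loud second plus 0.2 seconds (two samples here) of near silence at either end.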
Some soundcards have a slight bias in the zero position of the waveform - this means you get more of the wave to one side of it than to the other. This is often easy to spot when there is no sound playing as you may find that background noise is shown always to one side of the zero line, always above or always below.
You can see the amount of the compensation below this check box. The correction is done on the basis that a physical waveform would be equally balanced with as much of the wave to either side of the zero line. This affects how the waveform is displayed in the oscilloscope etc, the FFT analysis, and the zero crossings wave count.
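A simple way to make this kind of correction is to subtract the average value of the samples, since a balanced wave should average out to zero. This is a minimal sketch of that idea with a made-up function name; the way FTS actually estimates the compensation is not described here and may differ.

```python
def remove_bias(samples):
    """Return (bias, corrected samples), where bias is the mean value
    of the recording and the corrected samples average to zero."""
    bias = sum(samples) / len(samples)
    return bias, [s - bias for s in samples]

# A wave that should swing between +0.5 and -0.5 but sits 1.0 too high.
bias, centred = remove_bias([1.5, 0.5, 1.5, 0.5])
print(bias, centred)
```

The correction matters for the wave count in particular, because a shifted zero line moves every zero crossing.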
Counting wave crossings
This is about how the method works. For details of how to use it go on to the [#find_seed Find seed] section.
The Oscilloscope finds the frequency from a single wave as the standard setting. So for instance, if a complete wave lasts for 1/440 of a second, then one can calculate that the frequency is 440 Hz.
So, what level of accuracy can one expect for a measurement of a single wave at 440 Hz?
Suppose first that you measure the duration to the nearest sample point. Then, at 44100 Hz the wave lasts for about 100 samples, and each end of it can be out by up to half a sample, so one would expect to get the frequency accurate to within about +- 1 percent. One might measure 444.2 Hz instead of 440 Hz - out by 16 cents. That's not so bad considering it is a single wave.
However, one can do a lot better than this. Though we have quite limited time resolution, we have very good amplitude resolution, especially with 16 bit sound.
So the idea is to look at points to either side of the zero crossing where the wave switches over from the positive to the negative side. If you zoom in on the crossing point, then many waves are pretty close to linear when they cross the zero point. So we just join the sample points before and after the crossing with a sloping line, and look to see where it crosses the zero point. In this way we can interpolate to get the time for the zero crossing to far better precision than the sample rate might suggest was possible.
If the wave crosses the zero at a slope and is reasonably linear there, then typically, this gets the pitch accurate to about a cent or two for a single wave, instead of 16 cents.
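Here is a minimal sketch of the interpolation just described, tried out on a synthetic 440 Hz sine wave. The function names and the test signal are made up for the example, and a pure sine is the most favourable case, so real recordings will not do this well.

```python
import math

def falling_zero_crossings(samples):
    """Times (in samples, with fractions) where the wave passes from the
    positive to the negative side, found by joining the sample points
    before and after the crossing with a straight line."""
    times = []
    for i in range(1, len(samples)):
        before, after = samples[i - 1], samples[i]
        if before > 0.0 >= after:
            times.append((i - 1) + before / (before - after))
    return times

def frequency_from_crossings(samples, sample_rate):
    """Average the wave length over all complete waves found."""
    times = falling_zero_crossings(samples)
    if len(times) < 2:
        return None
    mean_period = (times[-1] - times[0]) / (len(times) - 1)
    return sample_rate / mean_period

sample_rate = 44100
wave = [math.sin(2.0 * math.pi * 440.0 * n / sample_rate + 0.3)
        for n in range(2205)]            # a twentieth of a second
estimate = frequency_from_crossings(wave, sample_rate)
error_cents = 1200.0 * math.log2(estimate / 440.0)
print(estimate, error_cents)
```

Using only whole sample positions for the crossings would give errors of several cents for a clip this short, while the interpolated version recovers the pitch of this test tone far more closely.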
This is much better than one can do with a spectrum analysis with just 100 sample points as we have here. Indeed, this is more like the pitch accuracy one can achieve with about a second or so of spectrum analysis if one did it without peak interpolation. It is equivalent to spectrum analysis for maybe a tenth of a second with peak interpolation.
I tested the accuracy of the measurements in FTS for short notes using the FM7 to generate test notes. Other measurements done with long clips, and so very accurate, show that the FM7 is extremely accurate pitch wise.
Here is five equal played with one second notes on the FM7 init edit voice - a pure sine wave - measured in FTS using wave counting.
240.008 cents 480.005 cents 720.003 cents 960.002 cents 1200 cents
As you see the pitch measurement is accurate to less than a hundredth of a cent.
Here is a just intonation twelve tone scale, again played on the FM7 as a sine wave, this time with notes of only 0.2 seconds, and measured in FTS using wave counting.
203.916 cents 386.322 cents 498.039 cents 701.957 cents 884.363 cents 1088.27 cents 1200 cents
The numbers for that one should be
203.91 cents 386.314 cents 498.045 cents 701.955 cents 884.359 cents 1088.27 cents 1200 cents
Still accurate to better than a tenth of a cent (the largest difference between the two lists above is about 0.008 cents), which is way beyond what one would normally expect for such short notes. So as you see, wave counting can be very accurate indeed for suitable timbres!
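The comparison can be checked with a little arithmetic. The sketch below assumes the seven notes are the just intonation ratios 9/8, 5/4, 4/3, 3/2, 5/3, 15/8 and 2/1, which is what the expected cents values correspond to; the ratios themselves are not listed in the text, so treat them as an assumption.

```python
import math

def cents(ratio):
    """Size of an interval in cents: 1200 cents to the octave."""
    return 1200.0 * math.log2(ratio)

# Assumed ratios for the expected values quoted above.
ratios = [9 / 8, 5 / 4, 4 / 3, 3 / 2, 5 / 3, 15 / 8, 2 / 1]
measured = [203.916, 386.322, 498.039, 701.957, 884.363, 1088.27, 1200.0]

errors = [m - cents(r) for m, r in zip(measured, ratios)]
max_error = max(abs(e) for e in errors)
for r, m, e in zip(ratios, measured, errors):
    print(f"{cents(r):9.3f} expected {m:9.3f} measured {e:+.4f} cents")
print(f"largest difference: {max_error:.4f} cents")
```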
N.B. sometimes you see it said that there is no way to improve on FFT pitch detection because of the classical uncertainty principle of frequency analysis (which is related to the quantum mechanics uncertainty principle). I have been told sometimes that these measurements seem to be too accurate to be possible given the number of samples measured. I don't know enough about these arguments to be able to give the answer to that.
There is no doubt however, that the wave count method works. So - well I leave it to the experts in the field to explore how this method has managed to get around what seem to be limitations to frequency pitch detection. Maybe we are making some assumption here - by assuming that the waveforms we are measuring are steady in pitch when they aren't played for long enough to be able to measure the pitch - but if so - it is a reasonable assumption if we know that the pitches are played by a musical instrument.
The method accurately predicts the pitch of a longer note when you give it e.g. just the first fifth of a second - predicts the pitch to within a tenth of a cent at 440 Hz based on what may seem to be too short a section to be able to do what it does. So - it must be getting something right somehow or other. I'm interested to know if anyone can shed more light on this subject.
Incidentally, another approach is to do an FFT of a harmonic timbre and take an average of all the frequencies in the spectrum to estimate the base frequency. For suitable timbres this can also improve the accuracy considerably over FFT, even FFT with peak interpolation. You can experiment with this too in FTS, with a strongly harmonic timbre such as strings.
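Here is one way the averaging could be sketched in Python. This is my reading of the idea rather than the FTS code: it assumes the lowest peak is the fundamental, gives each peak the nearest whole harmonic number, and averages the peak frequencies divided by their harmonic numbers. The function name and the peak values are made up for the example.

```python
def fundamental_from_partials(partials):
    """Average f / k over the partials, where k is the nearest harmonic number.

    Assumes the lowest partial in the list is the fundamental.
    """
    lowest = min(partials)
    estimates = []
    for f in partials:
        harmonic = max(1, round(f / lowest))
        estimates.append(f / harmonic)
    return sum(estimates) / len(estimates)

# Hypothetical FFT peak frequencies in Hz for a harmonic timbre near 220 Hz.
peaks = [220.3, 439.8, 660.1, 879.9]
base = fundamental_from_partials(peaks)
print(base)
```

A plain average like this treats every partial equally. Weighting by amplitude, or by harmonic number, would be other reasonable choices.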
Generally, the wave counting method gives better pitch accuracy than FFT if one can use it.
However, it has limitations as well - it isn't a method for distinguishing partials in a sound. It only detects the basic frequency of the note. This means that if you have several parts playing at once, it will only detect one of them - usually the loudest, or the one with the most prominent or clearest waveform.
If you have several instruments sounding, and they are very pure in sound, you may be able to remove the unwanted ones using a bandpass filter in a sound editor such as GoldWave or CoolEdit.
You should remove any noise from the recording if you can.
It is particularly a good idea to remove any low frequencies as these can cause the waves to undulate up and down, sometimes even so far that some waves miss the zero line altogether. FTS won't be able to analyse such recordings very well.
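To show the kind of filtering meant here, this is a minimal Python sketch of a simple first order high pass filter. A sound editor will normally do this job better. The function name, the 100 Hz cutoff and the test signals are assumptions for the example, not FTS settings.

```python
import math

def high_pass(samples, sample_rate, cutoff_hz):
    """First order (RC style) high pass filter applied sample by sample."""
    rc = 1.0 / (2 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = [samples[0]] if samples else []
    for i in range(1, len(samples)):
        # Follow changes in the input while letting slow drifts decay away.
        out.append(alpha * (out[-1] + samples[i] - samples[i - 1]))
    return out

sample_rate = 44100
n = sample_rate  # one second of sound
rumble = [math.sin(2 * math.pi * 5 * i / sample_rate) for i in range(n)]
tone = [math.sin(2 * math.pi * 440 * i / sample_rate) for i in range(n)]

filtered_rumble = high_pass(rumble, sample_rate, 100.0)
filtered_tone = high_pass(tone, sample_rate, 100.0)

# Compare amplitudes once the filter has settled (second half of each clip).
rumble_peak = max(abs(x) for x in filtered_rumble[n // 2:])
tone_peak = max(abs(x) for x in filtered_tone[n // 2:])
print(rumble_peak, tone_peak)
```

The 5 Hz undulation is cut down to a small fraction of its size, while the 440 Hz tone passes through almost unchanged.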
It also works best if the wave is reasonably regular in shape - sine waves, saw tooth, or any simple repeating shape is best of all, and especially if it crosses the zero point just twice per wave cycle. The more linear it is as it crosses, the better. If there are secondary peaks there are settings that can deal with that.
That's not so bad as it seems - as harmonic timbres give you fairly regular shaped waves, so musically useful sounds have somewhat of a tendency to be reasonably amenable. But it is best with the purer sounds - the ones most like sine waves.
The ocarina voice on a soundcard is pretty close to sine wave usually, so it's a good choice for the best possible pitch accuracy for the wave counting method.
It also works very well for many types of bird song as they are close to sine waves.
Recorder works pretty well. Flute also works, some voice timbres - experiment and see which works for you. If this method isn't suitable, one can fall back on FFT.
If the waveform is a suitable one, then the pitch accuracy depends on the volume of the sound, as the louder the note, the better the volume resolution in samples. A regular wave is often at its steepest at the crossover point, and in that case, typically the step in amplitude from one sample to the next is going to be about 10 in a reasonably loud note. This means that the time for the zero crossing point will be measured accurately to within a tenth of a time step or so, depending on the zero crossing step size.
The result is an extremely accurate pitch measurement. In optimal conditions, a note of at least a fifth of a second, recorded at a sample rate of 44 kHz and loud enough to have a minimum zero crossing step size of 10, will be measured with an error of at most ±0.04 cents. A fifth of a second at 44 kHz is 8800 samples, which is 88000 tenths of a time step, and 0.04 cents is the cents value for the ratio 88002/88000. That is the maximum possible ratio between the interpolated time and the exact time, assuming maximum error at both ends of the wave train.
Note also that this calculation gives the maximum error that can occur, not an average. If your measurements of a waveform show that it satisfies the conditions given here, then all the pitch measurements you make of notes of steady pitch of duration a fifth of a second or more will be within 0.04 cents of the true value.
If the note is a bit quieter, or the waves slope more gently at the crossing point, so that the minimum zero crossing step in amplitude is 4 (say) instead of 10, which is more typical, then the pitch will be accurate to within ±0.1 cents maximum error, still tiny for a fifth of a second note.
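The arithmetic for these error bounds can be sketched in a few lines of Python. The function name is made up for the example. It assumes, as in the text, that each end of the wave train can be out by at most one part in the zero crossing step size of a time step.

```python
import math

def max_error_cents(duration_s, sample_rate, crossing_step):
    """Worst case pitch error in cents for an interpolated wave count.

    The duration is measured in units of 1 / crossing_step of a time step,
    and can be out by one such unit at each end of the wave train.
    """
    units = duration_s * sample_rate * crossing_step
    return 1200 * math.log2((units + 2) / units)

# A fifth of a second at 44 kHz with a zero crossing step of 10: ratio 88002/88000.
print(max_error_cents(0.2, 44000, 10))
# The same note with a zero crossing step of 4: ratio 35202/35200.
print(max_error_cents(0.2, 44000, 4))
```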
First, here is a quick summary of how to play a phrase, record it, and make your recording into a seed for FTS.
First check you are set to record from the microphone from the menu option.
Click. Play the seed, then stop the recording. Finally use the button. You will now hear your seed played back, and you can compare it with the original.
To play back your original recording click.
You may want to quantise the notes of the seed to a particular arpeggio, which you can use from. To use this option, you need to set up the arpeggio you want in the main window. Then when you click the button, you will get the arpeggio note closest in pitch to each of the notes you played.
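The idea of quantising to the closest arpeggio note can be sketched like this in Python. It is only an illustration. The function name and the cents values are made up, and it does not deal with octave shifts or with how FTS stores its arpeggios.

```python
def quantise_to_arpeggio(pitches_cents, arpeggio_cents):
    """Replace each pitch with the arpeggio note closest to it in cents."""
    return [min(arpeggio_cents, key=lambda note: abs(note - pitch))
            for pitch in pitches_cents]

# Hypothetical arpeggio (a just major triad plus the octave) and played pitches.
arpeggio = [0.0, 386.314, 701.955, 1200.0]
played = [12.0, 410.5, 688.0, 1150.0]

quantised = quantise_to_arpeggio(played, arpeggio)
print(quantised)
```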
Sometimes you may get too many notes found. In that case, you can remove the extra notes and refind them. Show the Recording (temp)... Now using Ctrl + left click to add markers, and Ctrl + right click to remove them, you can set the boundaries of the notes where you like (show the help of this window for details).
Then when ready, click this button to re-find the notes.
- this finds the notes just in the detail window that shows up if you click / drag on the recording to highlight a region of it.
- finds a single note for the entire recording based on the average pitch.
The idea of this is that it can be used to convert your recording to midi format as a form of Wave to Midi conversion. Select as well. Then choose a file name to save your midi file to from . Finally, use as before. FTS will make a midi clip from your recording as it plays the seed.
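For anyone curious about the arithmetic behind a Wave to Midi conversion, here is a small Python sketch that turns a measured frequency into the nearest Midi note number and the remaining offset in cents. It uses the usual convention that Midi note 69 is the A at 440 Hz in twelve equal. It is not a description of how FTS does the conversion, and the function name is made up for the example.

```python
import math

def frequency_to_midi(freq_hz):
    """Return (nearest Midi note number, offset from that note in cents).

    Assumes Midi note 69 is A at 440 Hz, with twelve equal steps per octave.
    """
    exact = 69 + 12 * math.log2(freq_hz / 440.0)
    note = round(exact)
    return note, (exact - note) * 100

print(frequency_to_midi(440.0))
print(frequency_to_midi(261.63))
```

The offset in cents is the part that a plain Midi note number can't express. It needs something extra in the Midi file, such as pitch bend, if the tuning is to be kept.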