Physics 204:  Spring 2011

 

Audio Spectra

 

A pure musical tone has a well-defined temporal frequency f.  It also has a well-defined spatial frequency k, but we are usually not so aware of that.  More complicated musical sounds may be mixtures of several frequencies, like musical chords.  Somewhat surprisingly, tones that seem to be just a single frequency, like a single sustained musical note, may also be mixtures of several frequencies!  This is less surprising, perhaps, when we recall that so simple a thing as a violin string has many normal modes of vibration, each with its own characteristic frequency.  Thus when a violinist plays what sounds to us like one note, it may nonetheless contain many frequencies (the “overtone series”), corresponding to the excitation of several different standing waves on the string at the same time.  This idea is so contrary to intuition that one really must check it experimentally!
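As a rough sketch of what the overtone series looks like numerically, the snippet below lists the mode frequencies of an idealized string, for which mode n vibrates at n times the fundamental.  The function name and the 196 Hz fundamental (roughly the open G string of a violin) are illustrative choices, not part of the lab, and a real string will depart slightly from exact integer multiples.

```python
def overtone_series(fundamental_hz, count):
    """Frequencies of the first `count` modes of an ideal string: f_n = n * f_1."""
    return [n * fundamental_hz for n in range(1, count + 1)]

# Illustrative fundamental only: about 196 Hz, near a violin's open G string.
print(overtone_series(196.0, 5))
```

If the idea in the paragraph above is right, a spectrum of such a note should show spikes near these frequencies rather than a single spike.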

 

The way to check such assertions about sound is to record a sound trace, i.e., a record of how the air pressure at the location of the microphone varied in time, and then to compute its spectrum, a graph of loudness as a function of frequency.  The simplest trace would be a sinusoidal variation in time, and this would correspond to having all the loudness at one frequency, and no loudness at any other frequency.  The spectrum would be a single “spike” at the one frequency.  A more complicated sound would show a trace more complicated than a sinusoidal wave, and its spectrum would be more complicated than a single spike.  The sound from a musical instrument would have in its spectrum one spike for each normal mode excited in the instrument, located at that mode’s frequency.  This is a remarkably easy way to gather subtle information about how things are vibrating.  The process of starting with the sound trace and obtaining the spectrum is called “Fourier transformation”, and it is just a computation.  It is so useful, though, that it was an enormous technological breakthrough when a fast method for doing it was published by Cooley and Tukey in 1965; it is now called simply the FFT, for Fast Fourier Transform.  The FFT is used ubiquitously now, and we will be using an implementation of it to get our audio spectra.
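To make the trace-to-spectrum idea concrete, here is a minimal sketch that builds a sinusoidal trace and computes its spectrum.  It uses a direct (slow) discrete Fourier transform rather than the FFT, because that fits in a few lines; the FFT gives the same numbers much faster.  The sampling rate, number of samples, and tone frequency are assumptions chosen for illustration and are not taken from O3.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Return |X[k]| for k = 0 .. N/2 using a direct (slow) DFT."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        total = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        mags.append(abs(total))
    return mags

SAMPLE_RATE = 11000  # Hz (assumed for illustration)
N = 220              # samples (assumed), so T = 0.02 s and bins are 50 Hz apart
TONE = 500           # Hz, a pure tone that fits exactly 10 cycles into T

# The "trace": pressure versus time for a pure sinusoid.
trace = [math.sin(2 * math.pi * TONE * i / SAMPLE_RATE) for i in range(N)]

# The "spectrum": one magnitude per frequency bin from 0 up to the Nyquist frequency.
spectrum = dft_magnitudes(trace)
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
peak_frequency = peak_bin * SAMPLE_RATE / N
print(peak_frequency)
```

With these particular numbers the tone falls exactly on a bin, so the spectrum is a single spike.  A tone that falls between bins would spread over several neighboring bins.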

 

We will use our computers themselves, with microphones plugged into them, as the recording instruments.  Everything will be coordinated by the program Oscilloscope3.exe (available for free from Gary Darby – credit where it’s due! – through http://www.delphiforfun.org/Programs/oscilloscope.htm).  Run oscilloscope3.exe by double clicking its icon on the desktop.  We will refer to it below as “O3”.

 

When O3 runs, it shows you a green area where the audio trace will be.   Be sure the Trigger is “off”, and click the “Start display” button to see if you are picking up an audio signal.  If you are, you should notice a jumpy signal showing up in the green area.  You may have to change some settings in the Windows operating system to get a signal.  The little “loudspeaker” icon on the Windows toolbar will get you to the right place:  right click it, and choose “Adjust Audio Properties.”  This will pop up a window called “Sound and Audio Devices Properties.”  Click the Audio tab and then push the Volume button of the “sound recording” device.  Move the slider for the microphone volume all the way up, and check the “select” box for the microphone.  Now the system is informed that you mean business with that microphone!

 

Here is the basic operation of this lab:  capture a single frame of the sound trace (be sure there is some interesting sound going on when you hit the “capture” button) and once you have acquired it, hit the Spectrum button.  You will have a sound trace and its spectrum displayed side by side.

 

O3 unfortunately does not have good ways to save or recall such information.  It is true that you can save files, but then what?  We don’t have good ways to look at such files, so don’t save them.  Rather, for any trace or spectrum that is interesting to you, sketch what you see, labeling the important features.  This is really the important thing, because it is the connections you make and the things you observe that are of value, not what bytes go onto some disk.   You must close the spectrum window each time, before going back to take another trace, so if you want to keep it, sketch it.

 

Here are some things to notice as you start interpreting these two ways of regarding sound:

 

The green “trace” screen has time for its horizontal axis and pressure for its vertical axis.  The units of pressure are arbitrary (you can adjust them to some extent with the “Vertical gain” control).   The units of time, though, are meaningful, and are clearly given on the graph, 5.00 ms/div at the slowest sampling rate.  This means you can read from the graph how pressure changed on a time scale of milliseconds, roughly.  If you increase the sampling rate, there are more data in the same small time, and the graph changes accordingly – try it! 
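Reading a frequency off the trace amounts to measuring a period on the time axis.  The sketch below assumes you have counted a whole number of cycles across a measured number of horizontal divisions; the function name and the example numbers are hypothetical, not measurements from O3.

```python
def frequency_from_trace(cycles, divisions, ms_per_division):
    """Estimate frequency (Hz) from cycles counted across a span of the trace."""
    elapsed_s = divisions * ms_per_division / 1000.0
    return cycles / elapsed_s

# Hypothetical reading: 5 full cycles span 2 divisions at 5.00 ms/div.
print(frequency_from_trace(5, 2, 5.0))
```

Counting several cycles rather than one reduces the error from judging where a cycle begins and ends on the screen.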

 

The yellow “spectrum” screen has frequency for its horizontal axis and loudness for its vertical axis.  The units of loudness are arbitrary, but the frequency units are clearly given on the graph (in Hz).  If you scroll over to the high frequency end of the spectrum, you find that it ends at exactly one half the sampling frequency, 5500 Hz if the sampling rate is 11,000 Hz.  In the case of sampled data like these, one half the sampling frequency is called the Nyquist frequency, and it is the highest frequency that can be detected.  That is why the spectrum ends there.  The signal could have higher frequencies in it, but these components would not show up as higher frequencies in the sampled data; they would appear instead as lower frequencies, an effect called aliasing!  Thus one must be on the alert for odd behavior at the high frequency end.  In fact, I notice in this application that the spectrum is only trustworthy up to one half the Nyquist frequency, about 2750 Hz if the sampling rate is 11,000 Hz.  If you are interested in higher frequency signals, go to a higher sampling rate.  For many purposes, though, I find the interesting sounds are at frequencies below 2750 Hz and the slowest sampling rate is actually the best.
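The way a component above the Nyquist frequency reappears at a lower frequency can be sketched with a small helper.  It assumes ideal sampling of a pure tone with no filtering ahead of the sampler, and the function name is an illustrative choice; whether the sound card used with O3 filters the signal first is not something this handout establishes.

```python
def apparent_frequency(true_hz, sample_rate_hz):
    """Frequency at which a sampled pure tone shows up, folded into 0..Nyquist."""
    folded = true_hz % sample_rate_hz
    nyquist = sample_rate_hz / 2
    return folded if folded <= nyquist else sample_rate_hz - folded

print(apparent_frequency(3000, 11000))  # below Nyquist: shows up unchanged, 3000
print(apparent_frequency(7000, 11000))  # above Nyquist: shows up at 4000
```

So at an 11,000 Hz sampling rate, a 7000 Hz whistle would be indistinguishable in the sampled data from a 4000 Hz one.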

 

The FFT produces a discrete spectrum, i.e., it gives the loudness only at multiples of a fixed lowest frequency.  This is not to say there couldn’t be signal at other frequencies, in between the frequencies that are reported.  It is just that the signal is “binned” into discrete frequencies, like a histogram rather than like a continuous function.  By the way, what is the lowest frequency that can be detected (apart from zero frequency, which means “not changing”)?  That is, for what frequency would a signal change so slowly  that it seems not to be changing at all?  Well, that depends on how long you sample it.  If you sample for a time T – this would be the total width of the “trace” window – then you could see a frequency as small as 1/T, but not smaller.  The program O3 has a built-in choice about how long to sample the sound.  This determines the lowest frequency in the spectrum window. 
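The relation between sampling time and frequency spacing can be written as a pair of one-line helpers.  This is only a sketch of the 1/T rule described above; the sample count used in the example is a made-up number, not the value O3 actually uses.

```python
def frequency_spacing(num_samples, sample_rate_hz):
    """Spacing (Hz) between reported frequencies: 1/T, with T = N / sample rate."""
    duration_s = num_samples / sample_rate_hz
    return 1.0 / duration_s

def duration_from_spacing(spacing_hz):
    """Total sampling time T (s) inferred from the lowest nonzero frequency."""
    return 1.0 / spacing_hz

# Hypothetical example: 1024 samples at 11,000 Hz (not O3's actual setting).
print(frequency_spacing(1024, 11000))  # spacing in Hz
print(duration_from_spacing(10.0))     # a 10 Hz spacing implies T = 0.1 s
```

The second helper runs the rule backwards: from the lowest nonzero frequency shown in a spectrum, it gives the time over which the sound must have been sampled.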

 

(1)    Here is your first assignment:  use the observation above to determine from the spectrum window how long O3 samples the sound to produce a trace.  Do this for each sampling rate.  You will find that not quite all of the sampled trace is actually displayed in the green area, although most of it is.  Include this determination in the lab essay you write at the end of this lab.

 

(2)    Now have fun!  Find a source that produces a sound at just one frequency.  Keep sketches and explain how you can determine this frequency both from the trace window and from the spectrum window.  Include an example of this in your essay.

 

(3)    Produce a sound an octave above or below the previous one.  How does its frequency compare?  How about a perfect fifth above or below?

 

(4)    Find a source that produces an overtone series, and describe its trace and spectrum.  What musical tone do you hear?  Change the tone by an octave.  Now how does the spectrum look?

 

(5)    Speak vowel sounds like long E, short A, long O, etc.  What does the spectrum of your voice look like?  Is it reproducible?  How do the vowel sounds differ from one another?  Do they look the same for everyone on your team?  Could a computer be programmed to distinguish these sounds?

 

(6)    Capture a transient sound.  Set the Trigger to ‘+’, and move the slider for the trigger level to a bit above 0.  Now when you try to capture a screen, the system waits for a sound loud enough to trigger the recording.  (If you see an error message in the lower left, push the “capture” button again.)  What does the trace of your transient signal look like, and what does the spectrum look like?  Does it have any particular frequency?

 

(7)    Try out some ideas of your own – at least one from each individual team member – and include these ideas, and how they turned out, in your essay.  Share your discoveries with others!
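For comparing your measurements in assignment (3) against expectation, the following sketch computes the frequency of a note a given number of semitones away from a starting note.  It assumes equal-tempered tuning, in which each semitone multiplies the frequency by the twelfth root of 2; the function name and the 440 Hz starting note are illustrative.  A singer or an untempered instrument may instead produce a fifth closer to the simple ratio 3:2.

```python
def interval_frequency(base_hz, semitones):
    """Frequency of the note `semitones` above (or, if negative, below) base_hz."""
    return base_hz * 2 ** (semitones / 12)

BASE = 440.0  # Hz, an illustrative starting note

print(interval_frequency(BASE, 12))   # octave above: ratio 2
print(interval_frequency(BASE, -12))  # octave below: ratio 1/2
print(interval_frequency(BASE, 7))    # equal-tempered fifth above: ratio about 1.498
print(BASE * 3 / 2)                   # fifth above using the simple ratio 3:2
```

The two versions of the fifth differ by less than 1 Hz here, which is likely smaller than the frequency spacing of the spectrum, so do not expect to tell them apart in this lab.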