Audio -- basic technical information

by Ursula Hoffmann

Flavors: Mono (one channel) or stereo (two channels, left and right). Stereo is stored either interleaved (both channels in a single file) or split (each channel in its own file).

Sample rates are indicated in Hertz (Hz, "cycles per second"; for sampling, samples per second):
Use 44,100 Hz (44.1 kHz), the CD-quality sample rate, for professional audio work. Each CD-quality sample has 16 bits of information.

File sizes
This adds up to a great deal of information: 2 tracks * 44,100 samples/second * 16 bits/sample = 1,411,200 bits/second.
CD-quality audio (stereo, 16-bit, 44.1 kHz = 176 kbyte/sec) is too much for CD-ROM delivery (it would take nearly all of a 2x drive's ~200 kbyte/sec sustained throughput) and far too much for modems (28.8 modem ~ 2.88 kbyte/sec). See Compression below.
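The arithmetic above can be checked with a few lines of Python. This is only a sketch: the throughput figures are the approximate ones quoted in the text, and the variable names are made up for illustration.

```python
# CD-quality parameters from the text.
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
TRACKS = 2

bits_per_sec = TRACKS * SAMPLE_RATE_HZ * BITS_PER_SAMPLE   # 1,411,200
bytes_per_sec = bits_per_sec // 8                          # 176,400

# Approximate sustained throughput quoted in the text, in bytes per second.
delivery = {
    "2x CD-ROM drive": 200_000,
    "28.8 modem": 2_880,
}

for name, capacity in delivery.items():
    share = bytes_per_sec / capacity
    print(f"{name}: CD-quality audio needs {share:.2f} times its throughput")
```

A value close to or above 1 means the audio stream alone uses up (or exceeds) what the device can deliver.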

22,050 Hz may be good enough for some interactive multimedia applications.
11,025 Hz is a low-resolution, "voice quality" sample rate.

A good rule of thumb: Each minute of 16-bit stereo sound at 44.1 kHz requires about 10 Mbytes of disk space.
Thus, an empty 200 Mbyte hard disk holds a little less than 20 minutes of CD-quality stereo sound (about 19 minutes; the exact figure depends on whether a Mbyte is counted as 1,000,000 or 1,048,576 bytes).
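As a rough check of the rule of thumb, the sketch below computes how much uncompressed CD-quality audio fits on a disk. The function name is hypothetical, and the calculation ignores file headers and file-system overhead.

```python
# 44,100 samples/sec * 2 bytes/sample * 2 channels = 176,400 bytes/sec
BYTES_PER_SEC = 44_100 * 2 * 2

def recording_time(disk_bytes, bytes_per_sec=BYTES_PER_SEC):
    """Return (minutes, seconds) of audio that fit in disk_bytes."""
    total_seconds = disk_bytes // bytes_per_sec
    return divmod(total_seconds, 60)

print(recording_time(200 * 1_000_000))   # 200 Mbyte counted as decimal megabytes
print(recording_time(200 * 1_048_576))   # 200 Mbyte counted as 2**20-byte megabytes
```

Either way the result lands a little under 20 minutes, which is why "about 10 Mbytes per minute" is a convenient estimate.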


Audio files compressed to other formats, such as RealAudio (.rm) or MP3, can be smaller, so they take less disk space and transmit faster.

Examples:

File                                                Size      Length    Linked from
sonic1ac.wav                                        155 KB    13 sec    ../../index.html
sonic.rm (same as above)                            65 KB     13 sec
sonic1ac.mp3 (same as above, greater bit depth)     225 KB    13 sec
00b10001.wav                                        62 KB     3 sec     ../audio/heartsounds.html



Possible compromises between sound quality and file size:
Depending on your intended use for the audio, you may be willing to trade some quality in order to reduce the amount of information needed in a digitized sound. Here are some things to consider:

Stereo can often be collapsed into a mono (single track) audio file. If the two tracks are summed, all the sound information will be there, but the directional information is lost. Since computer speakers often are not separated by a suitable distance, even stereo signals are compromised. Going to mono will reduce the file size by half.
Sampling rate: Computers often offer 44K, 22K, 11K, and 6K sampling rates (that is, about 44, 22, 11, and 6 kHz, or numbers very close to these). The sampling rate is a large factor in the sound quality of the digital file. 22K or 44K rates are needed for full-range sounds, while speech is often acceptable at 11K. As you lower the sampling rate, the sound loses its higher frequencies, so to reproduce the calls of songbirds you may need 44K, but a voice-over may be fine at 11K. To be specific, the sampling rate needs to be at least twice the highest frequency that is to be digitized.
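The two trade-offs above can be sketched in Python. Both function names are hypothetical. The downmix averages the left and right samples instead of plainly summing them, which is an assumption made here so that the result stays within the original sample range.

```python
def downmix_to_mono(interleaved):
    """Average each left/right pair of an interleaved stereo sample list."""
    left = interleaved[0::2]    # samples at even positions
    right = interleaved[1::2]   # samples at odd positions
    return [(l + r) // 2 for l, r in zip(left, right)]

def highest_frequency_hz(sample_rate_hz):
    """Highest frequency a given sampling rate can represent (half the rate)."""
    return sample_rate_hz / 2

stereo = [100, 300, -200, 0, 50, 50]    # L, R, L, R, L, R
mono = downmix_to_mono(stereo)
print(mono, len(mono) / len(stereo))    # half as many samples as the stereo input

for rate in (44_100, 22_050, 11_025):
    print(rate, "Hz sampling captures up to", highest_frequency_hz(rate), "Hz")
```

The mono list has half as many samples as the stereo input, which is where the halved file size comes from.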


Bit-Depth: In addition to multiple sample rates, you may be working with either 24-, 16- or 8-bit files.

Lower resolution formats reduce storage space requirements. Unfortunately, lower bit depth and sample rate settings can compromise the audio quality of your sound files. Lower sample rates lose high frequency response, and 8-bit storage causes a reduction of your sound's dynamic range, resulting in noisier, "grainy-sounding" audio, especially during softer passages.

If you are creating 8-bit audio (for example, for multimedia or Internet distribution), you will get the best results if you do all your signal processing at 16 bits and 44.1 or 48 kHz, and then create an 8-bit file at the end of the process. (The quality of 8-bit files can be mediocre.)
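A minimal sketch of that final reduction step follows, assuming signed 16-bit input samples and unsigned 8-bit output (the convention used by 8-bit WAV files). The function name is hypothetical, and the code simply drops the low byte. Real audio editors may also apply dithering, which is not shown here.

```python
def to_8bit_unsigned(sample_16):
    """Map a signed 16-bit sample (-32768..32767) to unsigned 8-bit (0..255)."""
    return (sample_16 >> 8) + 128   # keep the top 8 bits, then shift the zero point

# Full-scale values map to the ends of the 8-bit range.
print(to_8bit_unsigned(-32768), to_8bit_unsigned(0), to_8bit_unsigned(32767))

# Quiet samples collapse onto the same 8-bit value, which is why
# soft passages suffer most from the reduction.
quiet = [0, 100, 255, 256]
print([to_8bit_unsigned(s) for s in quiet])
```

Because detail smaller than 256 steps is lost in the conversion, any processing done after this point works on coarser data, which is the reason to convert last.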

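You can use an MP3 encoder to convert a file. But a common encoder setting is a bit rate of 128 kbit/s (a bit rate, not a bit depth), so if the original is a low-resolution file (for example, 8-bit mono at 11 kHz) the MP3 file may turn out larger than the original.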
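To see why an MP3 can come out larger, the sketch below compares file sizes for a 13-second clip. It assumes the figure 128 refers to a constant bit rate of 128 kbit/s, and it ignores file headers. The function names are hypothetical.

```python
def constant_bitrate_size_bytes(bitrate_kbps, seconds):
    """Approximate size of a constant-bit-rate file (headers ignored)."""
    return bitrate_kbps * 1000 * seconds // 8

def pcm_size_bytes(sample_rate_hz, bits_per_sample, channels, seconds):
    """Size of uncompressed audio data (headers ignored)."""
    return sample_rate_hz * bits_per_sample * channels * seconds // 8

seconds = 13
mp3_size = constant_bitrate_size_bytes(128, seconds)
low_res_size = pcm_size_bytes(11_025, 8, 1, seconds)     # 8-bit mono, 11 kHz
cd_size = pcm_size_bytes(44_100, 16, 2, seconds)         # 16-bit stereo, 44.1 kHz

print(mp3_size, low_res_size, cd_size)
```

Under these assumptions the MP3 is much smaller than the CD-quality original but larger than the low-resolution one.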


 

Calculating hard disk free space requirements:
Working with computer-based digital audio requires large amounts of hard disk space. If you are planning on creating new audio files on disk, you'll need enough hard drive space to contain them.

Example: I recorded 13 seconds of sound in the ITC corridor as 16-bit stereo at 44,100 Hz. This is a 2.5 Mbyte file and definitely needs to be compressed.

(For speech, you need only 8-bit mono, but the file size will still be big.)

Audio file requirements in bytes per second:

number of samples per second (sampling frequency in Hz)
multiplied by sample size in bytes (1 = 8-bit, 2 = 16-bit; i.e., bits per sample divided by 8 bits per byte)
multiplied by channels (1 = mono, 2 = stereo)
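The formula translates directly into a small Python helper. The function names are made up for illustration, and the result covers only the raw sample data, not file headers.

```python
def audio_bytes_per_second(sample_rate_hz, bits_per_sample, channels):
    """Samples per second * bytes per sample * number of channels."""
    bytes_per_sample = bits_per_sample // 8   # 8-bit -> 1, 16-bit -> 2
    return sample_rate_hz * bytes_per_sample * channels

def audio_file_bytes(seconds, sample_rate_hz, bits_per_sample, channels):
    """Uncompressed size of a recording of the given length."""
    return seconds * audio_bytes_per_second(sample_rate_hz, bits_per_sample, channels)

print(audio_bytes_per_second(44_100, 16, 2))   # CD quality
print(audio_file_bytes(13, 44_100, 16, 2))     # a 13-second CD-quality recording
print(audio_file_bytes(13, 11_025, 8, 1))      # the same length as 8-bit mono at 11 kHz
```

For exactly 13 seconds of CD-quality stereo this gives about 2.3 million bytes, in the same range as the recording described above.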




Compression with a codec (compressor/decompressor) to reduce file size:
IMA compression works quite well for CD, but is cross-platform only with QuickTime.
MPEG 1 -- CD quality
MP3 (short for MPEG-1 Audio Layer III, not "MPEG 3") and RealAudio are the most popular of the numerous solutions for Web audio.
For music, MIDI is often the best solution for both CD-ROM and Web delivery. MIDI is not a codec: it stores performance instructions (notes) rather than sampled sound, so its files are very small.

Credits for the above material:

Arboretum Systems Hyperprism software manual: http://www.arboretum.com/support/manuals/manual_hvst/Files/hppc_digital_audio.html#anchor481814
and a site at Cornell University:  http://www.cit.cornell.edu/atc/materials/dig/avaudio.shtml
and a site at San Francisco State University:   http://msp.sfsu.edu/Instructors/rey/video/bandwidth/filesize.html



Links/Sources on the web:
Duke CIT Resource Guides  audio workstation  --

Audio guide for web developers:  http://www.walthowe.com/pubweb/audio.html
Comprehensive list of audio file formats: http://www.sonicspot.com/guide/fileformatlist.html
About digital audio files:  http://www.arboretum.com/support/manuals/manual_hvst/Files/hppc_digital_audio.html
Windows Media Player multimedia file formats:  http://support.microsoft.com/default.aspx?scid=kb;EN-US;316992 
Digitizing audio and video:  http://www.cit.cornell.edu/atc/materials/dig/videoformats.shtml

last revised April 2005