How to Solve Audio Problems with the Use of Dynamic Processing
M. Morgan
The general field of video distribution, whether broadcast, cable TV, CCTV or teleconferencing, seems beset by audio-quality problems. Some of these difficulties are peculiar to special segments of the field, while others are widespread. By the judicious and proper use of dynamic processors, and adjunctive equipment currently available, the vast majority of these problems can be successfully resolved.
No tutorial concerning the treatment of program audio can possibly be complete without stating some basic criteria by which audio quality can be judged, both on an objective, measured basis, and subjectively by the operator. Both sets of criteria must be understood and applied in order that the operator can identify problems and successfully deal with them.
The area in which objective, measured performance is most easily evaluated is at the point of origin; it stands to reason that any degradation of the audio signal occurring at the source will be difficult, if not impossible, to counteract at its ultimate destination. When evaluating audio quality in the production room, or at the point of origin, the entire signal chain should be considered. The engineer must be prepared to perform the following basic measurements:
* Frequency response, referred to some standard level, usually at 500 Hz or 1 kHz. A well-designed system should exhibit no more than ±2-dB level variation from 30 Hz to 15 kHz. Professional audio equipment such as production consoles, recording desks and tape-recording machines should exhibit less than ± dB variation from 20 Hz to 20 kHz.
* Residual noise and hum, referred to standard operating level (0 Vu), and measured unweighted with appropriate filter sets inserted before the measuring device. The noise floor in a 30-Hz to 15-kHz noise bandwidth should be at least 60 dB below 0 Vu with all equipment in the signal chain adjusted for normal operation.
* Distortion at standard operating level, preferably IMD (per SMPTE), which gives a more accurate indication of objectionable distortion. The IMD products should be well below 0.3 percent for the entire system, while THD measurements taken at different frequencies from 30 Hz to 10 kHz should be 0.1 percent or less.
* Headroom before clipping, referred to standard operating level and measured at the level where clipping is observed on an oscilloscope monitoring the signal chain output, or at about 1 percent THD. The clipping point should be at least +10 dB (reference 0 Vu) at any frequency from 30 Hz to 10 kHz.
These criteria, if met, indicate that the system is capable of replicating most commonly encountered audio program material, including music, without objectionable amounts of noise and distortion, or loss of intelligibility.
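The four measurements above lend themselves to a simple pass/fail check. The sketch below is only an illustration: the function name and dictionary keys are hypothetical, and reading "well below 0.3 percent" as a strict less-than test is an assumption.

```python
def check_signal_chain(m):
    """Return the names of the criteria that the measurements in m fail.

    Expected keys (hypothetical names chosen for this sketch):
      response_dev_db  worst-case deviation, 30 Hz to 15 kHz, in dB
      noise_floor_db   residual noise relative to 0 Vu, in dB (negative)
      imd_percent      SMPTE intermodulation distortion, percent
      thd_percent      worst-case THD, 30 Hz to 10 kHz, percent
      headroom_db      clipping point relative to 0 Vu, in dB
    """
    failures = []
    if m["response_dev_db"] > 2.0:      # no more than plus-or-minus 2 dB
        failures.append("frequency response")
    if m["noise_floor_db"] > -60.0:     # at least 60 dB below 0 Vu
        failures.append("noise floor")
    if m["imd_percent"] >= 0.3:         # assumed: strictly below 0.3 percent
        failures.append("IMD")
    if m["thd_percent"] > 0.1:          # 0.1 percent or less
        failures.append("THD")
    if m["headroom_db"] < 10.0:         # at least +10 dB before clipping
        failures.append("headroom")
    return failures


measured = {
    "response_dev_db": 1.5,
    "noise_floor_db": -55.0,
    "imd_percent": 0.12,
    "thd_percent": 0.08,
    "headroom_db": 14.0,
}
print(check_signal_chain(measured))
```

In this made-up set of readings only the noise floor, at 55 dB below 0 Vu, falls short of the 60-dB criterion.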
Subjective evaluation of audio quality is impossible to translate into numerical values and difficult, even, to express idiomatically. For this reason, it has been regarded as arcane by most engineers, and has been reduced nearly to absurdity by some audiophiles; this is indeed a pity, since the prime criterion the engineer must ultimately apply to audio product is its effect upon the consumer. The engineer can and must learn to evaluate audio quality in the same frame of reference as does the average listener.
In listening to audio program for evaluation purposes, engineers must concentrate on content, and note their own reaction to the product, asking the following questions:
* Is the program objectionably "loud"? Loudness bears only an approximate relationship to actual volume. If one can reduce the volume level noticeably and the signal retains its objectionable loudness, this usually is an indication of moderate amounts of clipping in the low and mid-frequency portions of the program, or of extremely high amounts of compression causing dynamic distortion in the low and mid frequencies, and/or tonal unbalance resulting in a preponderance of low-frequency information in the 30-Hz to 200-Hz range. Objectionable loudness usually occurs during musical passages, and is nearly always accompanied by a distorted sound.
* Are there objectionable sounds or noises present in the program that do not belong? Such extraneous sounds as room noise, chair squeaks and background conversations are especially objectionable during live broadcasts or teleconferencing. Hums and buzzes and impulse noise present in musical program are objectionable, especially during low-level passages.
* Is the sound distorted? Every engineer is familiar with the characteristic sound of clipping. Is that sound present in the program? Less familiar, perhaps, is the type of sound associated with intermodulation distortion. In a musical program, large amounts of IMD produce a muddy, "closed-up" sound with lots of information not musically or harmonically related to the program, often referred to as "subharmonics." The most graphic description is probably "muddy," or "strident." This type of distortion quickly causes listener fatigue, and is very distracting.
* Is the noise floor being modulated? Listen carefully to the noise level during quiet passages in the program material. Does the "hiss" increase in level? Does it vary rapidly with changes in program level? If so, this is probably the result of compression, and is quite objectionable in program sources with marginal signal-to-noise ratios (60 dB or worse) and is extremely objectionable in highly equalized feeds.
* Does the program have pleasing tonal balance? Is there overwhelming "bass" resulting in booming sound? Are there excessive amounts of "treble" or high frequencies, resulting in a "tinny" sound?
Trust Your Ears on Quality
In short, those things that you find objectionable will be objectionable to the consumer. Use your ears! Trust them! No matter how convincingly your instruments indicate that all is well in the signal chain, if your ears tell you all is not well, you'd better believe it.
To best use a tool, one must thoroughly understand its functions and limitations; thus, a discussion of dynamic processors, their functions and limitations is in order.
The most commonly used, over-used and misused processing device is the compressor. This device compresses the dynamic range of the signal passing through it by constantly changing its gain in response to signal level. It may be considered a type of automatic gain control (AGC), but differs from AGC devices in that its response times usually are much faster. The compressor both reduces its gain in response to signal levels above a setpoint, called the threshold, and increases its gain in response to signals below the threshold. A compressor has a well-defined rotation point at which the gain control element exhibits unity gain, and usually has an adjustable ratio, which is the relationship between the input signal level and output signal level. Figure 1 shows the typical transfer function of a compressor, and illustrates the concepts of ratio, rotation point and recovery, or makeup gain.
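The static behavior just described can be sketched in a few lines. This is one reading of the text, not a model of any particular unit: it assumes the transfer curve is a straight line in decibels pivoting about the rotation point, with no limit on the available makeup gain. The function names and the default 4:1 ratio are hypothetical.

```python
def compressor_out_db(in_db, rotation_db=0.0, ratio=4.0):
    """Static output level in dB for a compressor pivoting about rotation_db.

    Assumption: each `ratio` dB of input change, above or below the
    rotation point, yields 1 dB of output change.
    """
    return rotation_db + (in_db - rotation_db) / ratio


def compressor_gain_db(in_db, rotation_db=0.0, ratio=4.0):
    """Gain applied by the control element: negative above the rotation
    point (gain reduction), positive below it (makeup gain), zero at it."""
    return compressor_out_db(in_db, rotation_db, ratio) - in_db


for level_db in (8.0, 0.0, -20.0):
    print(level_db, compressor_out_db(level_db), compressor_gain_db(level_db))
```

With these assumed settings, an input 8 dB above the rotation point emerges 2 dB above it, while an input 20 dB below emerges only 5 dB below, which is the makeup gain at work.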
Compressors generally are used to enhance loudness by producing denser modulation of the medium onto which the audio program is imposed. The judicious use of compression results in a pleasant "even" volume contour in the processed program. Problems associated with the misuse of compressors are "pumping," in which the gain changes are noticeable and objectionable, "breathing," in which the noise floor is noticeably modulated in response to the program level, and "shimmering," in which the high-frequency signal level moves up and down drastically in response to low-frequency content, or, in the case of stereo program feeds, when the high-frequency information seems to deviate rapidly from side to side in the stereo image. These compression artifacts are all forms of dynamic distortion, and can be minimized by correct adjustment of ratio, attack time (the time required for the compressor to reduce its gain a prescribed amount) and release time (the time required to recover makeup gain). A highly compressed program using short attack and release times usually exhibits exaggerated loudness, and is very objectionable due to intermodulation products resulting from the level of mid-frequency and high-frequency portions of the program varying rapidly in response to the low-frequency components of the material. The most common complaint about compression concerns an inherent problem called recovery noise, or "pull-up," in which the noise floor level is boosted by the makeup gain in the absence of signal or during low-level passages.
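How attack and release times shape the gain changes can be illustrated with a simple smoothing filter. The sketch below is a common textbook approach rather than the circuit of any specific compressor: it assumes the gain is smoothed in decibels by a one-pole filter, with one time constant while gain is being reduced (attack) and another while it recovers (release). The function name and all numeric settings are hypothetical.

```python
import math


def smooth_gain_db(target_gain_db, sample_rate_hz, attack_s, release_s):
    """Smooth a sequence of target gains (dB) with separate attack and
    release time constants. Assumes one-pole smoothing in the dB domain."""
    attack_coeff = math.exp(-1.0 / (attack_s * sample_rate_hz))
    release_coeff = math.exp(-1.0 / (release_s * sample_rate_hz))
    gain_db = 0.0  # start at unity gain
    smoothed = []
    for target in target_gain_db:
        # Falling gain means the compressor is attacking; rising gain
        # means it is releasing back toward its resting value.
        coeff = attack_coeff if target < gain_db else release_coeff
        gain_db = coeff * gain_db + (1.0 - coeff) * target
        smoothed.append(gain_db)
    return smoothed


# 100 samples calling for 12 dB of gain reduction, then 100 calling for none.
targets = [-12.0] * 100 + [0.0] * 100
curve = smooth_gain_db(targets, sample_rate_hz=1000.0,
                       attack_s=0.001, release_s=1.0)
print(round(curve[99], 2), round(curve[199], 2))
```

With a 1-ms attack the gain reaches the full 12 dB of reduction almost at once, while the 1-s release has recovered only a little over 1 dB after a tenth of a second. Shortening the release makes the gain track the program more closely, which is where pumping and breathing become audible.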
Limiters and Compressors Not the Same
Another commonly used processor is the limiter. Although the terms limiter and compressor are used almost interchangeably, they are not the same. A limiter controls the maximum amplitude of the signal passing through it by reducing its gain in response to signal levels greater than the limiting threshold. It does not increase its gain in response to signal levels below the limiting threshold. Figure 2 illustrates the typical transfer function of a limiter. A limiter usually exhibits very short attack times and rather exaggerated ratios of 20:1 or greater. Limiters find their greatest use in controlling peak program level and in "capturing" transient voltage excursions in the processed material.
The indiscriminate use or over-use of a limiter results in a flat, squashed sound, often accompanied by pumping. This is a result of the dynamic range of the program being artificially limited to just a few decibels as opposed to the 40-dB-plus average dynamic range inherent in most musical program and in speech. Additionally, a limiter exhibiting a relatively slow attack time (greater than 60 microseconds) is prone to cause distortion upon attack, and also allows transient material to pass through it relatively unaffected.
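The difference from a compressor shows up clearly in the static curve. The sketch below assumes a hard-knee limiter working in decibels; the function name and the 0-dB threshold are hypothetical, while the 20:1 ratio is the figure quoted above.

```python
def limiter_out_db(in_db, threshold_db=0.0, ratio=20.0):
    """Static output level in dB for a hard-knee limiter.

    Below the threshold the signal passes at unity gain (no makeup gain);
    above it, each `ratio` dB of input rise yields 1 dB of output rise.
    """
    if in_db <= threshold_db:
        return in_db
    return threshold_db + (in_db - threshold_db) / ratio


for level_db in (-30.0, 0.0, 20.0):
    print(level_db, limiter_out_db(level_db))
```

A peak 20 dB over the threshold emerges only 1 dB over it, while a passage 30 dB below the threshold is left untouched. Driving most of the program above the threshold therefore squeezes its dynamic range into a few decibels, the "squashed" sound described above.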
The expander is perhaps the least understood of the basic dynamic processing tools. This device expands the dynamic range of the signal passing through it by altering its gain in some direct relationship to signal level relative to a setpoint, or threshold. A direct expander will increase its gain a predetermined amount in response to each unit of signal level increase above its threshold, while a downward expander or reverse expander will decrease its gain a predetermined amount for each unit of signal level decrease below its threshold.
Most modern expanders are downward expanders, since this configuration offers greater ease of operation as a noise reduction device, while direct expanders are limited mainly to use as "noise gates" or "sonic gates." Figure 3 shows the transfer function of a downward expander, and illustrates the concepts of threshold and slope, which is the ratio of output level to input level when the device is expanding.
Expanders find greatest use as direct noise-reduction devices, wherein the threshold is adjusted to a point just at or below nominal operating level, and the slope is adjusted for subtle "stretching" of the program dynamic range, such as 2:3 or 4:5, or as a sonic "gate," with the threshold adjusted above the noise floor level and the slope at a more exaggerated setting of, say, 1:2 or greater.
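The two uses can be compared with a short sketch of a downward expander's static curve. It rests on an assumption about notation: a slope written a:b is read here as "a dB of input change below the threshold produces b dB of output change," so 2:3 is subtle and 1:2 is more drastic. The function name and the threshold values are hypothetical.

```python
def expander_out_db(in_db, threshold_db, in_step=1.0, out_step=2.0):
    """Static output level in dB for a downward expander.

    At or above the threshold the signal passes at unity gain. Below it,
    every in_step dB of input drop becomes out_step dB of output drop
    (assumed reading of a slope written in_step:out_step).
    """
    if in_db >= threshold_db:
        return in_db
    return threshold_db + (in_db - threshold_db) * out_step / in_step


# Subtle stretching: threshold at nominal level (0 dB), slope 2:3.
print(expander_out_db(-30.0, threshold_db=0.0, in_step=2.0, out_step=3.0))

# Gating: threshold just above an assumed noise floor, slope 1:2.
print(expander_out_db(-60.0, threshold_db=-50.0, in_step=1.0, out_step=2.0))
```

In the first case a passage 30 dB below nominal is moved to 45 dB below, gently widening the dynamic range. In the second, noise sitting 10 dB under the gate threshold is pushed a further 10 dB down while everything above the threshold is unaffected.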
An expander featuring variable attack times and slopes can be used to restore the dynamic range of highly compressed programs, thus minimizing the exaggerated and objectionable loudness caused by incorrect use of compression.
The incorrect use of an expander can result in several unusual artifacts. Since an expander, unlike a compressor, "attacks" to unity gain, and "releases" to attenuation, the use of very short attack times may create a "pop" or "click" upon attacking. Incorrect setting of the threshold control can have much the same effect, as well as creating "pumping," which is, in this case, exaggerated changes in gain caused by variation in program level. Expanders also create a unique artifact known as "holing" or "perforation," wherein the noise floor and/or low-level signals rapidly switch on and off in response to transient program levels or impulse noise inherent in the program. These artifacts can be minimized by proper adjustment of threshold, attack time and release time, and by using the more subtle slopes (1:2 or gentler).
Perhaps the most frequently encountered problem in program distribution is that of widely varying audio levels. One typical installation reported access to 24 feeds with audio levels ranging from -40 dB (reference 0.775 Vrms) to +8 dB. The operator attempted to resolve the problem by installing booster amplifiers on the low-level feeds, but found the average levels to vary over 20 dB; the -40 dB feed, after boosting, would periodically rise to -20 dB, causing clipping and overload of the distribution amplifiers feeding the modulator. The solution, in this instance, was to use a compressor having an exaggerated ratio of 60:1 and a very long release time of about 5 s for each 20 dB of recovery. The recovery gain thus provided the needed boost, while the dynamic control offered by the compressor assured that the output level never exceeded the nominal line level required by the modulator. In order to minimize pull-up, the compressor incorporated an expander as an integral part of its control circuitry; thus, in the absence of signal, the expander reduced the output noise by about 40 dB. This configuration can be realized using a separate compressor and expander; however, great care must be taken to provide symmetrical attack and release characteristics for the two devices, making the transition from compression to expansion imperceptible, thus eliminating pumping or perforation. Figure 4 shows the transfer function of an interactive compressor/expander unit designed to perform this type of function; the ratio and slope illustrated are less exaggerated than those used in this particular instance.
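The arithmetic behind this fix can be worked through with a combined static curve. Only the 60:1 ratio and the -40 dB and -20 dB feed levels come from the case described; the rotation point at 0 dB, the expander threshold of -50 dB, the 1:2 expansion slope and the -70 dB noise level are hypothetical values chosen for illustration, and attack and release behavior is ignored.

```python
def comp_expander_out_db(in_db, rotation_db=0.0, ratio=60.0,
                         exp_threshold_db=-50.0, exp_out_per_in=2.0):
    """Static curve of a compressor with an integral downward expander.

    Above exp_threshold_db the compressor curve applies, pivoting about
    rotation_db. Below it, the output falls exp_out_per_in dB per dB of
    input, starting from the compressor's output at the threshold so the
    two segments join without a step.
    """
    def compress(x_db):
        return rotation_db + (x_db - rotation_db) / ratio

    if in_db >= exp_threshold_db:
        return compress(in_db)
    return compress(exp_threshold_db) + (in_db - exp_threshold_db) * exp_out_per_in


for level_db in (-20.0, -40.0, -70.0):
    print(level_db, round(comp_expander_out_db(level_db), 2))

# The same -70 dB noise with the expander moved out of the way (pull-up):
print(round(comp_expander_out_db(-70.0, exp_threshold_db=-200.0), 2))
```

Under these assumptions the -40 dB and -20 dB program levels land within about a third of a decibel of each other, just under the rotation point. Noise at -70 dB would be pulled up to roughly 1 dB below the rotation point by the compressor alone, but ends up more than 40 dB down once the expander segment takes over.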
Another peculiar problem reported by a teleconferencing operator concerned impulse noise that occurred as an artifact of a digital decoding error. The audio portion of the signal exhibited a rather low signal-to-noise ratio, and the "crashes" due to decoding errors caused overload in the distribution amplifier, in addition to being extremely objectionable. This problem was corrected by using a fast-attacking, fast-releasing peak limiter to control the impulse-noise level, followed by a 1:2 expander to enhance the perceived signal-to-noise ratio of the audio feed. This particular device consisted of two multi-function processors housed in a single 1.75-inch rack package, and provided the operator with a very simple and cost-effective solution.
By understanding the capabilities of dynamic processors, and mastering their application, the engineer can be prepared to solve problems, enhance the quality of the audio product and create useful effects that are limited only by the imagination and a willingness to experiment.
COPYRIGHT 1985 Nelson Publishing
COPYRIGHT 2004 Gale Group