6 may 2008 Attack and Release
I sometimes get asked for the exact attack and release times of my CT4 compressor. But, there is no standardized way to measure a compressor's speed in terms of seconds or milliseconds. Whenever a designer puts absolute times on the dials, they are just making up those numbers according to their own set of rules. I find the ms/sec perspective quite perplexing. I've never attempted to measure the CT4's speeds based on any set of rules -- not because I'm lazy, but because it's wholly unimportant to me and the thought of trying kind of makes my head hurt as the separate halves of my brain battle it out in a fight that ultimately "does not compute." We're all making aesthetic choices here (you & I), not launching spacecraft, and every compressor has its own sound regardless of what those little numbers say. Otherwise, we'd just buy one compressor that followed some perfect, clinical rules of attack, release, ratio, etc. and be done with it.
29 apr 2008 Audio Dithering -- Salt, conventional wisdom, & the supernatural.
I was recently reading a cookbook. The author was emphasizing the importance of using good salt over cheaper, generic salt. He described the flavor differences and insisted it had an important impact on the final dish. I chuckled with skepticism. But, I happened to have two different brands of salt in the cupboard. One was a domestic brand, a sea salt of an unknown origin. Its container described how it was better because it didn't have chemical bleaching agents and retained the natural trace minerals. The other was a more expensive imported naturale Italian sea salt. So, of course, I performed a taste test. To my huge surprise, there was a very significant, undeniable difference in taste. The locations of sensation on the tongue were also different. The Italian brand easily won out. It was smooth, pleasant, and the flavor finished evenly. By comparison, the other brand was strident, piercing, and had a synthetic chemical-like flavor and feeling. Both immediately evoked the idea of salt -- but, side-by-side, the differences were almost like that between real sugar and an artificial sweetener. I presented this taste test to a handful of other people, and all the reactions have concurred with mine. I will certainly never think, or rather not think, of salt the same. The cookbook writer was correct. I've found it does make a subtle but worthwhile impact on the end product, making the eating experience just a little bit more enjoyable, especially after all that hard work in the kitchen.
So what does salt have to do with audio dithering? Dither and salt both speak to the issues of marketing, personal enjoyment, and conventional wisdom.
Conventional wisdom says that all salt tastes the same. It's salt! NaCl. Sodium chloride. How could one box of NaCl taste that much different from another box of NaCl? Conventional wisdom scoffs at the idea that one brand might taste much better than another. Conventional wisdom laughs at the idea of spending five dollars for a can of it, rather than a buck for that old Morton's.

Conventional audio engineering wisdom says dither's a bit like salt -- you add a pinch of noise to "enhance the flavor." Conventional wisdom says that you'd be crazy not to dither when mixing down to a 16-bit master. I say: Conventional wisdom isn't all that wise. Stop bothering with dither! Unlike salt -- it has no impact on the end product.
I know what a lot of people will say to this: But, I've heard the improvements of dither with my own ears! I've done the taste test. I've read the book. And, I've seen the movie!
For example, you can go to this website and evaluate all the major players in dithering technology. It's true -- you can absolutely hear the improvements in their examples. And, that particular algorithm with the most votes does sound the best. But, let's read the fine print about the chemical bleaching agents: "54 dB of gain has been applied after dithering." Since the audio isn't audibly clipping, this means there was at least 54 dB of headroom in the original digital signal. 54dB of gain equals a multiplication factor of 501. In binary, this equates to shifting the digital word almost 9 bits (501X = 2 to the power of 8.97 bits). In other words, the original audio had about 9 unused bits of headroom. 16 minus 9 equals 7. We're listening to the equivalent of 7-bit audio here. So, what does this listening test prove?
That dither is highly effective for 7 or 8 bit audio. How many recording engineers do you know sending out 8 bit masters?
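The arithmetic behind that 7-bit figure can be checked in a few lines. This is only a sketch of the calculation described above; the function names are my own, and the 54 dB and 16-bit figures are the ones quoted from the listening test.

```python
import math

def db_to_amplitude_ratio(gain_db):
    """Convert a gain in decibels to a linear amplitude multiplier."""
    return 10 ** (gain_db / 20)

def ratio_to_bits(ratio):
    """Express a linear multiplier as an equivalent shift in bits."""
    return math.log2(ratio)

gain_db = 54.0                        # gain applied after dithering, per the test's fine print
ratio = db_to_amplitude_ratio(gain_db)
shift_bits = ratio_to_bits(ratio)
effective_bits = 16 - shift_bits      # what is left of a 16-bit word

print(f"{gain_db} dB is a factor of about {ratio:.0f}, or {shift_bits:.2f} bits")
print(f"effective word length: about {effective_bits:.1f} bits")
```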
Why did the creators of this test use such bit-reduced audio? They had to or the test never would have worked. Quantization noise, which dither is meant to mask, is not humanly audible for typical, normalized, 16-bit audio, especially not in the everyday listening environments where we experience music. 16-bit audio has over 90dB of dynamic range -- this is the difference between the maximum signal amplitude and the quantization noise. In other words, the quantization noise is 90dB below what you're actually listening to. That's huge! Think about that in real world terms: 85dB SPL is the standard peak loudness calibration of a movie theater. It's also the threshold of hearing damage at long durations. Under this condition, 90dB below that is well under the generally accepted "threshold of human hearing". Expecting to hear the effect of dither is like expecting to hear the fluttering of a fly through the din of a jackhammer -- it's just not going to happen. Maybe... just maybe, with a highly un-normalized Classical CD from the 80's listened to in an anechoic chamber with the lowest-noise-possible audio reproduction gear, you just might barely detect hints of quantization noise on the quietest passages. But, I'm dubious -- who do you know that has an anechoic chamber for a living room anyway?
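For anyone who wants to see where the "over 90dB" comes from, here is a rough sketch. It assumes the simplest definition of dynamic range (full scale divided by one quantization step); other conventions give slightly different numbers. The 85 dB SPL playback level is the figure used above, and the function name is just for illustration.

```python
import math

def dynamic_range_db(bits):
    """Ratio of full scale to one quantization step, in dB (simplest convention)."""
    return 20 * math.log10(2 ** bits)

range_16_bit = dynamic_range_db(16)

playback_peak_spl = 85.0                       # playback calibration figure from the text
noise_floor_spl = playback_peak_spl - range_16_bit

print(f"16-bit dynamic range: about {range_16_bit:.1f} dB")
print(f"quantization noise at that playback level: about {noise_floor_spl:.1f} dB SPL")
```

Under these assumptions the quantization noise lands below 0 dB SPL, which is the point being made about the threshold of hearing.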
To further this assertion, why is it so difficult for audio engineers to figure out how to apply dither properly? There have been numerous magazine articles, white papers, tidbits in the many home recording books, and thread after thread on the internet forums asking: "How do I dither?", "When do I need to dither?", "Should I re-dither?", "Does dither work with floating-point audio, or only fixed-point?", "On what insert do I put the dither plugin?", "Can I EQ the audio after dithering?" And, these questions never end.
If dither makes such a perceptible improvement to fidelity, then shouldn't it be completely obvious when that plug-in gets latched into the correct slot and the audio is flowing through the correct path? Shouldn't the standard, somewhat flippant, internet forum answer of "Just use your ears!" be applicable here? It's not -- no one ever says this with regard to dithering. That's because it's hard to stand behind such a statement with any confidence about a technology that does not exist.
A while back I discovered that the TDM POW-r dithering plugin that's provided with Pro Tools had a bug, making it incompatible with my L2007 limiter plugin. The POW-r plug-in would stop producing dither when activated on the same DSP as the L2007. (My limiter, unlike most others, does not provide integrated dithering, so I've always suggested to people that they just use the POW-r plugin.) I discovered the bug on my own, more or less by accident, while using an FFT analysis plugin. But, guess how many users called to report it themselves? Zero. Just like myself, no one ever heard this on their own. Only with tweaky analysis tools is dither measurable.
Finally, let's consider the end product. When have you ever listened to a CD and thought, "Wow! This is a really incredible album. But, damn! The dithering algorithm that the mastering engineer used is total crap! I can't believe they used POW-r type 3 instead of type 2!" Never. No one has ever had this thought. Regardless of its questionable efficacy, no one can reasonably argue that dither has an impact on the experiential enjoyment of music. Unlike other aesthetic choices of audio production (compression, distortion, etc.), no one has ever argued that dither ruined or enhanced their ability to enjoy a musical work.
The casual listener is capable of perceiving all subtle aspects of music production, whether or not they speak the language of audio engineering. It may be difficult for them to communicate what they are hearing. If the vocal is slightly distorted, or if the phase is mutated due to excessive equalization, they may wave their hands around or find a peculiar phrase that best fits the tenuous categorization scheme their mind has created for this language-less reality. Likewise, we can communicate and agree about the flavors of salt, even though no one has previously defined a rigorous categorization scheme of salt. But, dither -- the only people on the planet perceiving and discussing dither are those people who have been informed of its existence -- because dither does not exist in anyone's reality until someone gives it a name.
So, where does the mythology of dither originate? Well, the problem is, it is a mathematical truth. There's no denying that. You can write down and solve the equations. Dither does work in the abstract sense: it decorrelates the quantization noise from the carrier signal. But, again, this theory is only applicable to the human experience for very low-bit, highly-quantized audio signals -- simply put, when the quantization noise gets "really loud", you can hear it.
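The "mathematical truth" part is easy to demonstrate on exactly the kind of very low-level signal where it matters. The following is a toy sketch, not any product's algorithm: it assumes a plain rounding quantizer, triangular (TPDF) dither two steps wide, and a sine wave smaller than one quantization step. All names and the 8-bit step size are my own choices for illustration.

```python
import math
import random

def quantize(value, step):
    """Round a value to the nearest multiple of step."""
    return math.floor(value / step + 0.5) * step

def tpdf_dither(step, rng):
    """Triangular-PDF noise, two steps wide peak-to-peak (sum of two uniforms)."""
    return (rng.random() - 0.5) * step + (rng.random() - 0.5) * step

def correlation(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

rng = random.Random(0)          # fixed seed so the run is repeatable
step = 1.0 / 128                # one quantization step of a signed 8-bit word
n = 20000
# A sine wave whose peak is only 0.4 of a quantization step.
signal = [0.4 * step * math.sin(2 * math.pi * 50 * i / n) for i in range(n)]

plain = [quantize(x, step) for x in signal]
dithered = [quantize(x + tpdf_dither(step, rng), step) for x in signal]

plain_error = [q - x for q, x in zip(plain, signal)]
dithered_error = [q - x for q, x in zip(dithered, signal)]

corr_plain = correlation(plain_error, signal)
corr_dithered = correlation(dithered_error, signal)

print(f"error/signal correlation without dither: {corr_plain:.3f}")
print(f"error/signal correlation with dither:    {corr_dithered:.3f}")
```

Without dither the quantizer rounds this tiny signal away entirely, so the error is a mirror image of the signal. With dither the error behaves like noise that is unrelated to the signal, which is the decorrelation the theory promises.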
If dithering makes no difference, how has the concept lived on for so long? I think because it is mathematically validated, it has allowed the engineers at pro audio companies to say, with confidence, to their marketing departments that the inclusion of dithering has made their product better than the competition. That's great! I imagine marketing people love positive specifications and catch-words which they don't have to create themselves. Anything that is easily quantified and can be succinctly composed into the marketing text is great for the unimaginative salesperson. Examples of such marketing-driven myths are common in the technology market. 64-bit floating-point audio! Awesome! How in the world could that not be better than 32-bit floating-point audio? 64 is twice the size of 32. Six megapixels is of course better than five megapixels! (Like with most bullet-points and talking points, these metrics ignore the subtleties and/or disregard all the other equally important elements that were compromised in reaching that isolated specification.)
But, I think the primary reason dither lives on is because the public itself has embraced it wholeheartedly, beyond simple influences of marketing. Why? I believe it's a bit like humanity's attachment to the supernatural. People seem to possess, myself included, an overwhelming desire to imagine their perceptions of the world to be more subtle and more magical than they actually are. Even though no one has ever truly witnessed the effects of dither, the public has faith in its power. And, since it's impossible to prove or refute something that you cannot hear (or see, or touch), the ghost of dither lives on...
12 jan 2008 Stereo vs. Multi-Mono
User Robert Furlong asks: "Quick question about the L2007 when used on the master fader. I can hear a difference between using it stereo and multi mono. Is there one way that makes more sense or is it a matter of taste. I ask because I noticed that some of your plugins can only be used as multi mono but this one goes both ways. I like the wider stereo image multi mono gives but on some songs the vocals seem a bit thin and undefined this way. Also, are there any tips you have for other things that are usually "better" as stereo or multi mono plugs (dither for example.)"
Good question!
Regarding the limiter specifically, the question of multi-mono vs. "true-stereo" is both a matter of taste and a technical concern. True-stereo analyzes each channel independently, but applies the same gain reduction equally to both channels, while multi-mono processes each channel completely independently. Equal application of gain reduction is important because when the left and right audio signals become significantly uncorrelated, two separate compressors will function like an out-of-control auto-panner, causing a rapid flipping of the loudness between the left and right channels. This is heard as an unpleasant-sounding mutation of the stereo image, resulting in things like the vocal thinning you describe. It should be noted that this issue is of concern for any stereo compression, analog or digital, and not a peculiarity of the L2007.
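To make the difference concrete, here is a deliberately simplified sketch. It is not the L2007's algorithm: it assumes an instantaneous, per-sample limiter with no attack or release, and it assumes the linked version derives one shared gain from whichever channel is louder, which is only one common way to link channels. The function names are hypothetical.

```python
def limiter_gain(level, ceiling):
    """Gain that brings a sample magnitude down to the ceiling (1.0 if already under it)."""
    return 1.0 if level <= ceiling else ceiling / level

def limit_multi_mono(left, right, ceiling):
    """Each channel computes and applies its own gain reduction."""
    return (left * limiter_gain(abs(left), ceiling),
            right * limiter_gain(abs(right), ceiling))

def limit_linked(left, right, ceiling):
    """Both channels are analyzed, but one shared gain is applied to both."""
    gain = limiter_gain(max(abs(left), abs(right)), ceiling)
    return left * gain, right * gain

# A source panned toward the left: twice as loud on the left as on the right.
left, right = 1.0, 0.5
ceiling = 0.5

unlinked = limit_multi_mono(left, right, ceiling)
linked = limit_linked(left, right, ceiling)

print("multi-mono:", unlinked)  # (0.5, 0.5)  -> the source jumps to the center
print("linked:    ", linked)    # (0.5, 0.25) -> the left/right balance is kept
```

In the unlinked case only the left channel is turned down, so the source momentarily moves toward the center. Repeat that on every peak and you get the auto-panner effect.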
Typically, the highest peaks in rock material are contributions from the drums or other percussive instruments. Those instruments tend to have quick transients. So, if you gently apply unlinked stereo limiting, the stereo image is often nicely preserved since the limiters react in very quick bursts, and the brain does not perceive the amplitude modifications as a shift in panning.
With more aggressive unlinked limiting, you're going to start digging into the meat of the signal for much longer periods of time and the result, again, will be undesirable stereo-image shifting.
So... perhaps two plug-in inserts in series could yield the most open-sounding results: The first insert would be a multi-mono version and set to catch just the quickest peaks, and a subsequent stereo version, providing the remaining desired compression. Again, the key to this technique would be setting the multi-mono limiters so they are only performing very quick, isolated moments of gain reduction.
This idea is also vaguely similar to the technique of Mid/Side processing, which is probably a more effective and elegant way of achieving higher compression levels while preserving and even enhancing the stereo image spread.
For a lot of other plugins, like an equalizer or dither, there is absolutely no sonic difference between multi-mono and stereo versions because there is no interaction between the left and right channels' processing. The stereo/multi-mono option exists purely for work-flow considerations. (Some history: the multi-mono system was added in the v5.1 release of Pro Tools to help support its new surround-sound capabilities. Multi-mono allowed existing mono plugins to scale across surround channel configurations, thus easing and simplifying the transition for plug-in development. Subsequently, it also proved useful for stereo channels.)
The L2007 and CT4 are the only stereo plugins I currently offer that work in a linked-channel manner. Stereo versions of vt3 or Tape-Head are sonically identical to using a linked-control multi-mono insert. (THC and TD5 do not support stereo altogether.)
7 nov 2007 The TDM Price Conspiracy
There's a long-standing complaint in the Pro Tools community about the higher pricing of TDM plug-ins versus RTAS. The foundation of these debates is built on an implication of malicious intent by the plug-in developers -- that they all schemed a devious plan to bilk the customer. Such emotionally tainted discussions are often difficult to combat with logic, so it's sometimes best to steer clear unless you're a clever manipulator of language. (I'm not.) But, when this thread came up yet again recently on the Gearslutz forum, after a morning of way too much coffee, I finally tried to address it with a little math, basic facts, and a dose of sarcasm: Gearslutz Post
2 sep 2007 Why some compressor plug-ins have latency.
People sometimes ask me why my CT4 compressor plug-in has one sample of
processing latency while many compressor plug-ins have none. I've also
seen this question raised repeatedly in message forums with respect to other
plug-ins, such as Digidesign's Smack. In these posts, there's often a
sense of frustration that the developer is at fault for leaving this oversight
in the code. "C'mon, it's just one
sample. Just get rid of it!" -- a pretty amusing implication from
my perspective as the designer. Read on to understand why.
From what I understand, the Smack plug-in models a real-life compressor that
employs feedback sidechain sensing (as opposed to a feedforward design.)
The CT4 is also of the feedback variety. And, not by coincidence, it also
has one sample of latency. If you're unfamiliar with compressor
topologies,
here's
an article that discusses feedforward vs. feedback designs, in the context of
hearing aids -- something we'll probably need to know about eventually anyway :)
In the analog world, electrical signals run through a device like a system of
chains wrapped around cogs. Pull one end of the chain and the other end
reacts immediately. There is no concept of processing latency. We're
talking about electromagnetic waves moving at the speed of light.
In a feedback compressor, the output signal feeds the input of the sidechain
detector. Still not a problem in the analog domain. Now we have a
loop in the system of chains and cogs. But, the output still reacts
instantaneously to the input. Likewise, the wheels of a bicycle do not
wait around when you push on the pedals.
Things are different in the discretely-sampled digital domain. In a
modeled feedback compressor, generation of an output sample first requires an
input sample to the sidechain "circuit." But, there won't be an output
sample to feed back to the sidechain until the algorithm has run through one full
sample cycle. Hence, the source of that 1 sample of delay.
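The loop described above can be sketched in a few lines. To be clear, this is a toy and not the CT4's or Smack's code: it assumes a bare-bones detector with no attack or release smoothing, and the function and parameter names are made up for the example. The only point is the `previous_output` variable, since the sidechain can only look at an output sample that already exists.

```python
def feedback_compress(samples, threshold, ratio):
    """Toy feedback compressor: the detector reads the previous output sample."""
    previous_output = 0.0   # nothing has come out yet when the first sample arrives
    output = []
    for x in samples:
        level = abs(previous_output)          # sidechain input comes from the output side
        if level > threshold:
            # Scale so the part of the level above the threshold is divided by the ratio.
            gain = (threshold + (level - threshold) / ratio) / level
        else:
            gain = 1.0
        y = x * gain
        output.append(y)
        previous_output = y                   # available to the detector one sample later
    return output

demo = feedback_compress([1.0, 1.0, 1.0], threshold=0.5, ratio=4.0)
print(demo[0])   # 1.0   -> the first loud sample passes untouched
print(demo[1])   # 0.625 -> gain reduction only arrives one sample later
```

In this sketch the detector is always one sample behind its own output, which is the same one-sample cycle described above.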
So, why can't I just work a little harder on the code to get rid of that delay? Well, it's hard to fight physics. The theory goes that for
every feedback compressor design there's an equivalent feed-forward
implementation, and vice versa. So, the developer could perhaps, with much
effort, transform it into a feed-forward model, and hope that the equivalent
feed-forward algorithm results in zero latency. But, as with all
engineering problems, there are always trade-offs. The net result would be
something else to complain about, and my guess is that it would be a much higher
CPU usage.
19 jul 2007 Headroom, Gain Staging, and the Loudness War?
On the Gearslutz forum, Zoff asks "The readings on this meter
[Massey
HR Meter plugin] don't seem to have any correlation to the Pro Tools master
fader meter. Can someone please shed some light on this? Thanks."
Thanks Zoff, I've been meaning to! So here goes:
Actually, the Pro Tools meters don't have any markings to make any correlations.
Pro Tools' poor metering has been a long-standing complaint by many users.
"Why does the HR Meter default to +12 dB instead
of 0dB at digital full-scale?" is probably what's being questioned
here. Well, the HR Meter is configurable. Change it to 0 dB at
full-scale
if you like.
"Why is full-scale nearly always defined as
0dB?" would be my question. Well, truly, it's a purely arbitrary
convention -- simply numbers plopped down along a number line. The decibel
scale has relative meaning, not absolute. Moving 6 dB in either direction
along a dB meter equates to either a doubling or halving of the signal's
amplitude (6.0206 dB, to be a bit more precise). You can make the minimum and maximum values whatever you like,
as long as this rule is observed.
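That 6.0206 figure falls straight out of the definition of the decibel for amplitudes. A quick sketch (the function name is mine):

```python
import math

def amplitude_ratio_to_db(ratio):
    """Decibel value of a linear amplitude ratio."""
    return 20 * math.log10(ratio)

doubling_db = amplitude_ratio_to_db(2.0)
halving_db = amplitude_ratio_to_db(0.5)

print(f"doubling the amplitude: {doubling_db:+.4f} dB")
print(f"halving the amplitude:  {halving_db:+.4f} dB")
```

Notice that nothing in the calculation cares where zero sits on the scale. Only the ratio matters.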
So, why did I pick +12 dB as the default maximum? Well, a number of
recording engineers advocate tracking recording levels to around 12 dB below
full-scale. By making the maximum meter reading +12 dB, it places this
"ideal" tracking target at 0 dB. This approach then gives you 12 dB of
headroom in the digital signal.
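In other words, relabeling the meter is nothing more than adding a constant to the reading. Here is a small sketch of that idea. It is not the HR Meter's actual code; the function and its `full_scale_reading_db` parameter are hypothetical names for the configurable setting described above.

```python
def meter_reading_db(level_dbfs, full_scale_reading_db=12.0):
    """Map a level in dBFS to the number shown on a meter whose top is relabeled."""
    return level_dbfs + full_scale_reading_db

print(meter_reading_db(0.0))     # 12.0 -> digital full-scale sits at the top, +12 dB
print(meter_reading_db(-12.0))   # 0.0  -> the suggested tracking target reads as zero
print(meter_reading_db(-12.0, full_scale_reading_db=0.0))   # -12.0 with the usual 0 dBFS labeling
```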
"Why can't I keep the meters calibrated to
0dB/full-scale (0 dBFS) and just shoot for around -12 dB when
tracking?" Sure, you can do that too. I just think it's
psychologically pleasing to have your target level be aligned with zero,
parallelling the way analog gear has always functioned since the inception of
recording technology.
Some say more headroom is even better. Fab (Fabrice) Dupont spent a whole
session at this year's TapeOp Conference endorsing his methodology of tracking
with 18 dB of headroom. And, what does this headroom give you? Well,
Fab had a number of theories about this. I didn't agree with all of them,
but I think there's a lot of common ground. Firstly, you can finally stop
fretting about clipping your inputs and just
record. Subsequently, it gives you some headroom at the mixing
stage so you can avoid a ton of trim plugins everywhere in the session.
Nothing wrong with trim plug-ins -- leaving yourself some headroom in the first
place simply lets you work on the mix without constant worry about the ceiling.
More importantly, there's a general lack of conceptual "gain staging" in the
digital realm relative to the analog world. In analog, keeping optimal
signal levels running between your gear is important for maximizing
signal-to-noise ratio while maintaining acceptably low levels of
distortion. If your signal's too quiet, the noise can dominate. If
your signal's too hot, the analog circuitry can distort too much. Over its
operating range, modern digital equipment is devoid of many of analog's
shortcomings so somewhere along the way the concept of gain staging was dropped
too. A mistake, in my opinion, that impedes work-flow at many levels.
Returning to my first question, why is digital full-scale always defined as 0
dB? I don't know exactly, but my
theory goes like this: In the early days of digital audio,
analog-to-digital converters weren't so great and to get low quantization
noise and a good signal-to-noise ratio you had to slam the inputs, practically right up to
the last digital bit. I know firsthand as I've been using hard disk
systems since I first got involved in recording. My first multitrack recorder was a
Sunrize Audio AD516 card that ran in an Amiga 3000 computer [8 tracks of
playback on a 25MHz machine -- not too shabby :) ] Unfortunately, it had a fair
amount of hum and buzz injected from the computer. As a result, I always
had to maximize levels, resulting in a lot of false starts to recalibrate mic
levels and lost takes from audible clipping. It sucked, but in reality
full-scale was the target recording level
with first generation digital recording equipment, so naturally engineers
labelled it as 0 dB.
That's not the case anymore. Professional analog-to-digital converters are
usually pretty good these days. We can bring back the concept of headroom
again and not compromise audio fidelity. How would this improve
workflow?
1. Again, it lets you "set it and forget it" during the tracking stage.
You can finally stop anxiously watching those clip lights and actually listen to
the take.
2. Subsequently, it allows you to work more fluidly at the mix stage, by
avoiding constant input trimming.
Moreover, suppose the audio industry agreed on a "best practices" headroom
standard. How would this further improve work-flow? What other
benefits would it have:
1. Since more and more plugins are simulating the effects of analog distortion,
having an "optimal" operating point would make using this sort of plug-in more
intuitive and more consistent across different developers' offerings.
Wouldn't it be cool if you could throw on a tape simulation plugin and know that
by pushing the meter into the "red" the plugin is starting to do its
magic? You wouldn't have to fuss around learning how that particular
plugin is "gain staged" -- how to set its "drive" control or whatever mechanism
it uses to add more distortion. You simply know that by pushing the
levels to around zero dB and beyond, you'll start getting its flavor.
Sure, every plugin is going to have its own character and different distortion
onsets depending on the designer's motivations. But, things will at least
be in the ballpark again, just like the analog world.
2. I'm not a big fan of presets, but wouldn't it be nice if, for example, that
one kick drum compressor setting you made for that last session worked on this
new session without a lot of futzing around with the threshold control?
Right now, presets are usually absurd (for many reasons.) Very few mixing
processes are totally independent of input levels. And, the more that
plugins try to simulate analog distortion characteristics the more absurd
presets become. Establishing a recording reference point would at least
help make them a little less absurd.
3. It gives the recording community a "common language." Standardized
levels would allow engineers to work on different sessions by other engineers
without having to constantly adapt their methodologies to however "hot" the
session might be. Standardized levels would let people discuss plug-ins
and recording concepts in a more coherent manner, both in person and online.
4. Fight against the "loudness war." Maybe it's a stretch, but I theorize
that the lack of detailed metering and gain staging concepts in Pro Tools is in
some part responsible for the loudness war. The target was set to 0 dBFS, so at
every step of the way from tracking, to mixing, to mastering, "louder" has been
psychologically fostered by the lack of reference points. Maybe it was an
unstoppable trend altogether, but this shortcoming of the recording tool has
probably gotten us there quicker.
5. It eliminates the differences (and the needless debate) between
floating-point- and fixed-point-based digital recording systems.
Fixed-point systems have a hard and fixed "full-scale" maximum. Go above
full-scale and your signal is perfectly clipped off -- resulting in nasty
distortion. Floating-point systems have a "squishy" maximum level.
Go above "full-scale" and you simply get increasing quantization noise, but no
overt audible clipping. Since either system has more than sufficient
dynamic range below full-scale, there is absolutely no reason to push levels
beyond the typical 0 dB maximum. So, there's a simple resolution to this
inconsistency between the two systems (especially if you need to transfer sessions between
the two.) Don't go above full-scale! If your meters are smacking the
top, then chill out. Bring your levels down. Problem solved.
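Regarding point 5, here is a rough illustration of how the two kinds of systems differ once a signal goes over full-scale. It is an idealized sketch with assumptions: full-scale is taken as exactly +/-1.0 (real fixed-point words are slightly asymmetric), Python floats stand in for a floating-point mix bus, and the sample values are invented for the example.

```python
def to_fixed_point(sample):
    """Clamp a sample to the full-scale range, the way a fixed-point path clips."""
    return max(-1.0, min(1.0, sample))

hot_mix = [0.5, 1.5, -2.0, 0.25]   # a summed signal with two samples over full-scale
trim = 0.5                          # pulling the level down afterwards

# Floating-point path: the overs survive until the trim brings them back in range.
float_path = [s * trim for s in hot_mix]

# Fixed-point path: the overs are clipped first, so the trim cannot restore them.
fixed_path = [to_fixed_point(s) * trim for s in hot_mix]

print("float:", float_path)   # [0.25, 0.75, -1.0, 0.125]
print("fixed:", fixed_path)   # [0.25, 0.5, -0.5, 0.125]
```

If nothing ever exceeds full-scale, the two lists come out identical, which is the whole argument for simply keeping levels below the top.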
(I'm probably not the first guy to point out a lot of this stuff. Bob Katz
proposed the "K-System" meter quite a while ago. You can read about it on
his website here:
Katz Level Practices)
19 jul 2007 Quality
I was interviewed for an Italian sound engineering magazine a while back. Here's one of the Q&A's:
How can you achieve efficient DSP/CPU usage in a plug-in while maintaining quality?
In my view,
quality is not exactly correlated with CPU usage. For
example, software can make a very high-quality gain stage with very
little CPU power. Changing the gain in software simply means doing a
single multiplication. It's very high quality in the sense that it is
extremely low in distortion, noise, etc. Changing the gain in analog
requires a complex array of transistors, capacitors, resistors, and an
intelligent design to keep distortion and noise low. What
differentiates analog from the current state of digital simulations is
uniqueness. Analog gear is very subtle and complex in its operation.
Modeling these subtleties in the digital realm is what requires so much
CPU horsepower. So, in short, it's uniqueness that's lacking in a lot
of plug-ins, not quality. When digital simulations don't
match the real gear very well, there are two potential causes:
the software designer either hasn't thoroughly discovered all the 'uniquenesses'
of the gear or they lack the CPU cycles to run their full model.
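As an illustration of the gain-stage remark in that answer, this is roughly all a clean digital gain change amounts to. It is a minimal sketch with a made-up function name, and it ignores things a real plug-in would add, such as parameter smoothing.

```python
def apply_gain_db(samples, gain_db):
    """Scale every sample by one multiplier derived from a gain in decibels."""
    multiplier = 10 ** (gain_db / 20)     # computed once per gain setting
    return [s * multiplier for s in samples]

quiet = [0.01, -0.02, 0.05]
louder = apply_gain_db(quiet, 20.0)       # +20 dB is a factor of 10

print(louder)
```

One multiplication per sample, and no added noise or distortion beyond the arithmetic precision of the number format.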
17 jul 2007 Monopolies
Standards are cool. They make things more convenient and more efficient. If the recording industry would standardize on a single DAW sampling rate, that would be nice. I could write twice as many plugins with the same amount of effort. Beyond providing endless debates in message forums, there's no point in having six choices (44.1kHz, 48kHz, 88.2kHz, 96kHz, 176.4kHz, and 192kHz.) Let's pick one and run with it. The status quo is impeding the progress of "in the box" technology.