Shockwave-Sound Blog and Articles
Depth and space in the mix, Part 2

by Piotr Pacyna

So, how to start?

With a plan. First, I imagine (in my head or on a sheet of paper) where the individual instruments and musicians stand on a virtual stage, and then think about how to “re-create” this space in my mix. Typically I’d have three areas: foreground, mid-ground and background. Of course, this is not a rule. If we are making a raw rock mix with a sparse arrangement and an in-ya-face feel, we don’t need much space. In dense, multi-layered electronica, on the other hand, depth is crucial.

So, I have divided all my instruments into, say, three spatial groups. Then, in my DAW, I set the same colour for every instrument belonging to a given group, which is wonderfully handy: I see everything at a glance.

The tracks that I usually want to have close are drums, bass and vocals. A bit deeper and further away I’d have guitars, piano and strings. And then, in the distant background, I’d have synth textures or perhaps some special vocal effects. If there are string or brass sections in our song, then we first need to learn how orchestral instruments are placed in order to reproduce that layout. Of course, this applies only if we are aiming for realism.

But sometimes we don’t necessarily need the realism, especially in electronic music. Here almost anything goes!

Going back to our plan…

Whether we strive for realism or not, I suggest starting the plan by pinning down which element will be the furthest away: you need to identify the “back wall” of the mix. Let’s assume that in our case it is a synth pad. From this point on, any decision about placing instruments closer or farther away has to be made relative to our back wall.

At this point we have to decide which reverb we will use. There are basically two schools of thought. Traditionalists claim that we should use only one reverb in the mix, so as not to give the brain misleading information. In this case we have the same reverb (in terms of the algorithm) on each bus, changing only the settings, especially pre-delay, dry/wet ratio and EQ. Those of a more pragmatic nature believe that realism is not always what matters, especially in electronic music, and that only the end result counts. Who is right? Well, both are.

I’d usually use two, maybe three different reverb algorithms. The first would be a short Room type of reverb, the second, longer one would be a Plate, and the third and farthest would be a Hall or Church. By using sends from the individual tracks I can easily decide how far away or how close each instrument will sit on our virtual stage.

Do not add reverb to every track; contrast will let you enhance dimension and imaging even more. If you leave some tracks dry, the wet ones will stand out.

Filtering out the highs from our returns not only sinks things further back in the mix, but also helps to reduce sibilance. Reverb tends to spread sibilants out in the space, which is very irritating. An alternative way of keeping sibilance out of the reverb is to use a de-esser on the sends.

Compression and its role in creating depth

Can you use compression to help create depth? Not only can you, you must!

As we all know, the basic use of a compressor is to “stabilize” an instrument in the mix. It helps to keep the instrument at the same distance from the listener throughout the whole track. To put it another way, its relative volume stays stable. Of course we don’t always need it to be, but this is particularly important for instruments placed at the back of the sound stage, because otherwise they will not sound clear. Now, how can compression help us here? An unwritten rule says that gain reduction should not exceed 6 dB. This rule works, for instance, for solo vocals, where a bigger reduction can indeed “flatten” the sound. Yet this is not necessarily the case when it comes to backing vocals or, generally, instruments playing in the background. Sometimes these get reduced by 10 dB or even more. In a word, everything that is further away from the listener should be compressed more heavily. The results may surprise you!
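To put rough numbers on the gain-reduction figures above, here is a minimal sketch of the static curve of an idealised hard-knee compressor. The function name and the threshold and ratio values are made up for illustration; real compressors add attack, release and knee behaviour on top of this.

```python
def gain_reduction_db(input_db: float, threshold_db: float, ratio: float) -> float:
    """Gain reduction (as a positive dB figure) of an idealised hard-knee compressor."""
    if ratio < 1.0:
        raise ValueError("ratio must be 1.0 or greater")
    over = input_db - threshold_db
    if over <= 0.0:
        return 0.0  # below the threshold nothing is reduced
    # Above the threshold, `over` dB of input becomes over / ratio dB of output.
    return over - over / ratio


# Hypothetical settings: a lead vocal compressed gently, a backing vocal harder.
lead = gain_reduction_db(input_db=-12.0, threshold_db=-20.0, ratio=4.0)
backing = gain_reduction_db(input_db=-12.0, threshold_db=-28.0, ratio=4.0)
print(f"lead vocal:    {lead:.1f} dB of gain reduction")
print(f"backing vocal: {backing:.1f} dB of gain reduction")
```

With these assumed settings the lead vocal sits right at the 6 dB mark, while lowering the threshold for the backing vocal pushes its reduction to 12 dB.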

There is one more thing I advise you to pay attention to: the two basic detection modes, RMS and Peak. Peak mode “looks” at signal peaks and reduces the signal according to them. What sound does it give? In general, more squeezed, soft, sometimes even pumping. It’s useful when we want the instrument to pulse softly rather than dazzle the listener with its vivid dynamics. RMS mode makes the compressor act more like the human ear, not focusing on signal peaks that often have little to do with perceived loudness. This gives a more vibrant, dynamic and natural sound. It works best if our aim is to preserve the natural character of the source (which is often the case with vocals, for example). RMS mode gives a lively, more open sound, good for pushing things to the front of our sound stage.
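The difference between the two detection modes is easy to see on a test signal. The sketch below is not a compressor, only the two level measurements a detector could be based on; the signal (a quiet 100 Hz tone with one full-scale sample standing in for a transient) is invented for the demonstration.

```python
import math


def peak_db(samples):
    """Level of the single largest sample, in dB relative to full scale."""
    return 20.0 * math.log10(max(abs(s) for s in samples))


def rms_db(samples):
    """Root-mean-square level over the whole buffer, in dB relative to full scale."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(mean_square)


SAMPLE_RATE = 48_000
steady = [0.25 * math.sin(2 * math.pi * 100 * n / SAMPLE_RATE) for n in range(4800)]
clicky = list(steady)
clicky[100] = 1.0  # one full-scale sample standing in for a sharp transient

for name, signal in (("steady", steady), ("with click", clicky)):
    print(f"{name:>10}: peak {peak_db(signal):6.2f} dB, RMS {rms_db(signal):6.2f} dB")
```

The click raises the peak reading by about 12 dB while the RMS reading hardly moves, which is why a peak detector reacts to transients that an RMS detector largely ignores.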

The interesting fact is that built-in channel compressors in SSL consoles are instantly switchable between Peak and RMS modes. You can find something similar in the free TDR Feedback Compressor from Tokyo Dawn Records.


Another very popular effect is delay. It is, one might say, a very primitive form of reverb (as reverb is nothing more than a series of very quick reflections).

As you may remember from the earlier part of this article, I mentioned the pre-delay parameter in reverb. You can use it in pretty much the same way in a delay plugin to create a sense of depth in the mix. Shorter pre-delay times will make instruments sound further away from the listener; longer times will do the opposite. But you can of course use delay in many different ways. For instance, very short delay times with no feedback can also thicken and fatten the sound nicely. Try it!

The thing I like the most about delay is that it gives the mix a certain context of space. Music played in an anechoic chamber would sound really odd to us, because from birth we hear every sound in some context (and music is of course no different). No matter if you listen to a garage band, a concert at a stadium or in a club, the context of the place is essential to an appreciation of the space in which the music is playing.

Now, how to use all this knowledge in practice

And now I will show you how I use all of this information in practice, step-by-step.

1. The choice of reverbs.

As I said before, the first thing we have to consider is whether we aim for realism or not.

I always struggle when it comes to reverb. What are the best settings for which instrument or sample? Should I use a Hall or a Plate? Should I use it on an aux or as an insert? Should I EQ before or after the reverb? I don’t know why, but reverb seems to be the hardest thing for me to understand, and I wish it were not.

And then comes another big question: how much reverb should be applied to each track? All decisions made during the mixing process are based on what makes me feel good. One good piece of advice is to monitor your mix in mono while setting reverb levels. Usually, if I can hear a touch of reverb in mono, it will be about right in stereo. If I set it too light in stereo, the mix will sound too dry in mono. Also, concentrate on how close or distant the reverberated track sounds in the context of the mix, not on how soft or loud the reverb is (a different perspective).

2. Creating the aux tracks including different reverb types.

3. Organizing the tracks into different coloured groups.


At the top of the session I have a grey coloured group – these are the instruments that I want to have really close and more or less dry: kick, bass, hihats, snare, various percussion loops. I have Room reverb going on here, but it is to be felt, not heard.

Then I have the blue group. These are the “second front” instruments with Hall or Plate type reverb on them.

And then I have the background instruments, the back wall of my mix. Everything that is here is meant to be very distant: synth texture, vocal samples and occasional piano notes.

4. Pre-delays, rolling off the top, the bottom, 300 Hz and 4500 Hz.

My example configuration would look like this:

  • Room: 1/64 note or 1/128 note pre-delay, HPF rolling off from 200 Hz, LPF from 9 kHz
  • Plate: 1/32 note or 1/64 note pre-delay, HPF rolling off from 300 Hz, LPF from 7 kHz
  • Hall: no pre-delay, HPF rolling off from 350 Hz, LPF usually quite low, in the 4-5 kHz zone (remember that air absorbs high frequencies much more than it absorbs lower ones)
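If your reverb only accepts pre-delay in milliseconds, the note values above have to be converted using the song tempo. A minimal sketch, assuming the tempo is given in quarter-note beats per minute; the function name and the 120 BPM tempo are only examples.

```python
def note_ms(denominator: int, bpm: float) -> float:
    """Length in milliseconds of a 1/denominator note at the given tempo."""
    if denominator <= 0 or bpm <= 0:
        raise ValueError("denominator and bpm must be positive")
    whole_note_ms = 4 * 60_000.0 / bpm  # four quarter-note beats per whole note
    return whole_note_ms / denominator


bpm = 120.0  # example tempo
for denominator in (32, 64, 128):
    print(f"1/{denominator} note at {bpm:g} BPM = {note_ms(denominator, bpm):.3f} ms")
```

At 120 BPM this gives 62.5 ms, 31.25 ms and 15.625 ms respectively; a faster tempo shortens all three proportionally.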

5. Transients

Distance eats transients. It attenuates the direct sound, the first arrival of the initial transient, while the reverberation picks up and amplifies the steady, tonal part of the sound. A distant sound is much less transient-laden, far smoother, more legato, less staccato, less “bangy” and “crunchy” than a close-up sound. It is also harder to understand words at a distance. That’s why I often compress the longest reverb to flatten or get rid of transients. Set a fast attack if you want less of a transient at the start and the parts to be squashed more. I also use a transient designer (such as the freeware FLUX Bittersweet) and turn the knob anticlockwise to soften the attack a little.


  • Foreground: drums, percussion, bass and saxophone.
  • Mid-ground: piano, acoustic guitar.
  • Background: synth pad, female voice.


For a long time I had the tendency to put way too much reverb on everything. You know, I thought I would get a sense of depth and space this way, but I was so wrong… Now I know that if we want one track to sound distant, another must be very close. The same goes for volume and every other aspect of the mix: to make one track sound loud, others need to be soft, and so on.

There are some more sophisticated methods that I haven’t tried myself yet. Like a smart use of compression for instance. Michael Brauer once said: “I’m using a lot of different sounding compressors to give the record depth and to bring out the natural room reverbs of the instruments”.

Some people also get nice results by playing around with the Early Reflections parameter of a reverb. The closer a sound source is to boundaries or large reflective objects within an acoustic space, the stronger the early reflections become.

Contrast and moderation: I want you to leave with these two words, and I wish you all successful experimenting!

About the author: Piotr “JazzCat” Pacyna is a Poland-based producer who specializes in video game sound effects and music. He has scored a number of Java games for mobile phones and, most recently, iPhone/iPad platforms. You can license some of his tracks here.

Depth and space in the mix, Part 1


by Piotr Pacyna

“When some things are harder to hear and others very clearly, it gives you an idea of depth.” – Mouse On Mars

There are a few things that immediately allow one to distinguish between an amateur and a professional mix. One of them is depth. Depth relates to the perceived distance of each instrument from the listener. In amateur mixes there is often no depth at all; you can hardly say which instruments are in the foreground and which are in the background, simply because all of them seem to be the same distance from the listener. Everything is flat. Professional tracks, in turn, reveal attention to precisely positioning individual instruments in a virtual space: some of them appear close to the listener’s ear, whilst others hide more in the background.

For a long time people kept telling me that there was no space in my mixes. I was like: guys, what are you talking about, I use reverbs and delays, can’t you hear that?! However, they were right. At the time I had problems understanding the difference between using reverb and creating a space. Serious problems. The truth is that everyone uses reverbs and delays, but only the best are able to make the mix sound as if it were three-dimensional. The first dimension is of course panorama, that is, the left/right spread. The second one is the up/down spread, which is achieved by proper frequency distribution and EQ. The third dimension is depth. And this is what this text is going to be about.

There are three main elements that help to build the depth.

1. Volume level

The first, most obvious and pretty much self-explanatory element is the volume level of each instrument track. Its level relative to the others lets us judge the distance of the sound source. When a sound comes from a distance, its intensity is necessarily lower. It is widely accepted that every time you double the distance, the signal level drops by 6 dB. Similarly, the closer you get to the sound, the louder it appears.
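The 6 dB figure comes from the inverse distance law for a point source in a free field, where level falls by 20·log10 of the distance ratio. A small sketch of that formula follows; the function name is made up, and real rooms with reflections will deviate from these idealised numbers.

```python
import math


def level_change_db(near_m: float, far_m: float) -> float:
    """Level change in dB when a point source moves from near_m to far_m (free field)."""
    if near_m <= 0 or far_m <= 0:
        raise ValueError("distances must be positive")
    return -20.0 * math.log10(far_m / near_m)


for far in (2.0, 4.0, 8.0):
    print(f"1 m -> {far:g} m: {level_change_db(1.0, far):+.1f} dB")
```

Each doubling of distance costs roughly another 6 dB, so in this idealised model pulling a fader down by 6 dB corresponds to placing the source about twice as far away.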

It is a very important issue that often gets forgotten…

2. Time the reflected signal needs to reach our ears

The second is the time taken by the reflected signal to reach our ears. As you all know, in every room we hear a direct signal and one or more reflected signals. If the time between these two signals is less than 25-30 ms, the two are heard as one sound, and the first part of the signal gives us a clue as to the direction of the sound source. If this difference increases to about 35 ms or more, the second part of the signal gets recognized by our ears (and brain) as a separate echo.

So, how to use it in practice?

Because pan knobs only move things from left to right, it’s easy to fall into the trap of habitually setting everything in the same dull and obvious way (drums here, piano there, the keys here…), as if the music were played in a straight line from one side to the other. And we all know that is not the case. When we are at a concert we are able to hear a certain depth, a “multidimensionality”, quite clearly. It is not hard for us to say, even without looking at the stage, that the drummer is located at the back, the guitarist slightly closer and to the left, and the singer in the middle, at the front.

And although the relative loudness of the instruments is of great importance for creating a realistic scene, it’s the time the signal needs to arrive at our ears that really matters here. These very tiny delays between certain elements of the mix get translated by our brain into meaningful information about the position of a sound in space. As we know, sound travels at roughly 30 cm per millisecond (a little more in practice, about 34 cm at room temperature). So if we assume that in the case of our band the snare drum is positioned 1.5 m behind the guitar amps, then the snare sound reaches us about 5 ms later than the signal from the amplifier.
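The arithmetic behind the snare example is simply distance divided by the speed of sound. A short sketch, assuming 343 m/s (air at about 20 °C) as the default and a 48 kHz session for the sample count; both values and the function names are illustrative choices.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumption: dry air at about 20 degrees C


def arrival_delay_ms(extra_distance_m: float, speed: float = SPEED_OF_SOUND_M_PER_S) -> float:
    """Extra travel time in ms for a source placed extra_distance_m further away."""
    return 1000.0 * extra_distance_m / speed


def delay_in_samples(extra_distance_m: float, sample_rate_hz: int = 48_000) -> int:
    """The same delay rounded to whole samples, e.g. for nudging a track in a DAW."""
    return round(arrival_delay_ms(extra_distance_m) * sample_rate_hz / 1000.0)


print(f"rule of thumb (300 m/s): {arrival_delay_ms(1.5, speed=300.0):.2f} ms")
print(f"at 343 m/s:              {arrival_delay_ms(1.5):.2f} ms")
print(f"in samples at 48 kHz:    {delay_in_samples(1.5)}")
```

The rounded 30 cm per millisecond figure gives exactly 5 ms for 1.5 m; the more precise speed gives a little under 4.4 ms, which is close enough for placing instruments by ear.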

Let’s say that we want to make the drums sound as if they were standing at the back of the stage, near the back wall. How to do that? When setting reverb parameters, remember to pay attention to pre-delay. This parameter adds a short delay between the direct signal and the reflected signal. It separates the two signals, so we can control how long after the direct sound we hear the reflections. It’s an extremely powerful tool for creating a scene. A shorter pre-delay means that the reflected signal will be heard almost immediately after the direct signal; the two will hit our ears almost at the same time. A longer pre-delay, however, moves the source away from the reflective surface (in this case the rear wall). If we set a short pre-delay of a few ms for the snare, a longer one for the guitar and an even longer one for the vocals, it is fairly easy to hear the differences. Vocals with a long pre-delay sound a lot closer than the snare drum.
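Why a short pre-delay suggests a source near the back wall can be seen from a very simplified geometric model. The sketch assumes listener, source and wall all lie on one line, with the source between the listener and the wall, and considers only the first reflection; the function name and distances are invented for illustration.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumption: air at about 20 degrees C


def wall_reflection_gap_ms(distance_to_wall_m: float,
                           speed: float = SPEED_OF_SOUND_M_PER_S) -> float:
    """Gap in ms between the direct sound and the first back-wall reflection.

    In this one-line model the reflection travels from the source to the wall
    and back past the source, i.e. an extra 2 * distance_to_wall_m compared
    with the direct sound, regardless of where the listener stands.
    """
    if distance_to_wall_m < 0:
        raise ValueError("distance must not be negative")
    return 1000.0 * 2.0 * distance_to_wall_m / speed


for name, metres in (("snare, by the wall", 0.5), ("guitar amp", 2.0), ("vocal, front of stage", 5.0)):
    print(f"{name:<22} {wall_reflection_gap_ms(metres):5.1f} ms")
```

The further the source stands from the wall, the longer the gap before its reflection arrives, which is the cue a longer pre-delay imitates.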

We can also play with pre-delay when we want to get a real, natural piano sound. Let’s say we place the piano on the left side of our imaginary stage. When sending it to a stereo reverb, let’s try setting a shorter pre-delay for the left channel of the reverb, because in reality the signal would bounce back from the left side of the stage (from the side wall) first.


First we have a dry signal. Then we are in a (quite big) room, close to the drummer. And then we are in the same room again, but this time the drummer is located by the wall, far from us.

3. High frequency content

The third is the high-frequency content of the signal. Imagine that you are walking towards an open-air concert or a pub with live music. Which frequencies do you hear most of all? The lowest, of course. The closer we get to the source of the music, the less dominant the bass is. From this we can conclude that the fewer high frequencies we hear, the further away the sound source seems. Hence a fairly common practice for moving an instrument to the background is to gently roll off the high frequencies with an LPF (low-pass filter), rather than boosting the bass.

I often like to additionally filter the returns of reverbs or delays – the reflections seem to be more distant this way, deepening the mix even more.
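To illustrate what such filtering does to a return, here is a minimal one-pole low-pass filter. It is a generic textbook filter with a gentle 6 dB per octave slope, not the filter of any particular plugin; the 5 kHz cutoff and the two test tones are arbitrary example values.

```python
import math


def one_pole_lowpass(samples, cutoff_hz: float, sample_rate_hz: int = 48_000):
    """Gentle (6 dB/octave) low-pass: each output leans on the previous output."""
    a = math.exp(-2.0 * math.pi * cutoff_hz / sample_rate_hz)
    y = 0.0
    out = []
    for x in samples:
        y = (1.0 - a) * x + a * y
        out.append(y)
    return out


def steady_peak(samples):
    """Largest absolute value in the last 1000 samples, after the filter has settled."""
    return max(abs(s) for s in samples[-1000:])


def tone(freq_hz: float, length: int = 4800, sample_rate_hz: int = 48_000):
    return [math.sin(2.0 * math.pi * freq_hz * n / sample_rate_hz) for n in range(length)]


for freq in (200.0, 10_000.0):
    filtered = one_pole_lowpass(tone(freq), cutoff_hz=5_000.0)
    print(f"{freq:>7g} Hz tone: peak after filtering = {steady_peak(filtered):.2f}")
```

The 200 Hz tone passes almost untouched, while the 10 kHz tone comes out at roughly half its original amplitude, which is the kind of dulling that pushes a reverb tail further back.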

Speaking of frequency bands, we should also pay attention to the area around 4-5 kHz. Boosting it can bring the signal closer to the listener. Rolling it off will of course have the opposite effect.

“It is totally important when producing music that certain elements are a bit more in the background, a bit disguised. It is easiest to do that with a reverb or something similar. In doing that, other elements are more in focus. When everything is dry, in the foreground, it all has the same weight. When some things are harder to hear and others very clearly, it gives you a sense of depth. And you can vary that. That is what makes producing music interesting for us. To create this depth, images or spaces repeatedly. Where, when hearing it, you think wow, that is opening up so much and the next moment it is so close again. And sometimes both at the same time. It is like watching… you get the feeling you need to readjust the lens. What is foreground and background, what is the melody, what is the rhythm, what is noise and what is pleasant. And we try to juxtapose that over and over again.” (Mouse on Mars)

Problematic Band

All modern pop music has one thing in common: it is recorded at close range, using directional microphones. Yes, you’re right, that is typical near-field recording. This is how most instruments are recorded, even those you don’t normally put your ears to: bass drum, toms, snare, hihat, piano (does anyone put their head inside the piano to listen to music?), trumpet, vocals… And yet the listeners, and even the musicians playing these instruments, hear them from a certain distance. That’s the first thing. Second, the majority of studio and stage microphones are cardioid directional close-up mics. Okay, these two things are quite obvious, but you may be wondering what the result is. It turns out that we record everything with the proximity effect printed on the tracks! Literally everything. In short, the idea is that directional microphones pick up a lot of the wanted sound and are much less sensitive to background noise, so the microphone must handle loud sounds without misbehaving, but doesn’t need exceptional sensitivity or very low self-noise. If the microphone gets very close to the sound source (within ten or so microphone diameters) there is a tendency to over-emphasise bass frequencies, but between this limit and a maximum of about 100 cm, frequency response is usually excellent.

Tracks with the proximity effect printed on them sound anything but natural.

Everyone has got used to it, and even to the musicians their instruments sound okay when recorded from a close distance. What does this mean? That all of our music has a redundant frequency hump around 300 Hz. Some say it is closer to 250 Hz, others 400 Hz, but it is more or less there, and it can be concluded that almost every mix would benefit from taking a few dB off the low-mids (with a rather broad Q).

Rolling off these frequencies will make the track sound more “real” in some way, and it is also a common move on the mix bus. The mix gets cleaned up immediately, loses its muddiness and, despite the lower volume level, sounds louder. The low-mids appear not to contain any important musical information.

And this problem affects not only music recorded live. Samples and sounds are all produced the way customers want them, which means they are “compatible” with the sound of the microphone. So it is worth getting familiar with this issue even if you produce only electronic music.

The bottom line is: if you want to move the instrument to the back – roll off the freq’s around 300 Hz. If you want to get it closer, simply add some extra energy in this range.
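For readers who want to see what “a few dB off around 300 Hz with a broad Q” looks like as a filter, here is a sketch of a standard peaking EQ biquad using the widely published Audio EQ Cookbook formulas. The -3 dB gain, the Q of 0.7 and the 48 kHz sample rate are assumed example values, not settings taken from the article.

```python
import cmath
import math


def peaking_eq_coefficients(center_hz, gain_db, q, sample_rate_hz=48_000):
    """Biquad coefficients (b, a) for a peaking EQ, Audio EQ Cookbook form."""
    amp = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * center_hz / sample_rate_hz
    alpha = math.sin(w0) / (2.0 * q)
    b = (1.0 + alpha * amp, -2.0 * math.cos(w0), 1.0 - alpha * amp)
    a = (1.0 + alpha / amp, -2.0 * math.cos(w0), 1.0 - alpha / amp)
    return b, a


def response_db(b, a, freq_hz, sample_rate_hz=48_000):
    """Magnitude response of the biquad at freq_hz, in dB."""
    z1 = cmath.exp(-1j * 2.0 * math.pi * freq_hz / sample_rate_hz)  # z^-1
    numerator = b[0] + b[1] * z1 + b[2] * z1 * z1
    denominator = a[0] + a[1] * z1 + a[2] * z1 * z1
    return 20.0 * math.log10(abs(numerator / denominator))


# Assumed example: a broad 3 dB cut centred on 300 Hz.
b, a = peaking_eq_coefficients(center_hz=300.0, gain_db=-3.0, q=0.7)
for freq in (50, 150, 300, 600, 2_000, 10_000):
    print(f"{freq:>6} Hz: {response_db(b, a, freq):+.2f} dB")
```

The cut is deepest at 300 Hz and fades out on either side, so the low end and the top of the spectrum are left essentially untouched. Using a positive gain_db instead gives the boost that brings an instrument closer.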


Cleaning up noisy dialogue: Get rid of background noise and improve sound quality of voice recordings


Tools and techniques for removing unwanted noise from vocal recordings

by Richie Nieto

One of the biggest differences between film and documentary sound versus animation and video game sound is that, usually, in films and documentaries, the recording environments are not fully controlled and often chaotic. When shooting a scene in the middle of a busy street intersection, for instance, the recorded audio will contain much more than just the voice of the subjects being filmed. Even in a closed filming environment in a “quiet” set, there is a lot of ambient noise that will end up in the dialogue tracks.

Traditionally, in the film world, dialogue lines that are unusable due to poor sound quality are replaced in a recording studio, in a process called ADR (Automated Dialogue Replacement). Documentaries don’t have the same luxury, as interview answers are not scripted and the interviewed subjects are not actors (and wouldn’t be able to easily duplicate their own previously spoken words accurately in the studio). There are also issues of budget limitations: ADR is expensive, and most small-budget productions can’t afford to replace every line that needs to be replaced.

So, the next best option is to clean up what is already recorded, as best as we can. I’ll explain some of the tips and techniques to ensure that you get the most out of the material you have. As a brief disclaimer, keep in mind that some of these techniques are divided up between dialogue editors and re-recording mixers on most professional-level projects, so if you’re not mixing, make sure to consult with your mixer before doing any kind of processing.

As an example of bad-sounding dialogue, we have the following clip:


This clip has a number of problems (aside from the poor performance by yours truly). There is hum and hiss in the background and, due to improper microphone technique, there are loud pops from air hitting the diaphragm too hard, and the voice sounds very boomy. This would be an immediate candidate for ADR. However, we will assume the budget doesn’t allow for it to be replaced, or the actor is not available. I’ve had situations where the actor just doesn’t want to come into the studio to do ADR, even though it’s in their contract, and no amount of legal threats will convince them otherwise, so the only course of action in those cases has been to make the bad-sounding lines good enough to pass a network’s quality control.

Okay, so let’s get to it. The first step is to filter out some of the boominess with an EQ plug-in. For this example, I’m just using one of the stock plug-ins in Pro Tools. All of the processing here is file-based, as opposed to real-time processing, mostly to be able to show how each step affects the clip.

By listening and a bit of experimenting, we can hear that there is a lot of bass around 100 Hz in our audio file. Here’s how the clip sounds after removing some of the offending low frequency content:


Next, we’ll use a noise reduction plug-in to get rid of some of the constant background noise. There are plenty of other options in the market, but I’ll use Waves’ X-Noise for this example. The trick here is to not go overboard; if you start to hear the voice breaking down and getting “phasey”-sounding, you need to pull back on the Threshold and the Reduction parameters. You won’t get rid of all the noise with this step, but I find it yields better results to use moderate amounts of processing in different stages instead of trying to cure the problem by using a single tool.

After having the plug-in “learn” the noise and then adjusting the Threshold, Reduction, Attack and Release parameters, we process the clip, which now will sound like this:


There is still a fair bit of background noise in there, so now we’re going to use a multiband compressor/expander to deal with it. In this particular case I’ll use Waves’ C4, but, once again, there are many equivalent plug-ins to choose from. I am just very familiar with the C4 and how it behaves with different kinds of sounds.

We need to set the parameters for expansion, which does the exact opposite of compression: it makes quiet things quieter. That’s why we apply it after the noise reduction plug-in, so that the noise level is much lower than the voice when it goes through the expander. A normal single-band expander will not work as well because the noise lives in different areas of the frequency spectrum, and those need to be addressed independently with different amounts of expansion.
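As a rough illustration of what per-band downward expansion does to the leftover noise, the sketch below computes the static gain an idealised expander would apply in each band. It does not model the C4 or any other plug-in; the band names, noise levels, thresholds and ratios are all hypothetical.

```python
def expander_gain_db(input_db: float, threshold_db: float, ratio: float) -> float:
    """Gain in dB applied by an idealised downward expander (0 or negative)."""
    if ratio < 1.0:
        raise ValueError("ratio must be 1.0 or greater")
    under = threshold_db - input_db
    if under <= 0.0:
        return 0.0  # at or above the threshold the signal is left alone
    # Every dB below the threshold becomes `ratio` dB below it at the output.
    return -under * (ratio - 1.0)


# Hypothetical per-band settings: (band, noise level dB, threshold dB, ratio)
bands = [
    ("low (hum)", -48.0, -40.0, 3.0),
    ("mid", -55.0, -50.0, 1.5),
    ("high (hiss)", -52.0, -45.0, 2.0),
]
for name, noise_db, threshold_db, ratio in bands:
    gain = expander_gain_db(noise_db, threshold_db, ratio)
    print(f"{name:<12} noise at {noise_db:.0f} dB -> {noise_db + gain:.1f} dB ({gain:+.1f} dB)")

# Speech well above every threshold passes through unchanged.
print(expander_gain_db(-20.0, -40.0, 3.0))
```

Because each band has its own threshold and ratio, the hum can be pushed down hard while the mid band, where most of the voice lives, is treated far more gently.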

Now there is a vast improvement in the noise level of the clip, as we can hear:


Okay! The following step is to tackle the pops caused by the microphone’s diaphragm being slammed hard by the air coming out of my mouth. Obviously, this is a problem caused by bad planning, and it is replicated here to illustrate a very common mistake in recording voiceovers. In this case, the most offending pops are in the words “demonstrate” and

A solution that has worked really well for me many times is actually very simple. It involves three quick steps. First, select the part of the clip that contains the pop, and be sure to include a good portion of the adjacent audio before and after the pop in the selection.

Second, use an EQ to filter out most of the low end of that selection. This will automatically create a new region in the middle of the original one.

Third, crossfade the resulting regions to eliminate any clicks and to smooth the transitions. You will need to experiment with the proper positions and lengths of the crossfades, and with the exact frequency and amount of low-end content to be removed, based on the severity of the pop.
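What a crossfade does at each region boundary can be sketched in a few lines. This is a plain linear crossfade on lists of samples, offered only as an illustration of the idea; a DAW offers other fade shapes, and the function name and tiny test clips are made up.

```python
def linear_crossfade(outgoing, incoming, fade_len: int):
    """Join two clips, overlapping the end of `outgoing` with the start of `incoming`."""
    if not 0 < fade_len <= min(len(outgoing), len(incoming)):
        raise ValueError("fade_len must be positive and fit inside both clips")
    overlap_start = len(outgoing) - fade_len
    mixed = []
    for i in range(fade_len):
        t = (i + 1) / (fade_len + 1)  # rises from just above 0 to just below 1
        mixed.append((1.0 - t) * outgoing[overlap_start + i] + t * incoming[i])
    return list(outgoing[:overlap_start]) + mixed + list(incoming[fade_len:])


# Toy clips: the untouched region (all ones) fading into the filtered one (all zeros).
original = [1.0] * 6
filtered = [0.0] * 6
print(linear_crossfade(original, filtered, fade_len=3))
# prints [1.0, 1.0, 1.0, 0.75, 0.5, 0.25, 0.0, 0.0, 0.0]
```

For the pop fix there are two such joins, one going into the filtered region and one coming back out. A longer fade_len makes the change in low end less noticeable but lets more of the pop through at the edges.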

And now, without those loud bumps, our clip sounds like this:


Finally, I do a second pass through the multiband expander to remove the rest of the noise, and some EQ tweaks to restore some of the brightness lost in the process.


If you compare the first version of our audio file to this last one, you’ll hear the huge difference in sound quality that is achieved by using several different steps and combining tools and tricks. As you know, there are always better and more affordable software applications being created for dealing with noisy audio, and some of the newer ones are able to cover several of the stages that I’ve described here. Others, which use spectral analysis algorithms, can even isolate and eliminate incidental background noises that happen at the same time as the dialogue, like a glass clinking or a dog barking. So the game is constantly changing.

In closing, hopefully this article will serve as a guide on how to tackle some problems with audio material that, for any reason, can’t be recorded again. It’s by no means a definitive approach to eliminating noise, since the number of variables and tools out there is staggering. So experiment, and have fun!

About the author: Richie Nieto has been a professional composer and sound designer since the early nineties. He has been involved with projects for DreamWorks, Lucasfilm, Dimension Films, Sony Pictures, HBO, VH1, FOX Sports, Sony Music, BMG, EA, THQ, Harmonix and many more of the biggest companies in the entertainment industry. His work can be heard on many commercially released CDs, feature films, documentaries, video games and over 30 television series for the U.S. and Canada. Recently, Richie has composed music and/or designed sounds for projects like EA’s “Nerf N-Strike”, “Nerf 2: N-Strike Elite”, “Littlest Pet Shop”, “Littlest Pet Shop Friends”, and THQ/Marvel’s “Marvel Super Hero Squad”. He also finished work on Ubisoft’s “James Cameron’s AVATAR: The Game” and is currently a contract composer and sound designer for EA. Visit Richie’s website at