VO Mastering In Pro Tools

Recording a great take is only part of the journey. Turning in a competitive-sounding audition creates a sense of quality and keeps you in contention against auditions that are often recorded at better studios.

The key to great sound in Pro Tools actually comes from external choices you make before even opening the program. The ‘signal path’ (the journey audio takes from your microphone to your computer) makes a difference in the quality of recorded sound, but most important is the room in which you place the mic. There are many tools to help you get your room right: blankets, closets, baffles, absorbers, diffusers and porta-booths. Each offers a different solution, with varying degrees of success. In the voice over world, a dry (dead) sound is what we’re after, so eliminating room echoes, reverb and reflections is important when attempting to achieve that rich, present and clean sound.

Second to the room in importance in achieving good sound is the hardware you choose. First in this equation is the microphone. A quality large-diaphragm condenser microphone is best, often of the cardioid pattern type. Beyond mic choice is the audio interface (sound card), the unit that converts the microphone’s voltage to binary code and gets sound into and out of the computer. A nice clean signal chain, a quiet room and a practiced performance are what’s expected of the professional voice artist working from home.

So, now that we’ve got these variables nailed down, what can we do to further the quality of our audio? We can master it!

Luckily, Pro Tools comes with a variety of tools to help you achieve great sound. Beyond its editing capabilities, the software package includes some great-sounding plugins as well! It’s these software processors that will be the focus of this tutorial.

So without further explanation, let’s get started.

First off, thanks to Victor Huzvar for providing the audio sample and his voice. Here’s the original.

Through this tutorial we’ll be processing audio through a string of plugins, one after the other. The effect of this is cumulative.

We’ll be working in this order:

  1. Record Audio
  2. Trim Audio
  3. Delete breaths (offending ones)
  4. Consolidate Audio
  5. Equalize
  6. Normalize
  7. Expand
  8. Limit
  9. Rename and Export

The process above is involved, though after the tutorial I’ll show you how to automate some of the steps. Remember too that repetition makes you faster.

STEP 1 [Record Audio]

Record your audio as you normally would. Usually this is 1 region (clip) per read. Make sure in this step that you’ve set your microphone’s preamp gain to a healthy level: far enough above the noise floor (ambient room noise), but not so high that you’re clipping. The greater the difference between noise floor and signal, the better.

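If you’d like to check your noise floor, record about 10 seconds while you’re absolutely silent. A noise floor around -45 dBFS is ok, -50 is good, -60 is great and even lower is professional.

dBFS refers to Decibels relative to Full Scale and is counted in negative numbers down from 0 dBFS, the loudest level a digital system can represent. It’s easy to confuse with dB SPL (Decibel Sound Pressure Level), which measures how loud a sound is in the air rather than in the computer.

For comparison, here’s the definition of SPL from Wikipedia.org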

Sound pressure level (SPL) or sound level is a logarithmic measure of the effective sound pressure of a sound relative to a reference value. It is measured in decibels (dB) above a standard reference level. The commonly used “zero” reference sound pressure in air is 20 µPa RMS, which is usually considered the threshold of human hearing (at 1 kHz).

To check your noise floor in Pro Tools, record about 10 seconds of complete silence. First make sure to set your preamp to the level you’d use for a normal VO job, then hit record and be quiet. When finished, head up to the AUDIOSUITE menu and choose OTHER and then the GAIN plugin.

Choose Gain Plugin
Audiosuite -> Other -> Gain

With the GAIN plugin instantiated, highlight a small section of audio where you’re completely quiet and choose ANALYZE, making sure that PEAK is highlighted as the measurement option. Sometimes, if the reading looks off (say -5 dB or some other implausibly high reading), I’ll take three slices of audio and average them: (Sample 1 + Sample 2 + Sample 3) divided by the number of samples taken (3).

Gain Plugin
Highlight small section of silent audio, make sure PEAK is set and hit Analyze.
My noise floor here is -36.8 dBFS
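As a rough illustration of what a PEAK reading means, and of the three-slice averaging trick, here’s a minimal Python sketch. It assumes the samples are floating-point values between -1.0 and 1.0 (where 1.0 is 0 dBFS); the function names and the sample values are made up for illustration and are not part of Pro Tools.

```python
import math

def peak_dbfs(samples):
    """Peak level of a block of samples in dBFS (assumes full scale = 1.0)."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return float("-inf")  # digital silence has no finite dB value
    return 20 * math.log10(peak)

def average_readings(readings_db):
    """Plain arithmetic mean of several dB readings."""
    return sum(readings_db) / len(readings_db)

# Three hypothetical slices of "silence" (values invented for this example)
slice_1 = [0.010, -0.008, 0.005]
slice_2 = [0.012, -0.011, 0.004]
slice_3 = [0.009, -0.010, 0.006]

readings = [peak_dbfs(s) for s in (slice_1, slice_2, slice_3)]
noise_floor = average_readings(readings)
print(round(noise_floor, 1))
```

Averaging the dB numbers directly mirrors the quick by-hand method; it’s a convenience, not a rigorous measurement.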

STEP 2 [Trim Audio]

Now that we’ve recorded a take, let’s trim the audio (called Topping and Tailing). In our cursor menu let’s select the TRIM tool.

Trim Tool

With the TRIM tool selected, let’s place it near the beginning of the audio region, then click and drag it to the beginning of our voice waveform.

Start at beginning of region, click and drag till the start of the VO read

Great, now we’ve topped our recording. Let’s do the same for the end of the region as well (Tailing). Once finished we’re left with a nice region that contains only our audio. TIP: make sure you don’t trim too close to the end of the voice waveform; there’s a natural decay to the voice and we don’t want to lose that. Also, the TRIM tool is non-destructive, meaning until we process the region, we can TRIM back the audio if we’d like.


If you’d like to bypass the TRIM tool, you can use a few keyboard shortcuts to achieve the same effect. Place your cursor near the beginning of the audio waveform and hit the ‘A’ key. This trims away all audio in the region up to the cursor. To handle the other side of the region, we use the ‘S’ key, which trims away all audio from the cursor to the end of the region.

STEP 3 [Delete Breaths]

The next step is to go through the audio and pull out egregious breaths. Let’s face it, we all breathe. Often I hear people remark on how noticeable their breathing is, but this usually is a symptom of the ‘new-factor’ of hearing themselves played back through a microphone. Our ears are great at tuning out things like breathing, AC, hum and constant low-level noise, so it can be striking when we hear just how loud we can be. Well, breathing makes us normal (and it’s a good indicator that we’re alive). We usually leave in the breaths that sound normal but delete the large ‘I-need-air-cause-this-is-a-run-on-sentence’ type of gasps.

Getting rid of breaths is fairly easy. With the cursor tool selected, zoom in on the region using the keyboard shortcuts [‘R’ = zoom out / ‘T’ = zoom in] and let’s focus on the sections between the lines.

Focus in (zoom in) to the offending areas

Sometimes zooming in just doesn’t get us close enough to clearly examine the start and end points of the VO waveform. So, to aid us, we can zoom vertically using the audio zoom button at the top left of the window. Clicking the up arrow zooms in vertically; likewise, the down arrow zooms back out.

The audio zoom tool

So, with a few clicks of the up arrow, we’re able to now get in and accurately find and delete breaths.

Let’s select the breath and delete it.

It’s important to point out here that the ends of words decay naturally in volume, so be careful to leave just a bit after the waveform when deleting audio. Conversely, words usually begin abruptly, with a steep ramp in the volume envelope. So, deleting closer to the start of a word is OK here.

Delete the breath. Leave a little at the end of words, closer to the beginning when cutting.

With the breaths deleted, the audio is now in separate regions.

Some of you might have noticed I left the beginning of the audio, the room tone, intact. I actually just pulled the region out. Remember we’re working non-destructively up until now, so it’s as easy as clicking on the TRIM tool and pulling out the start of the take. I do this because in a future step we’ll need to get a reading on our noise floor.

Now that I’ve deleted breaths, let’s SHIFT + CLICK all the regions, from the first to the last. Make sure you’ve selected the GRABBER tool so that when we select the regions, we’re actually selecting all the audio. The main SELECTOR tool will allow us to grab up until where we put the cursor, not the entire region; the GRABBER tool is a better choice here.

We use the GRABBER tool to select — and — move regions

With all of our audio highlighted, let’s perform a batch fade. Doing this will clean up some of the cuts we’ve made. Use the shortcut COMMAND + F.

A new dialogue box will pop up with options for the batch fade. The only thing we’ll be changing here is the length; let’s choose 35 ms for the length of each fade.

The BATCH FADE window. Choose 35ms here for length.

As you can now see, Pro Tools has batch faded the beginnings and ends of the regions.

Batch Fades
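To picture what a fade does to the cuts, here’s a minimal Python sketch that applies a 35 ms fade-in and fade-out to one clip. It assumes a simple linear fade shape and a list of float samples; the shape Pro Tools actually uses depends on the batch fade settings, and the function name is hypothetical.

```python
def apply_fades(samples, sample_rate, fade_ms=35.0):
    """Return a copy of the clip with a linear fade-in and fade-out."""
    n = int(sample_rate * fade_ms / 1000)  # fade length in samples
    n = min(n, len(samples) // 2)          # keep the two fades from overlapping
    out = list(samples)
    for i in range(n):
        gain = i / n                       # ramps from 0.0 up toward 1.0
        out[i] *= gain                     # fade in from the start
        out[-1 - i] *= gain                # fade out toward the end
    return out

# A made-up 0.1 second clip of constant level at 44.1 kHz
clip = [1.0] * 4410
faded = apply_fades(clip, 44100)
```

Because each clip now starts and ends at zero, the abrupt edges left by deleting breaths no longer click.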

STEP 4 [Consolidate Audio]

Now that we’ve faded our audio, let’s consolidate all those separate regions into one. Up in the menu bar under EDIT, click on the option CONSOLIDATE CLIP (Region for Pro Tools 9).

Consolidate Audio

Quick Tip:

We could have also used the keyboard shortcut SHIFT + OPTION + Numeric 3. Mind you, the audio already needs to be selected.

Consolidated Audio

STEP 5 [Equalize]

It’s always a good idea to get rid of low end noise. Frequencies below a certain point really aren’t useful for our purposes. This already might have been taken care of by a low-cut filter on the mic or audio interface, but it’s always a smart decision to manually get rid of them anyway. We can also push the frequency cutoff up a little higher, cleaning up some muddiness and room rumble. To aid us in this task we’ll use the included (and great) EQ3 1-Band plugin.

With your audio selected, let’s head up to the AUDIOSUITE menu and drill down to EQ and then EQ3 1-Band

EQ3 1-Band
Audiosuite -> EQ -> EQ3 1-Band

Great. With that plugin instantiated, a good place to start is 90 Hertz using the FREQUENCY control and the HIGH-PASS option. Leaving the slope at 12 dB per octave is ok, and so is leaving the input at 0.

Frequency at 90 Hz and High-Pass filter style enabled

If you’d like, with headphones on, you can raise the Hertz (Frequency) cutoff level up till you begin to hear loss of low end, dropping it back until you don’t. Sometimes 120 Hz will work, sometimes more. Remember, you’re equalizing these frequencies out of the audio and if you’re concerned about heft (bass), then don’t raise the bar too high.

Cutoff Frequency at 120 Hz

Let’s hit process (render) and complete this step. You’ve now successfully low-cut your audio and removed low end unwanted frequencies. This aids in clarity and punch, leaving room for the top end to breathe.
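For the curious, a 12 dB per octave high-pass filter can be sketched in Python as a second-order (“biquad”) filter. This is only an illustration under stated assumptions: it uses the widely published Audio EQ Cookbook high-pass formulas with a Butterworth Q, which may not match how EQ3 is built internally, and the function names are hypothetical.

```python
import math

def highpass_coefficients(cutoff_hz, sample_rate, q=1 / math.sqrt(2)):
    """Second-order high-pass coefficients (Audio EQ Cookbook form)."""
    w0 = 2 * math.pi * cutoff_hz / sample_rate
    alpha = math.sin(w0) / (2 * q)
    cos_w0 = math.cos(w0)
    a0 = 1 + alpha
    # Normalize everything by a0 so the filter loop needs no division
    b = [(1 + cos_w0) / 2 / a0, -(1 + cos_w0) / a0, (1 + cos_w0) / 2 / a0]
    a = [1.0, -2 * cos_w0 / a0, (1 - alpha) / a0]
    return b, a

def highpass(samples, cutoff_hz, sample_rate):
    """Filter a list of float samples, attenuating content below the cutoff."""
    b, a = highpass_coefficients(cutoff_hz, sample_rate)
    x1 = x2 = y1 = y2 = 0.0  # previous two inputs and outputs
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out
```

Calling `highpass(samples, 90, 44100)` leaves a 1 kHz tone essentially untouched while cutting a 30 Hz rumble to roughly a tenth of its level, which is the behavior we’re after in this step.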

STEP 6 [Normalize]

This is a fairly simple step. Effectively, normalizing audio raises the overall volume to a threshold the user sets. When you run this process, the computer goes through all the samples of the waveform and finds the highest point, then moves that point to the threshold level you set, moving all the other audio along with it in relation to that peak. Contrary to what some people have said, normalizing does NOT ruin your audio, nor does it compress it. Better yet, by choosing this type of plugin over a gain or amplify variant, you may push the audio to 0 dBFS without fear of over-modulating (peaking) above 0 dBFS.

The question of where to set the threshold for this plugin always comes up. Convention says we normalize to -3 dBFS. I agree in many situations; however, if this audio is destined to be an audition, normalizing to -1 dBFS is quite alright.


Why? Well, the playback environment on the other side of the email you’re sending isn’t known. Likely, the speakers used are small, hooked up poorly or, worse yet, built into a laptop. In this case, louder is always better. We’re concerned with cutting through the noise, figuratively and literally. Because the Normalize plugin can’t over-modulate the audio, -1 dBFS is fine in this case. Also, no further mastering is being done to this audio. I also prefer to normalize to -1 due to the later creation of an MP3 file. In some cases, creating an MP3 file increases the volume just a tad and we wouldn’t want anything to head into the zone of clipping, so to be safe here, -1 is a great place to hit.
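Peak normalization itself is simple arithmetic, and a minimal Python sketch makes that plain. It assumes float samples where 1.0 equals 0 dBFS; the function name is hypothetical and this is not the plugin’s actual code.

```python
def normalize(samples, target_dbfs=-1.0):
    """Scale the whole clip so its highest peak lands on target_dbfs."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)                  # silence stays silence
    target_linear = 10 ** (target_dbfs / 20)  # -1 dBFS is about 0.891
    gain = target_linear / peak               # one gain value for every sample
    return [s * gain for s in samples]

# Made-up samples: the loudest peak is 0.25
quiet_take = [0.1, -0.25, 0.2]
louder_take = normalize(quiet_take, target_dbfs=-1.0)
```

Every sample is multiplied by the same gain, so the relationship between loud and quiet parts is untouched. That’s why normalizing doesn’t compress the audio.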

When is it NOT ok to Normalize, or Normalize to 0 dBFS?

When the client specifically asks you not to, when the job requires untouched audio or, more often, when it’s going to an engineer for further processing or mastering. In these cases, leaving it all alone is perfectly fine.

So, without further explanation, let’s go grab that plugin and Normalize our audio. Head up to AUDIOSUITE, then OTHER and finally select NORMALIZE.

Normalize Plugin

Here’s our plugin interface. Notice the threshold slider; likely it’s already set at 0 dBFS. We can lower the level but, importantly, we can’t go above 0 dBFS, which is why we can’t over-modulate our audio.

Normalize Plugin

And, here’s our audio after the process.

After Normalize. Notice the name change.

Notice, by doing this, we’ve changed the name of the audio region; it’s now appended with NORM, telling us it’s been normalized. Also importantly, 2 steps ago we consolidated our audio. It was at that point that we lost the ability to go back and non-destructively pull out audio with the TRIM tool. We’re now left with a single region.

STEP 7 [Expand]

We now have a nice single region, free of room rumble and low-end unwanted noise. We’ve also raised our levels with the NORMALIZE plugin and consolidated our audio. Now we need to treat the audio by reducing the noise floor.

When audio people refer to noise floor, they mean the ambient noise inherent in the room the recording was made in. This is air, hum, AC, birds, cars, refrigerators, neighbors and system noise. It can be distracting at the very least, and unprofessional when loud. We have up our sleeve an included plugin to help push that noise floor farther away from our audio than it is now. That process is called EXPANSION.

In effect, an expander uses a threshold, much like the Normalize plugin. But where the Normalize plugin RAISES the level TO the threshold, an expander LOWERS audio that falls BELOW the threshold. If we set our expander threshold just above our noise floor, the plugin will reduce the volume of the take whenever it falls under the threshold, thereby reducing the noise floor. Conversely, if we set the threshold above the level of our voice, it’ll reduce all of the audio by an amount we set. NOT good.
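Here’s a minimal Python sketch of that idea. It’s deliberately simplified: it judges each sample on its own instantaneous level, whereas a real expander follows a smoothed envelope with attack and release times. The function name is hypothetical and the 2:1 ratio is an arbitrary illustration, not a claim about the plugin’s defaults.

```python
import math

def expand(samples, threshold_dbfs, ratio=2.0):
    """Static downward expander: push anything below the threshold further down."""
    out = []
    for s in samples:
        if s == 0:
            out.append(0.0)
            continue
        level_db = 20 * math.log10(abs(s))
        if level_db < threshold_dbfs:
            # Each dB below the threshold becomes `ratio` dB below it
            gain_db = (level_db - threshold_dbfs) * (ratio - 1)
        else:
            gain_db = 0.0  # at or above the threshold: leave untouched
        out.append(s * 10 ** (gain_db / 20))
    return out

# Made-up example: noise at -50 dBFS and voice at about -6 dBFS
noise = 10 ** (-50 / 20)
processed = expand([noise, 0.5], threshold_dbfs=-40.0, ratio=2.0)
```

With the threshold at -40 dBFS, the noise sitting 10 dB under it ends up 20 dB under it (at -60 dBFS), while the voice passes through unchanged. Set the threshold above the voice instead and the voice gets turned down too.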

To set the threshold we need an area of noise floor we can highlight so the plugin can loop it and show us where to set that bar. First, we need to instantiate the expander. Again, with the audio region selected, head up to AUDIOSUITE, then DYNAMICS and finally DYN3 Expander/Gate.


The threshold is the orange dial and even more helpful, the orange right-facing arrow next to the input level.

Expander Plugin
Threshold level dial & arrow

For this process we need to select some noise floor. This is why I left a little audio in the beginning of the region.

Noise Floor I Kept

Let’s select a bit of that and click the PREVIEW (speaker icon) button in the lower left of the EXPANDER plugin.

Highlight region of noise floor, click preview or speaker icon button

At this point our audio should be looping playback, indicated by sound coming from our speakers and a little dot bouncing in the main window of the EXPANDER. The key to the next step is to set the threshold just above the noise floor. To do this, we grab and lower the right-facing arrow with our mouse and bring it down until it’s hovering just above the input level, indicated by a green line.

Bring the right-facing arrow just above the noise floor, indicated by the green input.

If your green level indicator isn’t solid, or near solid, then a better way to set your threshold is to watch the little bouncing square in the main graph window. Raise the threshold a large amount, watch the bouncing square do its business, then slowly bring the threshold down until the square is disappearing more often than it’s showing.

Once the level is set, DO NOT render (process) the audio yet. Doing so would process only the selected area of noise floor. Uncheck PREVIEW (speaker icon) and select the entire audio region. Now, hit render (process).

Below is a picture of the audio before the expander plugin processing, note that I’ve zoomed in vertically to show you the noise floor.

Before Expansion

After Expansion.

After Expansion

And now, with the audio processed, let’s delete that noise floor at the beginning of the region.

We’ve now lowered the noise floor by using an expansion plugin. The default settings were fine for our purposes. Raising the ratio gives us a more drastic difference from the threshold and while it’ll work in some scenarios, often a stark difference between audio and noise floor can be jarring.

STEP 8 [Limit]

Our audio is sounding good. But there’s one last process we can use to help us achieve that great take: reducing the dynamics and raising the quieter parts, bringing the audio closer together in volume. We do this with a LIMITER (or Compressor). Pro Tools comes with a fine plugin called MAXIM; we’ll be using this to handle our limiting.

Head up to AUDIOSUITE, then DYNAMICS and finally MAXIM.

Maxim Plugin

Here again, we’ll see a threshold level, but also a ceiling level too. The threshold here is like the threshold on the EXPANDER, except that here the plugin reduces the volume of any audio that goes above it. The ceiling level is where no audio shall pass, a brick wall if you will.

I usually set the threshold at or around -3 dBFS and the ceiling at -0.5 dBFS. We could compress more by pulling down the threshold level, but I’m not interested in squashing our audio, nor is this plugin effective enough to handle that kind of processing transparently. We ARE interested, however, in slightly reducing the dynamic range (volume range) of our take. Using this type of plugin helps us keep peaks in check while also increasing the ‘punch’ of the audio.

Limiter Plugin
-3 dbfs Threshold –and– -.5 dbfs Ceiling
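To see how a threshold and a ceiling work together, here’s a heavily simplified Python sketch. It assumes the limiter holds peaks at the threshold and then raises everything so that the threshold lands on the ceiling, which is how I understand this style of limiter to behave. A real limiter turns the gain down smoothly rather than chopping peaks off as this does, and the function names are hypothetical.

```python
def db_to_linear(db):
    """Convert a dB value to a linear amplitude multiplier."""
    return 10 ** (db / 20)

def limit(samples, threshold_dbfs=-3.0, ceiling_dbfs=-0.5):
    """Crude limiter: hold peaks at the threshold, then lift the result to the ceiling."""
    threshold = db_to_linear(threshold_dbfs)
    makeup = db_to_linear(ceiling_dbfs - threshold_dbfs)  # here: +2.5 dB
    out = []
    for s in samples:
        limited = max(-threshold, min(threshold, s))  # nothing gets past the threshold
        out.append(limited * makeup)                  # the threshold now sits at the ceiling
    return out

# Made-up samples: two full-scale peaks and one quiet sample
limited_take = limit([1.0, -1.0, 0.1])
```

The peaks come out at -0.5 dBFS and no higher, while the quiet sample comes out 2.5 dB louder than it went in. That’s the reduction in dynamic range we’re after.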

And below is a side-by-side of our audio: no limiter on the left, limited on the right. Note, you won’t be creating two regions; this is for demonstration only. You should render (process) only the one region you have.

Maxim before and after
Unprocessed on left, processed on right

STEP 9 [Rename & Export]

After we’ve processed our audio, we need to rename and export it. To rename, select the GRABBER tool and double click the audio region.


I usually name the region by spot name, appending my name to the file too. It’s always a good idea to put your name on the audition; you never know how many audio files your agent will get. Following a naming spec of your own is good, but keep the client’s request in mind, since they’ll often have you follow a naming convention they choose. Keep an eye on the spot documentation for rules.
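If you name a lot of files, a tiny helper keeps things consistent. The Python sketch below builds a file name from the spot name and your name; the underscore pattern and the function name are purely hypothetical, so swap in whatever convention the client specifies.

```python
import re

def audition_filename(spot_name, talent_name, extension="mp3"):
    """Build 'Spot_Name_Talent_Name.ext' from free-form text (hypothetical convention)."""
    def clean(text):
        # Collapse anything that isn't a letter or digit into a single underscore
        return re.sub(r"[^A-Za-z0-9]+", "_", text).strip("_")
    return f"{clean(spot_name)}_{clean(talent_name)}.{extension}"

# Made-up spot and talent names
name = audition_filename("Acme Coffee :30", "Jane Doe")
```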

Once our audio is named, it’s time to export the region so we can email it. We do this using the keyboard shortcut SHIFT + COMMAND + K. Unfortunately this isn’t an option in the menu bar so you’ll need to memorize it.

Export Dialog

There are several parts of the export dialog that are important. First off, the file type to use: WAV for uncompressed audio, MP3 for compressed. We use MP3 for auditions. Second, MULTIPLE MONO or STEREO. This being a mono track, we’ll choose multiple mono. The next important option is where we want to save the file. This is the Choose button.

Once these options have been selected, hit export. If you’ve decided to use WAV, you’re done. If you chose MP3, the next dialog box is where you’ll choose the BIT RATE.

MP3 Export Dialog

In this dialog you’ll be able to choose the BIT RATE and whether to encode better or faster. For normal auditions, a BIT RATE of 128kbps is fine. You might want to choose a higher rate if the client asks for it, but 128 should work well for our purposes. Conversely, it’s not a good idea to compress more heavily (that is, to go lower than 128kbps). I’m not sure how much difference the faster vs. better option makes, since both are very fast, so I usually choose Slowest as my option.
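Bit rate translates directly into file size, which matters when you’re emailing auditions. The Python sketch below estimates sizes, assuming a constant bit rate MP3 and a 16-bit, 44.1 kHz mono WAV, and ignoring file headers and tags; the function names are hypothetical.

```python
def mp3_size_kb(duration_seconds, bitrate_kbps=128):
    """Approximate size of a constant-bit-rate MP3 in kilobytes (1 kB = 1000 bytes)."""
    bits = duration_seconds * bitrate_kbps * 1000
    return bits / 8 / 1000

def wav_size_kb(duration_seconds, sample_rate=44100, bit_depth=16, channels=1):
    """Approximate size of uncompressed WAV audio data in kilobytes."""
    bits = duration_seconds * sample_rate * bit_depth * channels
    return bits / 8 / 1000

# A 30 second audition
mp3_kb = mp3_size_kb(30)   # 480.0
wav_kb = wav_size_kb(30)   # 2646.0
```

So a 30 second read at 128kbps is under half a megabyte, a fraction of the size of the same read as a WAV.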

And, voilà. We’re done!

This process might seem long and drawn out but trust me when I say you’ll get very quick at this indeed.

You’ve now recorded and mastered an audition. Congratulations. Now go have a beer and wait for the callback.


So, how about making this process faster? Well, there are a few ways to streamline it. Each plugin can recall presets, and if auditioning is what you do most, setting these up can help speed things up. Also, pulling your plugins up to the top of the Audiosuite menu will help too. Let’s do that first.

The plugins we’re concerned with are:

  • EQ3 1-Band
  • Maxim
  • DYN3 Expander/Gate
  • Normalize

Hold down the COMMAND key and go instantiate those one by one. By doing this we’ll fast track the plugin listing to the top.

Before this process

Before Promoting Your Plugins In The List

After this process

After Promoting Your Plugins In The List

Great, now it’s easier to get at the plugins we use most often. Now, let’s set some processing presets. We’ll be doing this for EQ3 1-Band, DYN3 Expander/Gate and Maxim.

First EQ3 1-Band

Open the plugin and set your frequency cutoff and filter type for what you found best during this tutorial. I’ll set mine to 90 Hz and I’ll be using the HIGH-PASS filter type. Then, we’ll be saving this preset. After that we’ll be naming the preset, setting it as user default and finally setting the plugin to user default. Here’s a series of screenshots in order to aid in doing this.

Set Up EQ

Save Settings As…
Choose this from the arrow menu at the top left of the plugin.

Name The Preset

Save preset as User Default

Set Plugin to use User Default

Now, every time we open the plugin, it’ll be set to the default we chose. Alternatively, if we omitted the last step, we could use the drop down menu in the plugin to choose between several of our presets (had we set those up).

Let’s repeat this process for the other plugins as well. Open each, set your preferences and then follow the Save Settings As… dialog.

This works great and helps us streamline the process. However, the Expander preset we’ve created is based on the noise floor at the current time and place. Noise floors change based on traffic levels, family noise and, of course, location. So, saving a preset for STUDIO AT HOME is great, but keep in mind that if you move locations, your noise floor will change and you’ll need to get a manual reading on your noise floor to use the plugin effectively.

Another step we can bypass to save time is the Consolidate step. If we jump straight to the EQ step, a byproduct of processing with that plugin is a natural consolidation of the regions. So, in this step we select the separate regions, use our preset and hit render (process); the plugin pulls out the frequencies below the cutoff and consolidates the audio. Yay!

There are many ways to handle this process: real time, part real time and offline (which is what we’re doing here). Each offers its own benefits. One thing that’s common to all of them, however, is the automatic creation of new WAV files at each step. As you work, the session will continue to grow. We can purge our session of unwanted audio files by using a great keyboard shortcut.

Use SHIFT + COMMAND + U to select the regions (clips) in the bin to the right that DON’T appear on the timeline. Once selected, we can then use another shortcut to remove or delete those files.

SHIFT + COMMAND + B brings up a dialog box to remove or delete files. REMOVE pulls them from the session, leaving them in the audio files folder on the hard drive; DELETE actually deletes them from disk. Warning: if you delete files not in the main edit window, you lose the ability to go back and reuse parts of the audio file that don’t show there, so before using this process be sure that what you have in the main edit window is all you need.

Remove / Delete Menu

Delete Warning