Semi OT: De-Essing - How do you do it?

amendegw wrote on 6/21/2011, 5:36 AM
Over in this thread (Virtual Choir), Laurence brought up the subject of de-essing. Rather than hijacking that thread, I thought I'd start a new one. This probably should be in the Vegas Audio or Sound Forge forum, but the Vegas Pro forum gets so much more traffic. So, here's my question:

I'm very much a noob in this area, but de-essing is always something that has frustrated me. I've found a "Spitfish De-esser", but I can't see that it makes any improvement at all (SF9 or Vegas 10).

So, my question is what's a good procedure for De-essing? Here's a sample clip that was giving me problems (no audio processing applied, just trimmed): Zoom0005-trimmed.mp3 (the problem is with the S-suffix)

TIA,
...Jerry

System Model:     Alienware M18 R1
System:           Windows 11 Pro
Processor:        13th Gen Intel(R) Core(TM) i9-13980HX, 2200 Mhz, 24 Core(s), 32 Logical Processor(s)

Installed Memory: 64.0 GB
Display Adapter:  NVIDIA GeForce RTX 4090 Laptop GPU (16GB), Nvidia Studio Driver 566.14 Nov 2024
Overclock Off

Display:          1920x1200 240 hertz
Storage (8TB Total):
    OS Drive:       NVMe KIOXIA 4096GB
        Data Drive:     NVMe Samsung SSD 990 PRO 4TB
        Data Drive:     Glyph Blackbox Pro 14TB

Vegas Pro 22 Build 239

Cameras:
Canon R5 Mark II
Canon R3
Sony A9

Comments

farss wrote on 6/21/2011, 6:15 AM
Did you read the instructions that came with the Spitfish?
Can you get the level lights to light up?
It could be that your levels are too low for the detector to "hear" anything.


I've downloaded it but not installed it. Reading the manual, you should be able to adjust this FX to the point where it is doing serious damage, so if you're not hearing it do anything, something basic is wrong.

Bob.
Laurence wrote on 6/21/2011, 7:11 AM
Spitfish is not bad for a free de-esser. You should set it so that it only lights up when you hear a sibilant "s" sound. Then you should be able to dial back the "s" sound as you like.

I use it if I have a voice that is hard to understand, but that when I bring up the high frequencies has too much sibilance. That or people who just have too much sibilance.

In the case of harmonies, usually the sibilance of the main singer is plenty strong enough to be the sibilance for all the singers. As a musician, I am as aware of when the notes stop as I am of when they begin, but unfortunately not everyone is. Harmony singers often end words at different times than the main singer. This is just a little irritating on consonants other than "s", but with "s" sounds it absolutely drives me mad. By running the de-esser on the background singers, you can get rid of the irksome random sibilance at the end of words and it sounds much better.

In the Ted Talk virtual choir video, since the singers were not all in the same room looking at the same conductor, their word endings are all over the place. On the sibilance especially it just sounds horrid. Pretty piece other than that even if the video looked a little dated and tacky.
amendegw wrote on 6/21/2011, 7:38 AM
Well, I'm making a little progress, but still stumped.

First, farss hit the nail on the head when he surmised my levels were too low. That appears to be why I had a hard time getting Spitfish to do anything at all.
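To illustrate why level matters here: a de-esser's detector compares the signal level against a threshold, so a quiet recording may never trip it. Below is a minimal Python sketch, assuming floating-point samples in the -1.0 to 1.0 range. The function names and the sample values are hypothetical and are not taken from Spitfish or Sound Forge.

```python
import math

def peak_dbfs(samples):
    """Return the peak level of the samples in dBFS (0 dBFS = full scale)."""
    peak = max(abs(s) for s in samples)
    if peak == 0.0:
        return float("-inf")  # digital silence has no finite level
    return 20.0 * math.log10(peak)

def gain_to_reach(samples, target_dbfs):
    """Return the gain in dB needed to bring the peak up to target_dbfs."""
    return target_dbfs - peak_dbfs(samples)

# Hypothetical quiet take: the loudest sample is only 1% of full scale.
quiet_take = [0.0, 0.004, -0.01, 0.007]
print("peak level (dBFS):", peak_dbfs(quiet_take))
print("gain needed to peak at -6 dBFS (dB):", gain_to_reach(quiet_take, -6.0))
```

If the peak sits far below whatever level the detector expects, raising the gain (or normalizing) before the de-esser is the first thing to try.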

However, once I increase my levels, I can get the meter to light up, and all I hear when I press the "listen" button is the ess sound. Is that what I'm supposed to hear? 'cuz when I apply the FX everything is removed except the ess sound i.e. bass ackward. If I unclick the "listen" button and apply the FX - the FX appears to do nothing.

I've read & re-read the manual, but I think I'm missing something very basic.

...Jerry

PS: Doing all this in SF9.


rraud wrote on 6/21/2011, 8:20 AM
Spitfish usually works great, especially for a free plug-in.
The 'Listen' button is mainly just for reference, so you can hear exactly what is being attenuated. If Spitfish does not work to your liking, the HF band of a multi-band compressor could also be used, if you have one available (in Sound Forge, for instance).
Opampman wrote on 6/21/2011, 8:51 AM
Rick is correct...there is a pretty good de-esser in Vegas and Sound Forge. Not sure which one it came with because it shows up in both. But it works and is under the Multi-Band Dynamics fx/filter.

Kent
johnmeyer wrote on 6/21/2011, 9:23 AM
In the other thread Laurence said: "Any engineer worth his salt these days knows enough to bake [make] the sibilant consonants line up at the ends of words. Time stretch works well enough that it is just a matter of taking the time to do it ..."

I was hoping to hear more about this when I saw this thread today. I've occasionally tried to wrangle nasty S's with the tools in SF, but never could make much headway with them.

Fortunately, I can do amazing de-essing in Izotope, using Spectral Repair, but it is very labor intensive.

The idea that somehow you could use the time stretch feature to do something useful is intriguing, but I have no clue what that means. I did some searching, and most de-esser technology seems to work by performing compression, but only at a narrow select range of frequencies.
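The "compression at a narrow range of frequencies" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Spitfish, Sound Forge, or Izotope actually implement it: the function name de_ess, the one-pole crossover filter, and every default value (5 kHz crossover, 0.1 threshold, 4:1 ratio, 20 ms release) are hypothetical choices. The signal is split into a low band and a high band, and only the high band is turned down when its level exceeds the threshold.

```python
import math

def de_ess(samples, sample_rate, crossover_hz=5000.0,
           threshold=0.1, ratio=4.0, release_ms=20.0):
    """Compress only the high band of a mono signal (samples in -1.0..1.0)."""
    # One-pole low-pass coefficient; the high band is whatever the low-pass removes.
    a = math.exp(-2.0 * math.pi * crossover_hz / sample_rate)
    # Smoothing coefficient for the envelope follower's release.
    rel = math.exp(-1.0 / (sample_rate * release_ms / 1000.0))
    low = 0.0
    env = 0.0
    out = []
    for x in samples:
        low = (1.0 - a) * x + a * low   # low band
        high = x - low                  # high band (where the "s" energy lives)
        mag = abs(high)
        # Envelope follower: instant attack, smoothed release.
        env = mag if mag > env else rel * env + (1.0 - rel) * mag
        if env > threshold:
            # Static compressor curve: level above threshold is divided by ratio.
            gain = (threshold / env) ** (1.0 - 1.0 / ratio)
        else:
            gain = 1.0
        out.append(low + gain * high)   # recombine the bands
    return out

# Hypothetical test tones: a 200 Hz "voice" tone and an 8 kHz "ess" tone.
sr = 44100
voice = [0.5 * math.sin(2 * math.pi * 200 * n / sr) for n in range(sr // 10)]
ess = [0.5 * math.sin(2 * math.pi * 8000 * n / sr) for n in range(sr // 10)]
print("voice peak after:", max(abs(s) for s in de_ess(voice, sr)))
print("ess peak after:  ", max(abs(s) for s in de_ess(ess, sr)))
```

In this sketch the low tone passes through essentially untouched while the high tone is reduced, which is the behavior a de-esser is after. A real plug-in would use steeper filters and a smoother gain response.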

So, how do you use "Time Stretch" to knock down those nasty S's?



amendegw wrote on 6/21/2011, 9:26 AM
Okay, I tried the Multi-Band Dynamics FX & used the "Reduce loud sibilants (de-esser)" as my starting point. Didn't seem to help at all, but I can't overstate my ignorance of what each of the knobs do (but I love to learn!) For instance, I've read thru the documentation on this FX and it uses statements like you can raise or lower the threshold, or you can capture the threshold. Threshold of what? And how would I go about capturing that threshold? I truly am a noob here.
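On the "threshold of what?" question: in a compressor, the threshold is generally a signal level (in dB) within the band being processed. Below it nothing happens; above it the excess is scaled down by the ratio. Here is a minimal Python sketch of that static curve. It is the textbook formula, assumed for illustration, and not necessarily what the Multi-Band Dynamics FX does internally; the function name and the example numbers are hypothetical.

```python
def gain_reduction_db(level_db, threshold_db, ratio):
    """Return the gain change in dB (zero or negative) for a given input level."""
    if level_db <= threshold_db:
        return 0.0                      # below threshold: signal is left alone
    over = level_db - threshold_db      # how far the signal exceeds the threshold
    return -(over - over / ratio)       # only 1/ratio of the excess is kept

# Hypothetical settings: threshold at -20 dB, 4:1 ratio.
for level in (-30.0, -20.0, -10.0, 0.0):
    print(level, "dB in ->", gain_reduction_db(level, -20.0, 4.0), "dB of gain change")
```

So if the esses in a clip never rise above the threshold in the chosen band, the FX will appear to do nothing, and lowering the threshold (or raising the input level) is what makes it start working.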

Is it too much to ask for someone to take a look at the following mp3 and tell me what settings should be used to de-ess? Zoom0005-trimmed.mp3 I've been focusing on the shrill ess in the word "unknowns" at the 5.4 sec to 6.2 sec region. Better yet would be the workflow of how the audio is fixed.

...Jerry


Guy S. wrote on 6/21/2011, 11:02 AM
See if version a or b works for you. If you like either version I will share the settings I used. FYI,
I was very aggressive with de-essing, but you could easily back off on the gain reduction for less of an effect.

Version a is de-ess only. Version b has additional processing that I use for my voice-overs.

https://docs.google.com/leaf?id=0B_t3AWQGcoT6MjBjZGRmZjEtZjNhZC00Nzc2LWE5Y2MtYjJhODdhOTJhN2Iy&hl=en_US

Guy
amendegw wrote on 6/21/2011, 11:22 AM
Guy,

First, thank you so much for taking the time to do this.

Second, after listening to the clips multiple times, to my ears a & b are mighty close, but I guess I'd choose "a". Both show improvement over the original, but (to my ears) I'm still hearing the essing in the suffixes. Maybe I'm being too picky or maybe the only solution is better micing.

What did you do for your processing?

Thanks again,
...Jerry


Steven Myers wrote on 6/21/2011, 12:06 PM
"maybe the only solution is better micing."

That and mic technique.

Laurence wrote on 6/21/2011, 12:21 PM
John, I don't use the time stretch to knock out the S's but rather to get all the consonants happening at the same time. I stretch or shrink the harmonies so that the consonants happen together instead of being staggered all over the place like they are in that Ted Talk piece.
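The arithmetic behind lining up a word ending this way is simple: stretch or shrink the harmony's word so it ends where the lead's word ends. A small Python sketch, assuming both singers start the word at the same time and that times are in seconds; the function name and the example times are hypothetical.

```python
def stretch_ratio(word_start, harmony_end, lead_end):
    """Return the time-stretch factor that makes the harmony word end with the lead.

    A value below 1.0 means shrink the harmony; above 1.0 means stretch it.
    """
    harmony_len = harmony_end - word_start
    lead_len = lead_end - word_start
    if harmony_len <= 0 or lead_len <= 0:
        raise ValueError("word end must come after word start")
    return lead_len / harmony_len

# Hypothetical example: the word starts at 10.0 s, the harmony's "s" lands at
# 12.5 s, but the lead singer's "s" lands at 12.0 s.
print(stretch_ratio(10.0, 12.5, 12.0))
```

The resulting factor is what you would dial into the time-stretch tool for that one word or phrase.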
Guy S. wrote on 6/21/2011, 2:25 PM
The Multi-Band Dynamics processor does not remove sibilants, it merely decreases their loudness. To completely get rid of them you'd need to manually edit them out.

This is the processing I used for version A: https://docs.google.com/#folders/folder.0.0B_t3AWQGcoT6MjBjZGRmZjEtZjNhZC00Nzc2LWE5Y2MtYjJhODdhOTJhN2Iy

Guy
amendegw wrote on 6/21/2011, 2:37 PM
"This is the processing I used for version A:"

Either that URL is in error or I don't know how to get documents from Google Docs. Can I get some help?


...Jerry


Guy S. wrote on 6/21/2011, 2:52 PM
Oooops, my bad... I needed to specifically share the images. Try it again: https://docs.google.com/leaf?id=0B_t3AWQGcoT6MjBjZGRmZjEtZjNhZC00Nzc2LWE5Y2MtYjJhODdhOTJhN2Iy&hl=en_US

Guy
farss wrote on 6/21/2011, 3:11 PM
Jerry said:

"all I hear when I press the "listen" button is the ess sound. Is that what I'm supposed to hear? 'cuz when I apply the FX everything is removed except the ess sound i.e. bass ackward. If I unclick the "listen" button and apply the FX - the FX appears to do nothing. "

The Listen button lets you hear what is being removed. You tweak with it On initially to ensure that you are only removing what you want removed. Certainly when you turn the Listen button Off you may not notice any dramatic improvement. Turn it on again and try adjusting the settings to remove more of the Esses without you hearing anything that should not be removed.
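One way to picture a "hear what is being removed" monitor is as the difference between the unprocessed and processed signals. The Python sketch below assumes Listen behaves like that kind of difference monitor; the actual plug-in may instead audition its filtered detector signal, so treat this as an illustration only. The function name and sample values are hypothetical.

```python
def removed_signal(dry, processed):
    """Return, sample by sample, what the processing took away (dry minus processed)."""
    return [d - p for d, p in zip(dry, processed)]

# Hypothetical three-sample example: only the middle sample was turned down.
dry = [0.5, -0.25, 0.125]
processed = [0.5, -0.125, 0.125]
removed = removed_signal(dry, processed)
print("removed:", removed)

# Adding the removed part back to the processed signal rebuilds the original.
rebuilt = [p + r for p, r in zip(processed, removed)]
print("rebuilt equals dry:", rebuilt == dry)
```

Seen this way, hearing only esses in Listen mode is the desired result: it means the esses, and nothing else, are what gets turned down in the normal output.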


As others have hinted at above, this should be a good example for you to learn an important lesson about audio. Record it badly and you will have a very difficult time trying to fix it. It is generally impossible to fix bad audio so that it sounds as good as well-recorded audio. This applies regardless of how expensive your gear is. A $10K mic in the wrong place will sound about as bad as a $10 mic in the wrong place; actually, the $10K mic could sound worse!

Your mic is picking up these sounds because it is probably too close to the mouth and is on axis to the mouth. Move it back and down a bit, still aimed at the mouth, maybe 30 to 45 deg off axis, you need to experiment. The high energy high frequency "S" sounds are very directional, get the mic out of the line of fire. Learning correct micing techniques is far more rewarding than trying to fix it in post.

The mics Zoom use are "recording" or "studio" mics, so they are quite sensitive to what they're picking up. Wind and "plosives" will be recorded and can swamp "wanted" sounds. This is another potential problem you will need to address before these bad sounds reach those mics. Outdoors, a dead cat or foam wind stopper is vital. Indoors, a "popper stopper" might be a good, cheap investment; you can even make your own "popper stopper" for around $2 from a wire coat hanger and a nylon stocking.

Bob.
johnmeyer wrote on 6/21/2011, 3:30 PM
"Is it too much to ask for someone to take a look at the following mp3 and tell me what settings should be used to de-ess? Zoom0005-trimmed.mp3 I've been focusing on the shrill ess in the word "unknowns" at the 5.4 sec to 6.2 sec region. Better yet would be the workflow of how the audio is fixed."

Here is a link to a "fixed" version. I used Izotope RX2, using the "Attenuate" mode of the Spectral Repair Tool. The key "trick" is to set Direction to "Vertical" and to set the Strength to a very low number.

Zoom0005 De-essed

[edit]I just listened to my fixed version again, and I think if I were to do it again, I wouldn't be quite so aggressive with the setting. I think the result is just a little too soft. One of the amazing things about Izotope RX2 is that you can make almost any sound within an audio file completely disappear. That power is so intriguing, that it is often tempting to just blast the sucker out of existence, just to show that you can.

I'm still waiting and hoping to hear something about how to use Time Stretch to de-ess audio.

amendegw wrote on 6/21/2011, 3:44 PM
farss said: "Your mic is picking up these sounds because it is probably too close to the mouth and is on axis to the mouth"

I'm sure that "hits the nail on the head". I need to experiment and practice.

johnmeyer said: "Here is a link to a "fixed" version. I used Izotope RX2... I think the result is just a little too soft"

Still, I think these results are quite amazing!! I've been intrigued with Izotope RX2 because of the positive reports of its Noise Reduction. This is another reason for me to put this on my "to be purchased" list. Anyone know if a sale is coming up?

Thanks to all for the contributions!!
...Jerry
