Audio: Theory Vs. Practice

wolfbass wrote on 9/21/2004, 12:47 AM
The situation.

I have voice recorded over the top of music playing in the back ground. I also have a copy of the music.

Can I load the two tracks up, reverse the phase in the music only channel, and therefore hear ONLY the voice?

I know in theory it should work. Has anybody got any comment or real life experiences regarding this?

TIA.

Andy

Comments

Grazie wrote on 9/21/2004, 12:59 AM
. .hmm . .interesting . . g
TorS wrote on 9/21/2004, 12:59 AM
In theory, this theory will work better in some cases than in others. The background music will have a lot of acoustics that the canned music does not have. How do you plan to get rid of that?
The voice recording must be very good.
It will put very high demands on your sync. Also - when you try it, make sure everything is mono.
You may come to a point where you have a result that you're not entirely happy with. Then you may try to camouflage the rest of the background noise with something else.
Tor
farss wrote on 9/21/2004, 3:02 AM
Technically it might be doable, but there are a few hurdles:

1) The Eq on the recorded music is going to be different to that coming off the CD. The tone controls and the speakers will have a serious impact.
2) The room acoustics provide not only reverb but possibly some degree of cancellation at certain frequencies.
3) If objects (people etc) move between the camera and the sound source they'll also influence what the camera recorded, and if the camera was moved, oh boy.

So here's what I'd try before going to the pub.

Starting with the canned music, apply EQ to get it to sound like the live music. The spectrum analyser in SF might come in very handy.
Then I'd apply, say, a multi-tap delay to try to reproduce the room acoustics. Once you've got the canned version sounding like the recorded one, I'd try the invert and subtract. Start by visually aligning the waveforms with QTF off, then slide one track ever so little. As you get close you'll hear flanging, so you know you're close.
When you've got it as good as you can, take a portion with nothing in it that you want, render the combined signal out, and analyse that in SF. What's left should show you what is wrong with what you're subtracting; adjust and repeat.
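
The visual align-and-nudge step above could also be approximated numerically. This is a minimal sketch, not the procedure described in the post: it brute-forces a cross-correlation to estimate the sample offset between two tracks. The function name `best_lag`, the seeded noise standing in for music, and the 7-sample delay are all assumptions made for illustration.

```python
import random

def best_lag(reference, other, max_lag):
    """Return the lag (in samples) at which `other` lines up best with `reference`.

    A positive result means `other` is delayed relative to `reference`.
    """
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Correlation score: sum of products over the overlapping samples.
        score = 0.0
        for i, ref_sample in enumerate(reference):
            j = i + lag
            if 0 <= j < len(other):
                score += ref_sample * other[j]
        if score > best_score:
            best, best_score = lag, score
    return best

# Hypothetical test signal: seeded noise standing in for the clean music.
rng = random.Random(0)
music = [rng.uniform(-1.0, 1.0) for _ in range(2000)]

# Pretend the recorded copy starts 7 samples late.
delay = 7
recorded = [0.0] * delay + music[:-delay]

lag = best_lag(music, recorded, max_lag=20)

# Shift the recorded copy back by the estimated lag (non-negative in this
# example), then subtract the clean music, i.e. invert and add.
aligned = recorded[lag:]
residual = [a - m for a, m in zip(aligned, music)]
peak_residual = max(abs(r) for r in residual)
print(lag, peak_residual)
```

With a real room recording the correlation peak would be smeared by EQ and reverb, and this sketch does not handle offsets of a fraction of a sample, so treat it as a starting point for the manual nudge rather than a replacement for it.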

Of course Hollywood would fix this with ADR which might be simpler if you can get the guy to say the words again!

Bob.
riredale wrote on 9/21/2004, 7:51 AM
Based on my own experiments, I would have to conclude that it can't be done, simply because the two audio tracks are far from identical. BUT if you could exactly reproduce the original environment and re-record the audio (minus the speaker, of course) then you might have a chance. But, again, you would have to make things exactly as they were, including camera position.

Even then you will need to be able to sync up the two tracks to individual sample precision, or very close to it, or you will get some very unpleasant flanging effects (I think that's what the effects are called, anyway).
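
To put rough numbers on the sync requirement, here is a small sketch that subtracts a pure tone from a copy of itself delayed by a few samples and reports how much is left over. The 44.1 kHz rate, the test frequencies and the name `residual_ratio` are assumptions for illustration. For a pure tone the leftover ratio works out to 2·|sin(π·f·d/fs)|, so high frequencies cancel far worse than low ones for the same offset, which is the comb-filter sound usually described as flanging.

```python
import math

SAMPLE_RATE = 44100  # Hz; CD rate, assumed for this example

def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def residual_ratio(freq_hz, offset_samples, n=4410):
    """RMS of (tone minus its delayed copy) divided by RMS of the tone."""
    def tone(i):
        return math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
    original = [tone(i) for i in range(n)]
    leftover = [tone(i) - tone(i - offset_samples) for i in range(n)]
    return rms(leftover) / rms(original)

# 0 means perfect cancellation; values near 1 or above mean the "cancelled"
# tone is about as loud as it was before.
for freq in (100, 1000, 5000):
    for offset in (0, 1, 4):
        print(freq, offset, round(residual_ratio(freq, offset), 3))
```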

I think maybe lip-syncing in post makes a lot more sense.
John_Cline wrote on 9/21/2004, 8:04 AM
No, this is simply impossible. Don't waste any time trying to make it work.

John
Former user wrote on 9/21/2004, 8:06 AM
I have often asked the same question about 60-cycle hum, and have been told by people who would know that it won't work.

I say, if you have the time, try it. Play with eq and see how close it works.

Dave T2
busterkeaton wrote on 9/21/2004, 8:41 AM
I think there's another issue too. Wouldn't some of the frequencies of the song be in the voice too, and therefore you would be removing some of the voice as well?
Spot|DSE wrote on 9/21/2004, 8:54 AM
Amen.
Can't be done.
John_Cline wrote on 9/21/2004, 9:49 AM
60-cycle "hum" is a different animal from trying to remove (or cancel) broadband audio, like music. It is at a constant frequency and usually at a constant level. Actually, 60 Hz hum is at 60 Hz plus a few harmonic frequencies above that (120 Hz, 180 Hz, 240 Hz, 300 Hz...). Hum is pretty easy to notch out using very narrow EQ. The Waves X-Hum plugin is stellar at this. Adobe Audition has a decent hum removal plugin as well. 60 Hz "buzz" is more difficult because it has way more 60 Hz-related harmonics. That becomes a job for an FFT-based noise reduction plugin like Sony NR2.

John
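
As a rough illustration of notching out hum with very narrow EQ, the sketch below cascades generic biquad notch filters at 60 Hz and its first harmonics. It is not how the Waves or Adobe plugins work internally; the coefficient formulas are the widely used Audio EQ Cookbook notch, and the sample rate, Q value, harmonic count and function names are assumptions chosen to keep the example small.

```python
import math

SAMPLE_RATE = 8000  # Hz; kept low so the example runs quickly

def notch_coefficients(freq_hz, q):
    """Biquad notch coefficients, normalised so a0 = 1.

    Returns ((b0, b1, b2), (a1, a2)).
    """
    w0 = 2 * math.pi * freq_hz / SAMPLE_RATE
    alpha = math.sin(w0) / (2 * q)
    a0 = 1 + alpha
    b = (1 / a0, -2 * math.cos(w0) / a0, 1 / a0)
    a = (-2 * math.cos(w0) / a0, (1 - alpha) / a0)
    return b, a

def biquad(samples, b, a):
    """y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2] - a1*y[n-1] - a2*y[n-2]"""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

def remove_hum(samples, fundamental=60.0, harmonics=3, q=10.0):
    """Run one notch per harmonic: 60, 120, 180 Hz with the defaults."""
    for k in range(1, harmonics + 1):
        b, a = notch_coefficients(k * fundamental, q)
        samples = biquad(samples, b, a)
    return samples

def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))

n = SAMPLE_RATE  # one second of audio
hum = [sum(0.3 / k * math.sin(2 * math.pi * 60 * k * i / SAMPLE_RATE)
           for k in (1, 2, 3)) for i in range(n)]
tone = [0.5 * math.sin(2 * math.pi * 1000 * i / SAMPLE_RATE) for i in range(n)]

# Measure on the second half, after the filters' start-up transient has decayed.
hum_left = rms(remove_hum(hum)[n // 2:]) / rms(hum[n // 2:])
tone_kept = rms(remove_hum(tone)[n // 2:]) / rms(tone[n // 2:])
print(hum_left, tone_kept)
```

The hum here sits at fixed, known frequencies, which is what makes narrow notches workable. Buzz with many more harmonics would need many more notches, which is where the FFT-based tools mentioned above come in.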
Rednroll wrote on 9/21/2004, 10:58 AM
I disagree; I think it can be done with satisfactory results. The thing you need to make sure of is that the music without the voice identically matches the music that is mixed with the voice. For clarity's sake, I will call the music with the voice "Track 1" and the music without the voice "Track 2".

I've done this in the past where an artist made a full mix and an instrumental mix, but not an acapella, and wanted to go back and create one. In this instance both tracks were laid off to a DAT player, and the only difference between the two mixes was that the instrumental had the vocal cut on the mixing board. So if you have a similar situation, you should be able to get satisfactory results.

You do need to make sure of a few things, which will be the difficult part in getting this to work, but in my above example most of these differences I knew didn't occur because I was involved in the mixing process.

1. The amplitude of the two music parts must be identical. If you can find a section in Track 1 where the voice is silent in the mixed signal, zoom in and line Track 2 up with that part. Find the amplitude difference between the Track 1 music section and Track 2, and apply that gain difference to Track 2. At this point the amplitudes of both music parts should be identical.

2. The EQ processing must be identical. If EQ was applied to Track 1 during mixdown, which it probably was to make the voice fit better, then this task becomes much more difficult. You will need to apply the same EQ to Track 2. This will take a good ear and the use of spectral analysis comparisons. You might get lucky and find that no EQ differences were applied. Try reversing the phase of Track 2 and mixing it with Track 1, and listen to your results after following the recommendations in step 1. The leftover music will probably be the EQ differences... which could actually help you figure out what kind of EQ was applied in the mix.

3. The reverb must be identical in the music parts. Listen again to sections of Track 1 where no voice is present to see if any reverb was added to the music. If reverb was added... I would quit at this point and call this an impossible task; it is highly doubtful you will find the exact same type of reverb.

4. Track 1 and Track 2 music must be syncable. If there were any transfers that caused Track 1 to possibly vary in speed, or if any time compression or tempo variation was done to the music in Track 1, then forget it. You should be able to tell this when you sync them up in step 1 above.
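
The gain-matching part of step 1 can be sketched numerically. This is only an illustration under simple assumptions: the two sections are already lined up sample for sample, they contain the same music with no voice, and they differ only in level. The names `matching_gain`, `track1_section` and `track2_section`, and the synthetic 220 Hz tone, are hypothetical.

```python
import math

def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def matching_gain(track1_section, track2_section):
    """Linear gain to apply to Track 2 so its level matches Track 1's voice-free section."""
    return rms(track1_section) / rms(track2_section)

# Hypothetical voice-free section: the same tone at two different levels.
n = 800
track2_section = [0.8 * math.sin(2 * math.pi * 220 * i / 8000) for i in range(n)]
track1_section = [0.5 * s for s in track2_section]  # same music, about 6 dB quieter

gain = matching_gain(track1_section, track2_section)
gain_db = 20 * math.log10(gain)

# Apply the gain to Track 2, then subtract it from Track 1 (invert and mix).
matched = [gain * s for s in track2_section]
leftover = [a - b for a, b in zip(track1_section, matched)]
leftover_rms = rms(leftover)
print(gain, gain_db, leftover_rms)
```

An RMS ratio only captures an overall level difference. Any EQ, reverb or speed difference from steps 2 to 4 would still show up as leftover music after the subtraction.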

The math behind the "theory" is very simple. Track 1 = music + voice, Track 2 = music. If you reverse the phase on Track 2 and add (i.e. mix) it to Track 1, then the math becomes
Music+Voice-Music=Voice
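
A minimal sketch of that subtraction, assuming the ideal case where the music in both tracks is sample-for-sample identical. The two sine tones standing in for music and voice, and the 8 kHz sample rate, are arbitrary choices for illustration.

```python
import math

SAMPLE_RATE = 8000  # Hz; arbitrary for this example

def tone(freq_hz, n, amp):
    return [amp * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(n)]

n = 800
music = tone(220.0, n, 0.5)   # stand-in for the music bed
voice = tone(330.0, n, 0.3)   # stand-in for the voice

track1 = [m + v for m, v in zip(music, voice)]  # music + voice
track2 = list(music)                            # music only

# "Reverse the phase" of Track 2 (flip the sign of every sample), then mix.
inverted = [-s for s in track2]
recovered = [a + b for a, b in zip(track1, inverted)]

# How far the result is from the original voice, sample by sample.
max_error = max(abs(r - v) for r, v in zip(recovered, voice))
print(max_error)
```

The error here is at floating-point rounding level only because the two copies of the music are exactly equal, which is the condition the numbered list above is about.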

In practice, to make the theory work the music must be equal in all aspects. That will be the difficult part, but not impossible. I've also used this approach in a couple of different real-world situations where I didn't have a total recall mix available but had the instrumental and the instrumental+voice mix. In one, they said the voice wasn't loud enough, so I reversed the phase of the instrumental and blended it into the instrumental+voice mix; raising the volume of the inverted instrumental lowered the music level. On the opposite side of the coin, I had a situation where they said the voice was too loud. So I synced up the music with the mix and raised the level of the music track until they were happy with the voice-to-music level.
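
The rebalancing trick can be written as one weighted sum. The sketch below assumes, as in the ideal case, that the instrumental matches the music inside the full mix exactly; the function name `rebalance`, the blend factor `k` and the sine-tone stand-ins are hypothetical.

```python
import math

def rebalance(full_mix, instrumental, k):
    """Add k times the instrumental to the full mix.

    k = -1 removes the music entirely, -1 < k < 0 lowers it
    (a partly blended-in inverted instrumental), k > 0 raises it.
    """
    return [f + k * m for f, m in zip(full_mix, instrumental)]

n = 800
music = [0.5 * math.sin(2 * math.pi * 220 * i / 8000) for i in range(n)]
voice = [0.3 * math.sin(2 * math.pi * 330 * i / 8000) for i in range(n)]
full_mix = [m + v for m, v in zip(music, voice)]

# Voice not loud enough: blend in the inverted instrumental at half level,
# leaving the music at 0.5x (about 6 dB down) and the voice untouched.
voice_up = rebalance(full_mix, music, -0.5)

# Voice too loud: add more of the instrumental, leaving the music at 1.5x
# (about 3.5 dB up) and the voice untouched.
voice_down = rebalance(full_mix, music, 0.5)
```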


Red
Erk wrote on 9/21/2004, 11:11 AM
farss,

What's ADR?

Greg
Former user wrote on 9/21/2004, 11:17 AM
Additional Dialogue Replacement

or

Automatic Dialogue Replacement

Basically, people redoing the VO.

Dave T2
Spot|DSE wrote on 9/21/2004, 11:20 AM
So lemme get this straight....
Live performance in a live room....
CD of different performance/same song.....
Applying CD against room, in reverse phase, gonna kill the room noise?
I gotta hear this.
Rednroll wrote on 9/21/2004, 11:25 AM
Spot,

Did you read something I didn't? Of course that scenario wouldn't work, because music doesn't equal music in your scenario.

Here's what I read from the original post and the information I'm going off of. I didn't see any further posts from the original poster, so either your information is presumed or you know something further that I don't.

"I have voice recorded over the top of music playing in the back ground. I also have a copy of the music."

So I've got to hear where you got the information of "Live performance in a live room....
CD of different performance/same song...."

If you can point me to where you read that information, then I will totally agree it's "impossible".
Spot|DSE wrote on 9/21/2004, 1:17 PM
Nope, after re-reading only the original post... you're right. I got caught up in the other stuff.
Rednroll wrote on 9/21/2004, 1:24 PM
Does it remind you of the time in elementary school when you sat in class in a big circle and the teacher whispered something into the first student's ear and then the message got passed around the circle? :-) Funny how that still holds true today, huh?
rmack350 wrote on 9/21/2004, 2:09 PM
I don't think it's a matter of misreading the initial post. Just interpretation. From the post I imagined an interview in a place that had music playing in the background. You won't phase-cancel that out.

What RednRoll is picturing is a mixdown of voice and music. In that case you could copy the audio to a new track, invert the phase and make sure everything is lined up. Then it works. I just tried it. But if that was the problem then WolfBass could just turn off the background track and rerender.

Rob Mack
winrockpost wrote on 9/21/2004, 2:19 PM
works, just tried it.
Rednroll wrote on 9/21/2004, 3:19 PM
"But if that was the problem then WolfBass could just turn off the background track and rerender"

Well, you're also presuming that he has the background tracks, but what he said was, "I have voice recorded over the top of music playing in the back ground. I also have a copy of the music." Nowhere did he say he has the background tracks separate from each other. That's exactly the same scenario I outlined in my first post's example of what I've done in the past. When you say you have a "copy" of the music and you're working in the digital world, copies are usually identical; that's what my advice is presuming. I may be wrong with my presumption, but I'm only working off the information that was given. You may be 100% correct with your interpretation; we won't know unless Wolfbass replies with further details of what both source materials are.

He further asked, "has anybody got any comment or real life experiences regarding this?"

Yes, I do, and my advice outlined how to do it and the boundaries within which it can be done.
farss wrote on 9/21/2004, 4:01 PM
Red et al,
as we haven't heard back from Wolf it's all a bit of idle speculation. But I'm assuming the problem goes something like this.
Say at a wedding, someone is making a speech but the DJ is playing some music at the same time. So what Wolf's got is music mixed with speech, both affected by room acoustics, and who knows what has happened to the music.
Now he's got clean music from the CD the DJ was playing, but trying to recreate that to match what was recorded at the same time and in the same track as the speech is going to be very tricky. As I said before, technically it could be done; practically, as I've never tried, I'd imagine it could get very messy to the point of impossible.

About to try something similar, but I (as did you) have clean music on both tracks so I've got a chance. Still have to match up levels and maybe EQ, but at least I shouldn't have phase that's wandering all over the place.

Bob.
Rednroll wrote on 9/21/2004, 4:22 PM
Yes, your speculation is pretty much the same as Spot's, and I agree that in that scenario it will be virtually impossible; actually, I'll go further and say that it is impossible. Besides the ambience differences, you have a constantly changing ambience in your scenario, with people moving around causing different reflections and absorptions. Not to mention a different playback speed of the music: CD players vary slightly in playback speed, so you will get phase alignment problems.
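
To see why even a tiny speed difference matters, the arithmetic below converts a clock error into accumulated drift in samples. The 100 ppm figure is purely a hypothetical value for illustration, not a measured property of any player, and the function names are made up for this sketch.

```python
def drift_samples(sample_rate_hz, seconds, clock_error_ppm):
    """Samples of accumulated misalignment after `seconds` of playback."""
    return sample_rate_hz * seconds * clock_error_ppm * 1e-6

def seconds_until_one_sample(sample_rate_hz, clock_error_ppm):
    """How long it takes for the two copies to slip by one whole sample."""
    return 1.0 / (sample_rate_hz * clock_error_ppm * 1e-6)

# Hypothetical: 44.1 kHz audio and a 100 ppm speed difference.
drift_after_minute = drift_samples(44100, 60, 100)
slip_time = seconds_until_one_sample(44100, 100)
print(drift_after_minute, slip_time)
```

Under that assumption the tracks slip by a whole sample in well under a second, so a single fixed offset could not keep them aligned for the length of a speech.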

In my experience of music and commercial production, here are a couple of scenarios that could also be going on. In commercials it is general practice for the studio that created the commercial to lay off splits of the mix to DAT or CD-R. This way the commercial can be transferred between studios without having to worry about system compatibility when further work is done elsewhere. It is common to lay off separate tracks of 1. Full mix, 2. Sound FX mix, and 3. Music mix. So in my scenario they could have the full mix and music mix tracks and be able to recover the voice on top of the mix using the method I described.

Scenario 2 is they could have a music CD that has a full mix and an instrumental mix. This is another common practice by artists who release CD singles, and they are virtually trying to ask the common question, "can I extract the voice from the music?" In this scenario, the answer is "Yes" because you virtually have an identical music bed to subtract from the full mix. In my personal experience I had an artist that had a final mix of a song and an instrumental of that song, which they used to sing/rap over for live performances. They didn't have an acapella mix, and didn't have access to the original mix to do a recall so they could just solo the voice. They came to me and wanted an acapella which I told them I could do for them as long as they had the full mix, and an instrumental mix where the music was identical.

Who knows what the real scenario is, but as I mentioned before I do disagree with the posts that say it's "impossible", especially with the information that's been given. It is possible with the right elements; I've done it, and others are showing they're able to do it also. It seems like the original poster had done some research on the "theory", so to me this means they probably have some background from reading they've done, which is why I disregarded your scenario. Maybe they really don't even have any of the parts they originally stated and were just asking the question to see if it's possible in real-world "practice". Well, I'll stick to my post then and give them the correct answer, which is: in practice, yes, it is possible. I'm willing to bet the Sony Noise Reduction works on the same premise as my original advice, presented to you in a nice GUI with a noise print capture and a noise reduction amount slider.
Rednroll wrote on 9/21/2004, 5:35 PM
Now on the lighter side, I wish you could do similar tasks with video. If this would be possible I think Sony would have implemented my garment remover video plugin I suggested back in Vegas Video v2.0. Could you imagine? You take a Britney Spears or J-Lo music video, then you go out and purchase the same clothes they're wearing in the video and then phase reverse the clothes you purchased and blend/mix them with the original video, and then you are left with Britney and J-Lo dancing around naked on the screen.

Now I could make some real money with a feature like that.
I can only dream. :-)
farss wrote on 9/21/2004, 7:10 PM
Nice try Red,
but all joking aside there is an invert FX for video and doing much the same trick as you would for audio can be very useful.
I've used it a couple of times when I'm trying to line up video to pixel accuracy. Once you've got everything aligned you turn it off, naturally, but as it lets you see a fringing effect when something's out of alignment it's quite handy.

Bob.
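
The same invert-and-mix idea can be shown on a single row of pixel values. This sketch assumes 8-bit values and a plain 50/50 blend of one layer with the inverted other layer; it makes no claim about how the Vegas invert FX or compositing is implemented, and the function names and pixel values are made up.

```python
def invert(row):
    """Invert 8-bit pixel values."""
    return [255 - p for p in row]

def blend_half(a, b):
    """50/50 mix of two rows."""
    return [(x + y) / 2 for x, y in zip(a, b)]

row = [10, 10, 200, 200, 10, 10]   # a bright bar on a dark background
aligned = list(row)                # second layer, perfectly lined up
shifted = [row[0]] + row[:-1]      # second layer, one pixel to the right

check_aligned = blend_half(row, invert(aligned))
check_shifted = blend_half(row, invert(shifted))
print(check_aligned)
print(check_shifted)
```

When the layers line up, every pixel lands on the same mid grey. With a one-pixel shift, the edges of the bar stand out as a light and a dark fringe, which is the visual cue described above.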
wolfbass wrote on 9/21/2004, 7:22 PM
Spot: You assumed correctly - It was a live situation. Rednroll: Sorry, not enough information on my part.

All: Thanks for the tips as usual esp John Cline :)

I'll try a different path! :)

Andy