Recording from SPDIF

ibliss wrote on 1/23/2004, 7:39 AM

"sync source in the recording software is set to external, cannot see where to set that?"

Any digital playback device has a clock running at a frequency corresponding to the sample rate such as 44.1kHz (CD players, Minidisc).

When you record from one digital device to another (let's say in this case the CD player to the 410), the clocks of the two devices must be the same or the recording won't work properly - you will end up with a less than perfect copy.
Imagine an elevator - the car is the source (CD) and the hallway is the destination (410), the doors are the digital clock and the people the digital information. If the elevator and hallway doors do not line up (ie digital clocks are out of sync) and the car stops 1/3 short of the doors, then you lose 1/3 of the people coming out of the lift (and get sued).

Sorry for that stupid analogy....
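
To put rough numbers on the same idea, here is a small Python sketch. The 50 ppm clock offset is an assumed figure chosen purely for illustration, not a measured value for the 410 or any CD player. It estimates how many samples two free-running clocks drift apart by; each slipped sample has to be dropped or repeated somewhere, which is what you hear as a click.

```python
def samples_slipped(nominal_rate_hz, offset_ppm, seconds):
    """Samples gained or lost when one clock runs offset_ppm away from the other."""
    return nominal_rate_hz * (offset_ppm / 1_000_000) * seconds

# Assumed figures for illustration only: a 44.1 kHz source and a receiver
# whose own clock is 50 parts per million away from it.
slip_per_second = samples_slipped(44_100, 50, 1)
slip_per_hour = samples_slipped(44_100, 50, 3600)
print(f"{slip_per_second:.3f} samples/s, {slip_per_hour:.0f} samples/hour")
```

When the receiver locks to the clock carried in the incoming stream, the offset is zero by construction and nothing slips.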

Anyway, when you are simply playing back audio from your PC, the 410 should be running to its own internal clock.

When the 410 is recording from the digital input, it should be running to the digital clock from the source device. You don't need extra cables, as the clock signal is embedded in the audio stream. But you may well need to tell the 410 to sync to the external source, and there is an option in the 410's volume control panel (the one sitting in the system tray).

Open up the M-Audio control panel and click on the 'hardware' page.

to quote the manual:
"SYNC SOURCE – This field allows you to choose between the FireWire 410’s INTERNAL clock and an EXTERNAL clock source. INTERNAL selects
the incoming clock from the 1394 bus as set by your audio software. Use EXTERNAL when you wish to record from the S/PDIF inputs, or
otherwise lock to the sample rate from an external S/PDIF source. "

This should sort you out.

also check this:
"S/PDIF INPUT – This field selects between TOSLink optical and RCA coaxial input source. Note that both optical and coaxial outputs are
available simultaneously, but only one input source may be used at a time."

It's possible that your playback device only outputs a clock signal when it is playing, in which case every time you stop it the 410 loses the external clock and Vegas then can't record because there is no valid digital signal. If this is the case (though it seems unlikely) try leaving the source in pause mode rather than stopping it.

Hope some of this helps.

farss wrote on 1/23/2004, 2:29 PM

ibliss,
I've had 30 years as an electronics engineer so clocking data I understand only too well, having loss of the clock hang the system kind of rattled me though. I think it's under control, I was finally able to record the first tape (only 23 to go).
As I understand it what I've recorded is an exact digital copy of what's on the tape. Now the levels on the tape seem woefully low, both Vegas and the meters on the DAT peak to about -12dB. By my calculations this means the data on the tape only has around 12 bits of resolution which isn't that great. I've tried just using Normalise in Vegas but the audio then seems to acquire a degree of digital harshness to it. That makes perfect sense to me, I'm no audio guru but I do understand digital signal processing.
Anyway I would imagine that before I convert to mp3 or even before putting this material onto CD I should get the level up to a more sensible level. I understand that to do this and get as clean a sound as possible I need some form of dithering. I have both Vegas and SF7, any tools in either of these I can use to bring the level up and apply dithering in the process?

This is only spoken word and much of it recorded a long time ago, so I'm not expecting a sonic masterpiece by any means but I'd still like it to sound as clean as possible.

By the way,
my deepest thanks for the help.

ibliss wrote on 1/23/2004, 3:07 PM

Using Sony audio plugins:

As you have Soundforge, it's always worth using the DC offset tool as a base point.

If you have the Noise reduction plugin, then this may be useful in removing some background noise - make this the next step.

If EQ is needed, set it as desired then move onto the next step.

Don't bother normalizing - use the Wave Hammer plug. Use the 'voice' preset as a starting point. I would set the compressor so that you are getting roughly 3dB to 4dB of gain reduction on peaks. Adjust the Threshold control until you are seeing this.
Next click on the 'Volume maximiser' tab. Set the output fader to -0.3dB. Then lower the threshold until you achieve the level increase/sound quality compromise that suits the material (you will get to the point where the audio sounds way too squashed/distorted etc - this is when you raise the Threshold again!).

This should be the last thing you do to the audio.

Using longer look ahead on both the compressor and volume maximiser is a good idea.
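
For anyone who wants to see how the threshold relates to the amount of gain reduction, here is a generic textbook hard-knee compressor curve in Python. It is only a sketch: the 4:1 ratio and the -12dB peak level are assumed values, and Wave Hammer's actual curve, knee and the ratio used by the 'voice' preset may well be different.

```python
def gain_reduction_db(peak_db, threshold_db, ratio):
    """Static gain reduction (in dB) of a hard-knee compressor at a given input level."""
    if peak_db <= threshold_db:
        return 0.0                      # below threshold nothing is reduced
    over = peak_db - threshold_db       # how far the peak is above the threshold
    return over - over / ratio          # the part of the overshoot that is removed

# Hypothetical settings: peaks at -12 dB, ratio 4:1, sweeping the threshold down.
for threshold in (-14.0, -16.0, -17.0, -18.0):
    reduction = gain_reduction_db(-12.0, threshold, 4.0)
    print(f"threshold {threshold:6.1f} dB -> {reduction:.2f} dB reduction on peaks")
```

With these assumed numbers, the 3dB to 4dB target is reached with the threshold about 4dB to 5dB below the peaks.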

Rednroll wrote on 1/23/2004, 3:48 PM

I'm not quite sure I'm following the problem, so please excuse me if I am misunderstanding. The program material is only peaking at -12dB, and basically you want to raise this level to near 0dB with fewer digital artifacts? Normalization seems to be the best tool to me. You state from your calculations that only 12 bits are being used from the initial recording. From my calculation 14 bits would be being used, thus 1 bit per 6dB of level if it's originally peaking at -12dB. At this level of processing, I'm having a hard time understanding why you are hearing digital artifacts, unless only a few of the peaks are reaching -12dB and the majority of the material is much less in level.

You might want to try recording the audio off of the DAT player using the Analog outputs and raising the gain +10dB to +12dB. This will give you your full 16 bit resolution, but will have side effects of a raised noise floor due to the increased gain stage of the DAT player, and also due to the additional D/A and A/D conversions. I'm not quite understanding the recommendation of using the volume maximizer. The volume maximizer will add digital aliasing just the same as normalization does, but would probably make it more noticeable because you are taking that digital artifact and compressing it, and thus increasing the level of it.

Another thing you might want to try is recording it digitally and then using the resample tool to convert to 24bit or 32bit resolution before you do the normalization process. Although, it seems to me that Sound Forge might be doing this for you already if you have the 32bit temp file enabled in the preferences.

I agree with the word clock advice, and it doesn't sound like you previously knew how to change your sound card's word clock to external when recording from the DAT. This is most likely the source of the digital artifacts, rather than the bit resolution. Or there is a possibility that the DAT player used to record the original audio didn't have a very high quality A/D converter, and when you do the normalization within Vegas it's making this bad A/D conversion more obvious to your ears. Another thing I would try is to do peak normalization within Sound Forge. I have been more pleased with the sound quality doing this offline in Sound Forge, than using the normalization switch in Vegas.
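
In case it helps to see what peak normalization boils down to, here is a minimal Python sketch. It is not how Sound Forge or Vegas implement it; the function names are made up for illustration and the -0.3dB target is simply borrowed from the output level suggested earlier in the thread.

```python
import math

def peak_normalize(samples, target_dbfs=-0.3, full_scale=32767):
    """Scale 16-bit integer samples so the largest peak sits at target_dbfs."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)                      # pure silence: nothing to scale
    target_peak = full_scale * 10 ** (target_dbfs / 20)
    gain = target_peak / peak                     # one gain for the whole file
    return [int(round(s * gain)) for s in samples]

def peak_dbfs(samples, full_scale=32767):
    """Peak level relative to full scale, in dB."""
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(peak / full_scale)

# Made-up samples peaking at about -12 dB relative to full scale.
quiet = [0, 4115, 8230, -8230, -2000]
loud = peak_normalize(quiet)
print(round(peak_dbfs(quiet), 2), round(peak_dbfs(loud), 2))
```

Every sample is multiplied by the same gain and rounded back to an integer, so the shape of the audio is unchanged and anything already in the recording (noise, converter harshness) comes up by the same amount.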

Just my 2 pesos,
Red

farss wrote on 1/23/2004, 6:04 PM

Thanks heaps guys,
some good input there.
As this material is basically already mastered and all I'm doing is converting it to LOW bit rate mp3 and mastering for CD I don't want to play around with it any more than necessary. I'm bringing it in digitally so I don't do anything to the noise level. What's a small pain is some of the tapes are 44.1 and others 48KHz but I'll work through that.
I soon sorted out the external clock sync issue, it seems M-Audio are having a bit of an issue with the clock causing things to go screwy if it goes away, mostly with Macs, but the latest drivers seem to have fixed that.

I think I'll try upsampling it to 24/96 in SF and then normalise and see how it sounds, I'd just like it as free of artifacts as possible as I'm sure the mp3 compression will only make things worse.

BTW, Red isn't 1 bit half the power which is 3 dB, 6 dB is half the VOLTAGE which equates to a quarter of the power. Thats how I arrived at the figure of 12 bits. I remember this as being a major source of confusion.

Rednroll wrote on 1/24/2004, 11:43 AM

"BTW, Red isn't 1 bit half the power which is 3 dB, 6 dB is half the VOLTAGE which equates to a quarter of the power. Thats how I arrived at the figure of 12 bits. I remember this as being a major source of confusion."

Let me clear up the confusion once and for all :-)

CDs have 96dB of dynamic range, which comes from 6x16=96. In other words, 6dB times 16 bits, thus 6dB per bit. Another thing wrong with your calculation is that you are talking about 6dB as it appears on a peak meter. Power is not a peak calculation, it is an RMS calculation. You're right though that 6dB is half the voltage and a quarter of the power by the equations.
1. dB = 10*Log(P1/P2)
2. dB = 20*Log(V1/V2)
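
As a quick check of those two equations, here is a short Python sketch (the function names are just for illustration). Half the voltage and a quarter of the power both land on roughly -6dB, while half the power is roughly -3dB.

```python
import math

def db_from_power_ratio(p1, p2):
    """Equation 1: decibels from a ratio of two powers."""
    return 10 * math.log10(p1 / p2)

def db_from_voltage_ratio(v1, v2):
    """Equation 2: decibels from a ratio of two voltages."""
    return 20 * math.log10(v1 / v2)

half_voltage = db_from_voltage_ratio(0.5, 1.0)
quarter_power = db_from_power_ratio(0.25, 1.0)
half_power = db_from_power_ratio(0.5, 1.0)
print(f"half voltage:  {half_voltage:.2f} dB")
print(f"quarter power: {quarter_power:.2f} dB")
print(f"half power:    {half_power:.2f} dB")
```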

Here's another way you can prove this to yourself if you don't trust my math. You will need Sound Forge or Cool Edit Pro which have a statistics feature.

Create a 16bit/44.1kHz, 1kHz at 0dB sinewave. Thus this is a 16 bit waveform with the maximum peak happening at 0dB. Select the waveform and go to Statistics. You will see you get a maximum sample value of 32,767 (0dB). You need to consider the samples are in 2's complement form. So if you do that math you get 2^16=65,536 possible sample values. Now divide by 2 because it's 2's complement representation and you get 32768. So you need the entire 16 bits to represent this number.

Now let's do the same thing for a -12 dB peaking sinewave and assume it is using 14 bits like I proclaim. Create a 16bit/44.1Khz, 1Khz at -12dB sinewave.
Check the Statistics in Sound Forge. Maximum Sample Value = 8,230 (-12dB). Now the math. 2^14=16,384. Divide by 2 for the 2's complement representation = 8,192. Pretty damn close to what Sound Forge is saying it is. Do the math for your 12 bit explanation and you see how far you are off.
2^12=4096, divide by 2=2048.
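
If you don't have Sound Forge handy, the same arithmetic can be sketched in Python. The function names are made up for illustration, and "effective bits" here just means log2 of the peak value plus one sign bit, which is an assumption about how to count rather than anything Sound Forge reports.

```python
import math

def peak_sample_value(peak_db, bits=16):
    """Largest sample magnitude of a signal peaking at peak_db relative to full scale."""
    full_scale = 2 ** (bits - 1) - 1          # 32767 for 16-bit two's complement
    return full_scale * 10 ** (peak_db / 20)

def effective_bits(peak_value):
    """Bits in use: log2 of the peak magnitude, plus one bit for the sign."""
    return math.log2(peak_value) + 1

for level in (0.0, -12.0):
    peak = peak_sample_value(level)
    print(f"{level:6.1f} dB: peak sample {int(peak)}, about {effective_bits(peak):.2f} bits")
```

The -12dB case comes out a hair over 14 bits because each bit is worth 20*Log(2), which is about 6.02dB rather than exactly 6dB.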

Did you get all that?

farss wrote on 1/24/2004, 2:09 PM

Red,
after much soul and brain searching I realised that I had fallen into the trap that I'd thought you had!
I'd gone...OK 1 bit is half the voltage and that's 3dB whereas the correct answer is 6dB. That means it's 6dB per bit so -12dB would be 16 - 2 = 14 bits resolution.

I learnt all this stuff when we sent "music" down telephone lines. I used to carry a real VU meter in a leather pouch to check levels on the telco's lines. What really confused things was the telco changed from 600 ohm lines to 150 ohm lines for better HF response but our meters were calibrated for 600 ohm lines so you always had to do a bit of quick mental arithmetic.
Man those were the days, 100V distribution amplifiers with a 100Hz to 10Khz response and anything better than 1% THD was considered good. You'll probably work out that I used to work for MUZAK! I'll understand if you never talk to me again. I guess it kept a few musicians employed but it left me with a life long hate of any music playing in the background while I'm working. Would you believe they pressed some of their stuff onto vinyl to give to clients at Xmas, what were they thinking I wonder, gee, "Merry Xmas, now your home can sound like an elevator" or what.

BTW getting back to my original problem. You guessed right. The harshness I was hearing is in the original recording. Turning the volume up just made it more noticeable. If I plugged the cans into the DAT and wound the phones volume up on that, it's just as bad.
This material would have come off 1/4'' and who knows how it was originally digitized, probably into a Fairlight system and then probably an analogue copy from that into the DAT.
Also some of the old folks on the recording have that sort of voice that has a certain harshness to it to start with.

Rednroll wrote on 1/24/2004, 7:33 PM

"after much soul and brain searching I realised that I had fallen into the trap that I'd thought you had!"

I find that a lot, that sometimes when having too much experience and knowledge in a subject, you can over analyze things and confuse yourself. Then you figure it out, smack yourself in the head and say "doooh!!, what an idiot I was".

Farss,
With your experience you might be able to help me out. Internally at the company I work at I've been having a debate of what dB level change is considered as "Twice as Loud". On our company websites I saw a lot of postings of what I considered to be incorrect, which then started the debate. I don't want to tell you which is my position, but I have some pretty solid information backing my position. The other viewpoint has some facts backing it, but they haven't fully convinced me to change my viewpoint, because their reference wasn't based on music studies. Their viewpoint is actually published a lot in audio books and all over the internet, which I consider to be wrong. So without too much further information, what dB change in level do you consider to be "Twice as Loud"? +6dB? or +10dB?

jester700 wrote on 1/24/2004, 7:51 PM

Historically, it was one bel. That's 10 dB

farss wrote on 1/24/2004, 8:13 PM

Red,
that's a mighty subjective question, it's up there with "what is twice as bright". I seem to recall the figure of 10dB being considered as sounding twice as loud but I seriously doubt if there was any real subjective testing conducted to justify that position.
I wouldn't even know how you go about running such a test given the way humans respond to sound. Much the same applies to perception of frequency response and distortion. I do know that 3dB is considered the smallest perceivable change in level.

I was lucky many years ago to attend a talk by Prof. Bose. One of my favourite anecdotes of his was how he was trying to evaluate what subjects thought of sound at different frequency responses and varying amounts of distortion. He invariably found they preferred settings that made the audio sound like what came out of the mantel radio of the day.

Rednroll wrote on 1/25/2004, 6:28 AM

"Historically, it was one bel. That's 10 dB"

Historically, "one decibel", was the smallest change in level that a human could hear under ideal conditions. So how would 10dB then make it "twice as loud"? I'm really looking for some solid information with backing studies to support them. You can read in almost every audio text book that 10dB is twice as loud, but that doesn't make it necessarily correct and I'll put up another post supporting that issue and further background information saying 6dB is twice as loud.

farss wrote on 1/25/2004, 6:49 AM

Red,
you might find this of some interest:

http://www.phys.unsw.edu.au/~jw/dB.html

Have a look down the bottom under phons and sones

The scaling system is based on 10dB being a perceived doubling of loudness.

Of course this may be based on unverified 'knowledge'
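
For reference, the sone scale mentioned on that page can be written as a one-line formula: 40 phons is defined as 1 sone, and each extra 10 phons doubles the sone value. A small Python sketch follows (the function name is made up for illustration, and the relation is generally quoted as holding from about 40 phons upward). It only restates the convention; it says nothing about whether the 10dB figure is right for music.

```python
def phons_to_sones(phons):
    """Sone scale convention: 40 phons = 1 sone, every +10 phons doubles the sones."""
    return 2 ** ((phons - 40) / 10)

for level in (40, 50, 60, 70):
    print(f"{level} phons -> {phons_to_sones(level):.0f} sones")
```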

Rednroll wrote on 1/25/2004, 7:15 AM

Here are some facts about the 6dB or 10dB debate to chew on. I'm on the 6dB side if that isn't obvious already.

Does Pressure and Voltage Directly Relate To Volume? For all practical purposes, yes. Although the ear is not exactly pressure sensitive, it is closer to being pressure sensitive than to being anything else.

Many studies have been done regarding how changes in perceived loudness (volume) relate to level changes in dB. All of these studies suffered from problems in getting exact figures in that the personal opinion of listeners had to be consulted to get data for the studies. The results of the studies did show that the perceived change in loudness varied greatly (by some 30% difference) depending on the starting volume, the frequency of the sound and the complexity of the wave. [Howard W. Tremaine, The Audio Cyclopedia, Second Edition, pages 17-18] Furthermore, the greenest recording student listening to music played through a console with a meter can quickly discover that an increase in level of a certain number of dB is much easier to hear than a reduction of level by the same amount of dB.

Someone (or some people) interpreted test data and made a generalization that a 10 dB change in level was twice (or half) the volume and many texts compound this useless and deceiving assertion. In practice the 6 dB change for full fidelity music represents twice (or half) the volume better than the confusing 10 dB. It has good scientific basis in that twice the pressure is an increase in 6 dB. The following is presented as factors supporting this:

1. For centuries, composers and conductors have used a formula that it takes four times the musicians to get twice the volume. If a composer/conductor wanted the violins to be twice as loud, they would specify 4 times as many. This is four times the power or a 6 dB volume increase.

2. If 10 dB is twice the volume, then 10 people talking (10 times the power) would be twice as loud as one person talking. Any day care worker can tell you that 10 kids are much louder than twice the volume of one kid. If 6 dB is twice the volume, four kids would be twice the volume of one kid; you might have a chance of a day care worker agreeing with you on this.

Further information why historically 10 dB may be wrong:

It is common in language for the manner that the word is used in society to be the final determining standard for what a word means and technical terms are no exception. When a technical term is being commonly used by professionals differently than the dictionaries say it should be, it is time for the dictionaries to be re-written. When you were little, it's possible that your teacher would not allow you to use the word "can" in asking a question. She may not have let you leave the room until you said, "May I go to the bathroom?" She was simply trying to get you to use words correctly. Most dictionaries of today allow you to use the word "can" when asking a question - it is too common in society for the writers of dictionaries to ignore. The same kind of thing has happened to the decibel.

When the original tests on this were made, it was the 1930's. They were reported in the first edition of the Audio Cyclopedia (c. 1936).
In the 30's you went to concerts and got recordings with the voice way up over the music (like 10 dB!). The ears were in compression in the midrange due to the loudness of the voice, and therefore you would need to have 10 dB reduction in the input level (sound pressure going into the ear) to get half the loudness. But when you played it for a while at the reduced volume, the ear would go into compression as you brought up the volume so 6 dB didn't sound twice as loud.

Now enter the 1950's where rock & pop had the voice much closer to the music track. Here 6 dB increases sounded more like 6 dB increases, more like "twice as loud."

I truly like the daycare analogy, this gives every listener something they can relate to. I'd like to further explain that analogy with the math. People are analogous with Power. So you use the equation: dB = 10*Log(P2/P1). So if you have 10 times the amount of people as you originally had you get: dB = 10*Log(10/1) = +10dB. So if you agree that 10 dB is "twice as loud", you are saying that it takes 10 people talking to only be twice as loud as one person talking. Now for the 4 people being twice as loud the math becomes: dB = 10*Log(4/1) = +6dB. I think 4 people talking at the same time will tend to be closer to "twice as loud" as 1 person talking.
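
The same arithmetic as a small Python sketch, for anyone who wants to plug in other head counts. It assumes the talkers are equally loud and that their powers simply add, as in the analogy above. The function names are made up for illustration, and the "dB per doubling" parameter is just a way to compare the two competing rules of thumb side by side, not a claim about which one is right.

```python
import math

def level_increase_db(n_sources):
    """Level increase when n equal sources replace one, assuming their powers add."""
    return 10 * math.log10(n_sources)

def loudness_ratio(db_change, db_per_doubling):
    """How many times louder a dB change is under a given 'dB per doubling' rule."""
    return 2 ** (db_change / db_per_doubling)

for people in (4, 10):
    change = level_increase_db(people)
    print(f"{people} people: +{change:.2f} dB, "
          f"{loudness_ratio(change, 6):.2f}x by the 6 dB rule, "
          f"{loudness_ratio(change, 10):.2f}x by the 10 dB rule")
```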

Rednroll wrote on 1/25/2004, 7:23 AM

"Red,
you might find this of some interest:
http://www.phys.unsw.edu.au/~jw/dB.html
Have a look down the bottom under phons and sones
The scaling system is based on 10dB being a perceived doubling of loudness.
Of course this may be based on unverified 'knowledge'"

Without even looking at this yet, I can tell you it is probably based off of the Fletcher-Munson equal loudness curves. Fletcher-Munson did their studies using sinewave tones to show the frequency response of the human ear at different levels. This is nothing comparable to music, which has many complex sinewaves all playing at the same time. They also have done studies using flat white noise. Well anyone who has looked at music under a spectral graph can tell you music is not flat, and Fletcher-Munson already proved that your ear response is not flat. In fact, I've noticed when looking at music spectral curves, it's almost an inverse of the Fletcher-Munson equal loudness curves. I believe this is part of the problem. People look at the Fletcher-Munson curves and say "Yep, it's been proven 10dB is twice as loud," and go on spreading the disease. They don't consider the source of the studies though, and apply it to music.

farss wrote on 1/25/2004, 2:06 PM

Red,
I'd agree with what's being said here. Before I read it I went out for a smoko and I was thinking "OK these tests were done when the main interest was telephony and AM radio". The music of those days had a much different spectral composition than today as well.
But these are very subjective evaluations, like I said before it's like what is twice as bright or twice as sour. Our sensors and brains are not mathematical in nature, I'd suggest if you did subjective tests with a group of people listening to music the results would be skewed by their preference for the music.
The only work I've done with people and SPL was in hearing protection and that's kind of relevant in a way. I think you'd agree you can make something sound a lot quieter just by dropping off the bottom end. The trap for humans is that they think it isn't that loud because their guts aren't being shaken yet they can be suffering serious hearing damage.

Rednroll wrote on 1/25/2004, 6:24 PM

I agree, it is subjective in a lot of ways, but this little fact is somewhat important to me. In my job I have to explain things to a lot of engineers who may not be all that audio savvy. So when I give them an explanation with a decibel value, it doesn't always really mean anything until I break it down into being twice as loud, or just over twice as loud and such. So having a common agreement as to what is twice as loud becomes important. I've actually decided I'm going to do my own study using various music types with different spectral content. I'm willing to change my viewpoint of 6dB being twice as loud, but I have yet to see anyone come up with anything concrete on the subject of 10dB being twice as loud. As my previous post describes, it is hard to quantify exactly how the ear perceives loudness, and the closest thing that can quantify it is that it is sound pressure sensitive. Seeing that +6dB SPL is a doubling of the sound pressure, this to me is quantitative enough to say 6dB is twice the volume.