32 bit floating point pixel format

drmathprog wrote on 1/31/2008, 5:54 AM
I've read Glenn Chan's article explaining that certain Vegas filters can yield different results under 32 bit floating point vs. 8 bit pixel format. That's certainly been my experience so far; in fact, some of the filters seem not even to work under 32 bit. My question is, is this understandable and correct behaviour, or are there in fact bugs? Coming from the audio world, higher resolution audio can preserve audio details and fidelity better than lower resolution, but in my experience the outcomes are close. In my limited experience with Vegas plug-ins and 8 bit vs. 32 bit float, the results are nearly always significantly different, and often widely, outrageously different. This seems illogical.

Comments

Bill Ravens wrote on 1/31/2008, 6:32 AM
drmathprog...

your name implies you know something about math. you should know, then, that 32 bit floating point math requires different algorithms than 8-bit precision. I think it's safe to say that the Vegas filters that don't work with 32-bit float are simply filters that the Vegas development team haven't modified to work with 32 bit float, yet. Give them time. Seems perfectly logical to me.

For anyone not really getting what 32-bit float is about, read this thread, in particular, the posting by John Miller...
http://www.dvinfo.net/conf/showthread.php?t=103298&page=2
GlennChan wrote on 1/31/2008, 10:49 AM
It comes down to how Vegas is designed. Flipping between 8-bit and 32-bit changes a number of behaviours. So does changing the compositing gamma from 2.222 to 1.000.

2- Compositing gamma:
Vegas can send either gamma corrected values to filters, or linear light values.
Personally I think Vegas should *always* send gamma corrected values to filters (this would be a lot less confusing)... but it doesn't do this.

3- If you want to do processing in linear light, it's more convenient (though not always necessary) to convert all values to a range with black level at 0. So this is why codec behaviour changes in 32-bit mode... HDV for example will decode to a 0.0f-1.0f range (in floating point the range is 0-1), which looks like computer RGB to the user. As the user, just pretend that these codecs convert to computer RGB.

However, Vegas is confusing in that not all codecs decode to studio RGB range in 32-bit mode.

4- At the end of the day, it is possible to get the code to effectively do the same operations except at higher precision. Vegas doesn't do this by default. But by sandwiching filters with the right levels conversions (studio <--> computer RGB) and gamma/linear conversions (between gamma corrected <--> linear light)... you can get Vegas to behave this way.

5- It's not so much the math of it as the design of Vegas. Personally I think that the design is unintuitive. On top of that, you have to manually wrangle all your levels and color space conversions (which requires you to know what color space conversions you have to make).

5b- It's not intended that you switch between 8-bit and 32-bit. (Even though the end user might do that for better performance before final rendering.)

5c- In other programs, color space conversions are handled for you. After Effects handles all this stuff for you; however with After Effects, you have to deal with Quicktime color mismanagement and manually deal with codecs that decode to studio RGB levels.

6- To simplify your life...

Avoid the 1.000 compositing mode. It will be less confusing and you'll generally get better performance.

If you must use 32-bit mode, make sure that all your levels are correct. You'll have to manually perform any necessary conversions between studio and computer RGB. Unfortunately you have to consult the table in my article.

Convert all formats to a common space... I would recommend converting everything to computer RGB if working in 32-bit mode. So for any studio RGB codec, apply a "studio RGB to computer RGB" color corrector preset.

When rendering out your project... you may need to do another conversion here. If the codec you are encoding to wants to see studio RGB (but everything in your project has computer RGB levels), then you have a lot of choices:
A- You can nest your project into a new .veg. In the new .veg, apply a "computer RGB to studio RGB" color corrector preset to the clip or video preview FX.
B- Don't nest. Apply that preset on the video preview FX level. This can be dangerous if you forget to take it out afterwards.
C- Render your project to a new track, using the Cineform codec. The resulting file will have non-standard levels, but that won't be a problem for us in this case (it might if you were to use that file outside Vegas). From the Cineform codec, render again into your target format/codec. If that codec wants to see studio RGB levels, then you'll need to convert your levels from computer RGB to studio RGB.

In practice, rendering in 32-bit can be very slow. So that might be why you do C; with that single render you can render all the versions you have to make.
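The levels and gamma conversions from points 4 and 6 can be sketched roughly as below. This is only an illustration, not Vegas' actual code: it assumes a pure power-law gamma of 2.222 and the usual 16-235 studio range, and all function names are made up for the example.

```python
# Sketch only: pure power-law gamma and a 16-235 studio range are assumptions.
GAMMA = 2.222  # the compositing gamma value mentioned above


def studio_to_computer(v):
    """Studio RGB (black 16, white 235) -> computer RGB (black 0, white 255)."""
    return (v - 16.0) * 255.0 / 219.0


def computer_to_studio(v):
    """Computer RGB (black 0, white 255) -> studio RGB (black 16, white 235)."""
    return v * 219.0 / 255.0 + 16.0


def gamma_to_linear(v):
    """Gamma corrected computer RGB (0-255) -> linear light (0.0-1.0)."""
    return (max(v, 0.0) / 255.0) ** GAMMA


def linear_to_gamma(x):
    """Linear light (0.0-1.0) -> gamma corrected computer RGB (0-255)."""
    return 255.0 * max(x, 0.0) ** (1.0 / GAMMA)


def sandwich(filter_fn, linear_value):
    """Feed a filter written for gamma corrected studio RGB from a linear
    light pipeline: convert before the filter, convert back afterwards."""
    studio = computer_to_studio(linear_to_gamma(linear_value))
    return gamma_to_linear(studio_to_computer(filter_fn(studio)))


# A do-nothing filter should come back (almost) unchanged.
print(sandwich(lambda v: v, 0.18))
```

In Vegas itself the equivalent would be putting the matching color corrector presets before and after the filter in the FX chain.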

Whew...
drmathprog wrote on 1/31/2008, 11:37 AM
"your name implies you know something about math". Thanks Bill, my dissertation adviser would be thrilled to hear that! :-)

I feel like I'm wading into the big muddy here. All I started out trying to do is convert some 1980's family video to DVD to preserve them. On several tapes, the VHS images are faded and/or colour shifted. From my experience in digital audio, I thought it might be helpful to process in 32 bit float, since these clips require a fair amount of plug in processing. In digital audio, processing in higher resolution and down-dithering at the end is often helpful. What is going on with video in Vegas with respect to 8 bit vs. 32 bit float video seems fundamentally different, but maybe it just seems that way to a video novice like me.
Bill Ravens wrote on 1/31/2008, 12:13 PM
The reference you cite for audio is the BITRATE that audio is converted from analog to digital. It is the rate at which the analog audio signal is sampled and converted to digital. Yes, when you downconvert 24 bit audio, for example, to 16 bit, dithering is needed to help with roundoff errors.

Bitrate is very different from the number of bits used in a digital calculation. I suppose 8-bit video is somewhat of a misleading term, since the mathematical precision of an 8-bit calculation is probably more like 16 bits of precision in fixed point calculations, which is what 32 bit windows uses, natively. 32 bit float refers, more accurately, to 32-bit mathematical precision with floating point calculations, rather than fixed point. As I understand it, floating point doesn't require dithering, however, crunching a 32-bit floating point number in Windows 32 is substantially harder for an Intel 32-bit processor than crunching a 16-bit number.

Bits/Bytes....it all gets so so ambiguous, doesn't it?
farss wrote on 1/31/2008, 1:00 PM
Actually bitrate is not what he's referring to in audio, it's bit depth.
Bitrate / sample rate in audio can be 44.1K, 48K etc. For DV it's fixed at 13.5MHz. It's fairly common in audio to do the processing in 32bit float, SF does this.

Getting back to the original problem of processing VHS tapes. For some problems it's better to deal with the video in the analogue realm before it's digitised. The A->D converters are typically only outputting 8 bit, though some have 14 bit processing, so making adjustments in the capture device (if it supports it) or using proc amps before the conversion could give better results.

Bob.
Bill Ravens wrote on 1/31/2008, 1:04 PM
DOH!
Thanx for correcting me, Bob.
See what I mean?
drmathprog wrote on 1/31/2008, 1:11 PM
I could be wrong, but I think the way it goes in audio is as follows: the two key parameters are "bit depth" and "sample rate". "Sample rate" measures the number of samples per unit of time. Common values are 44,100 samples per second (CD quality), 48000 samples per second, etc. "Bit depth" measures the number of bits used to represent each sample. In fixed depth formats, 16, 20 and 24 bits per sample are common. CD quality, for example, is 44.1K 16 bit.

It is my understanding that "8 bit video" refers to the per-pixel sample depth, wherein 8 bits are used for each pixel to store each of the video channels; i.e., in YCbCr 8 bits store Luma, 8 bits store R-Y and 8 bits store B-Y. To complete the analogy, I believe in video sample rate is a little trickier. In NTSC, I believe the sampling frequency is 13.5 MHz, which I think means that pixels are sampled from the video scan line at the rate of 13.5M samples per second. If this is incorrect, I hope someone will kindly correct me.
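As a rough check on the arithmetic above, here is a small sketch. The 525-line, 30000/1001 frames per second NTSC timing and the 2-channel CD layout are assumptions added for the example; the variable names are just for illustration.

```python
from fractions import Fraction

# Audio: sample rate x bit depth x channels gives the raw data rate.
cd_bits_per_second = 44_100 * 16 * 2  # CD quality, stereo assumed

# Video: 13.5 MHz luma sampling against the NTSC line rate.
SAMPLE_RATE_HZ = 13_500_000
line_rate_hz = 525 * Fraction(30000, 1001)  # lines per second (525-line system)
samples_per_line = SAMPLE_RATE_HZ / line_rate_hz  # includes horizontal blanking

print(cd_bits_per_second)
print(samples_per_line)
```

Only part of each scan line carries picture; in 601-style digital video 720 of those samples per line are the active ones, which is where the 720 pixel width of DV comes from.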
GlennChan wrote on 1/31/2008, 3:39 PM
Check out johnmeyer's posts on VHS restoration... that is likely the most important thing to do by far. Use the search feature in this forum.
One of his posts: http://www.sonycreativesoftware.com/forums/ShowMessage.asp?ForumID=4&MessageID=571891

In your case going 32-bit likely will not make a visible difference at all.

------------------------------------------------------
2- esoteric bit depth and precision stuff that you don't have to worry about:

An example of how the signal flows in Vegas:
Suppose the video is digitized into an 8-bit Y'CbCr format (DV uses this).
Y' stores the luma info, with black level at code 16 and white level at code 235. e.g. if you capped the lens on the camera, all the values would be around code 16.
Cb and Cr store chroma information (think of it as the color information).

The Vegas DV codec will convert from Y'CbCr to studio RGB. Studio RGB is RGB color space with black at 16 and white at 235.
The conversion can be represented as a 3x3 matrix. The results tend to have decimal places... these will get rounded off and you'll have rounding error here. I believe the rounding is done in some intelligent fashion, i.e. not truncation, since the DV codec holds up very well across many many generations. *There is also quantization error in the DCT transform of the Vegas DV codec. It looks like it's handled very well.

If you look at the formulas for converting Rec. 601 or Rec. 709 Y'CbCr <--> RGB, there are some really ugly numbers in there that cause more rounding error than more intelligent numbers would. e.g. Y' has a scale from 16-235, but Cb and Cr have a scale from 16-240. In Charles Poynton's opinion, that was simply a bad idea.
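To make the rounding concrete, here is a minimal sketch of a Y'CbCr to studio RGB conversion using the Rec. 601 luma coefficients. It is not the Vegas DV codec's code; the function name and the sample pixel are made up, and the 3x3 matrix is written out term by term rather than as a matrix.

```python
# Rec. 601 luma coefficients. Y' spans 16-235 (219 steps); Cb and Cr span
# 16-240 (224 steps, centred on 128).
KR, KB = 0.299, 0.114
KG = 1.0 - KR - KB
CHROMA_SCALE = 219.0 / 224.0  # rescale chroma excursion to the luma excursion


def ycbcr_to_studio_rgb(y, cb, cr):
    """Return unrounded studio RGB (black 16, white 235) from 8-bit Y'CbCr codes."""
    pb = (cb - 128) * CHROMA_SCALE
    pr = (cr - 128) * CHROMA_SCALE
    r = y + 2.0 * (1.0 - KR) * pr
    b = y + 2.0 * (1.0 - KB) * pb
    g = (y - KR * r - KB * b) / KG
    return r, g, b


exact = ycbcr_to_studio_rgb(100, 90, 160)  # an arbitrary sample pixel
stored = tuple(round(v) for v in exact)    # what an 8-bit pipeline can keep
print(exact)
print(stored)
print([s - e for s, e in zip(stored, exact)])  # rounding error per channel
```

The 219/224 factor is the "ugly number": because luma and chroma use different scales, almost no input lands on a whole number.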

Suppose you apply FX onto your clips. Again you pick up rounding error here. The old levels FX generally does something like f(x) = gain * x + some constant. It tends to generate decimal places. The filter will *truncate* these values, not round them. So you have the really nasty kind of rounding error going on there.

The cross dissolve had that behaviour too (not sure what it does now).
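The difference between truncating and rounding can be seen with a small sketch of a levels-style operation, f(x) = gain * x + constant. The gain and offset values here are arbitrary assumptions, and this is not the actual filter code.

```python
def levels(x, gain=1.1, offset=-3.0):
    """Levels-style adjustment; the gain and offset are arbitrary examples."""
    return gain * x + offset


# Look only at results that land inside the 8-bit range, so clipping
# does not muddy the comparison.
exact = [levels(x) for x in range(256)]
in_range = [e for e in exact if 0.0 <= e <= 255.0]

trunc_err = [int(e) - e for e in in_range]    # truncation: always <= 0
round_err = [round(e) - e for e in in_range]  # rounding: centred on 0

mean_trunc = sum(trunc_err) / len(trunc_err)
mean_round = sum(round_err) / len(round_err)
print(mean_trunc, mean_round)
```

Truncation errors all point the same way, so they add up across a chain of filters (a slight darkening each time), whereas rounding errors tend to cancel out.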

Ok, now we render the video out into some format. If rendering to Windows Media Player, we'll need to convert from studio RGB levels to computer RGB levels. Yep, you guessed it- more rounding error. (The Color Corrector filter will do a better job here than levels, since the color corrector doesn't truncate.)

Long story short, Vegas' 8-bit pipeline can be prone to *a lot* of rounding error. And I don't believe it does any dithering tricks.

*decimal places is my lazy shorthand for saying that the calculations produce results that can't be represented as an 8-bit integer.

**I believe because of the Video for Windows interface, filters receive/output 8-bit values. Internally, they can process at whatever bit depth they want.

2b- But you know what? The human visual system has particular limitations... you probably won't see any banding artifacts unless the source has large gradients and is noise-free.

As a counterpoint, the gradient generator does show some banding artifacts.

*Assuming an ideal display, not a 6-bit LCD consumer panel (because that can introduce banding artifacts; though it also depends on how good the dithering is... there are many techniques).

2d- Doing all the intermediate calculations above with 32-bit floating point numbers will mostly improve things. But there are some cases where precision is not perfect (e.g. off by 1). I think that situation is when rendering to the 32-bit uncompressed format and bringing it back into Vegas (can't remember if that was the case).

2e- In 32-bit, there is a ?bug/design flaw? where linear light values are sent to 8-bit transitions (3rd party transitions generally only support 8-bit in/out). There is *very* noticeable and extreme quantization error (more than +-10 I think).
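A rough sketch of why squeezing linear light values through 8 bits hurts so much. This assumes a pure power-law gamma of 2.222 and simple rounding at the 8-bit stage; what Vegas actually does internally may differ.

```python
GAMMA = 2.222  # assumed pure power law


def roundtrip_through_8bit_linear(code):
    """Gamma corrected 8-bit code -> linear light -> 8-bit -> back to gamma."""
    linear = (code / 255.0) ** GAMMA
    linear_8bit = round(linear * 255.0)  # the lossy step: linear light in 8 bits
    back = 255.0 * (linear_8bit / 255.0) ** (1.0 / GAMMA)
    return round(back)


errors = [roundtrip_through_8bit_linear(c) - c for c in range(256)]
worst = max(errors, key=abs)
print(worst)
```

The damage is concentrated in the shadows: many dark gamma corrected codes collapse onto the same one or two linear 8-bit values, so they cannot be told apart afterwards.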
------------------------------------------------------

In practice: Don't worry about this precision / bit depth stuff unless your video looks wrong.

For VHS restoration, check out johnmeyer's posts.