WOT: 10bit vs 8bit camera codec

Serena wrote on 8/16/2010, 11:05 PM
10bit vs 8bit (http://www.cinematography.net/read/messages?id=183713) is a thread that I found quite interesting because it became a debate about the significance of 10bit vs 8bit recording from the camera, a matter separate from the number of bits for post processing. The basic question was about putting 10bit HDSDI into an 8bit Nanoflash, but a comment about dynamic range led to the interesting bit. The engineering argument, based on sensor thermal noise level, seemed sound enough in supporting the adequacy of 8 bits, but didn't convince people who trusted only their eyes (notably people experienced in using high end cameras). I'm still pondering.

Comments

farss wrote on 8/17/2010, 12:23 AM
From my meagre understanding, a question that might be considered is: what kind of 10 bit?
The 10bit Log system (Cineon etc) is a very different beast to 10bit as used on DigiBeta.

Bob.
Serena wrote on 8/17/2010, 2:05 AM
Yes. The argument is that if a camera sensor is noisier than -56dB (the EX is -54dB), then 8 bits is sufficient to resolve that noise, so adding more bits for greater resolution is ineffective and just loads up the codec processing. "All that a Log curve or S curve does is alter the distribution of the chosen dynamic range over the number of bits being used to record it. So the old rule of thumb that 8 bits equals 7 stops is these days pretty meaningless".
Unfortunately I get a bit confused working down the processing chain from sensor to output when the log curve is applied, taking into account the 11 stop dynamic range of the EX, the fact that the sensor is an analogue device, and that the processing is done at 14 bits. I get waylaid by thinking that dynamic range is about brightness and that the more bits there are, the better we can resolve the output curve produced by the processor. Probably ought to think through this some more before adding to my confusion!
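As a rough sketch of the arithmetic behind that argument: the textbook video quantization SNR (peak-to-peak signal over RMS quantization noise) works out to about 6.02 bits + 10.8 dB, and the -54 dB below is simply the EX figure quoted above. This is only an illustration of the comparison being made, not the camera's actual signal chain:

```python
import math

def quantization_snr_db(bits):
    """Textbook video quantization SNR: peak-to-peak signal vs RMS
    quantization noise (one LSB / sqrt(12)), roughly 6.02*bits + 10.8 dB."""
    return 20 * math.log10((2 ** bits) * math.sqrt(12))

sensor_noise_db = 54            # magnitude of the quoted EX sensor noise figure
for bits in (8, 10):
    q = quantization_snr_db(bits)
    verdict = "below" if q > sensor_noise_db else "above"
    print(f"{bits}-bit quantization noise sits at about -{q:.0f} dB, "
          f"{verdict} a -{sensor_noise_db} dB sensor noise floor")
```

If the quantization noise is already buried under the sensor's own noise, extra bits mostly record that noise more precisely; the counter-argument in the thread is about what happens once gamma curves, grading and compositing get involved.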
PeterDuke wrote on 8/17/2010, 2:10 AM
The 8-bit follows a gamma power law, does it not? Typically gamma = 2.2.
Bill Ravens wrote on 8/17/2010, 4:07 AM
Not sure if I can offer any words that would help, Serena. But, I'll try.
Dynamic range is not specifically "brightness" but the change of brightness over the image frame. The camera sensor has two qualities of performance (not sure what the technical names are). On the low end there is threshold noise or dark current (the output current with no light impinging on the sensor). On the high end there is saturation current, where so much light is hitting the sensor that its output is swamped and no further increase in light level impinging on the sensor creates additional output current. The sensor "sees" the image characteristics within this range of limit values. Because of the historic nature of film science, in which photographers are used to the terms f-stop and dynamic range, the industry equates dynamic range of the image to f-stops of light across the image plane.

I suppose we could talk about the rate of change of light values (rate of change of signal amplitude) over the image, but perhaps that's OT, because that's really a contrast issue. But the ability of the individual sensor pixels to react to a sudden change of scene brightness (as in looking at a shadow against the edge of a bright hot spot) is also a performance issue. Or, conversely, the ability of two adjacent sensor pixels to distinguish a change in R, G, B or Y value for a very small gradient in the image frame. There is no point in recording with 10 bit precision when the sensor itself is only good for 10 bit precision. As you probably know, the rule of thumb for detection systems is to have a sensor sensitivity 10 times greater than the variable being measured.

The argument you referenced wants to correlate the number of mathematical bits of sensor accuracy to overall detector dynamic performance, i.e. from pixel to pixel. I'm not so sure there is a pragmatic correlation as this argument suggests. But that's just my naive opinion.
farss wrote on 8/17/2010, 5:55 AM
"The 8-bit follows a gamma power law, does it not? Typically gamma = 2.2."

In general yes, though the "Cine Alta" cameras and a few others do let you put kinks in the curve. Mostly to record a little bit more of the specular highlight detail.

Bob.

Serena wrote on 8/17/2010, 6:10 AM
Bill, a good description. Yes, I was very loose with "brightness", for certainly it is relative and not absolute. Perhaps what I'm thinking about is the resolution of the sensor voltage. The sensor is a linear device, where the voltage is directly proportional to the number of photons collected (per pixel) over the exposure time (plus, as you say, thermal noise etc). OK to that point. With what resolution can that voltage be read? That quantising level, it seems to me, would determine the maximum bit depth that could be utilised. Clearly S/N is important for relative levels between adjacent pixels (say, recording nominally the same number of photons). Accordingly I guess the maximum resolution is limited by the level of signal noise. But here I'm out of my depth and have fallen into the trap of proving my ignorance!
PeterDuke wrote on 8/17/2010, 7:05 AM
Using a higher resolution analog to digital converter would allow you to resolve finer differences in light intensity.

There is a concept known as "difference limen" or "just noticeable differences" that applies to our senses. What it means is that there is no point resolving differences smaller than we can perceive. So you look at the range of intensity from overload to noise level and break it up into JNDs. That tells you how many steps you need and hence how many bits in the data sample. Eight bits would give you 256 steps, while 10 bits gives 1024. Since JNDs are usually proportional to the intensity, a logarithmic or near equivalent law is used rather than a linear law. How many steps are too many? I don't know, but my guess is that 8 bits was initially settled on since it is one byte and is "about right". In order to conserve data space it is likely that they have erred on the side of too little rather than too much.
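As a rough illustration of that JND arithmetic (the 1000:1 contrast ratio and the 1-2% Weber fractions below are illustrative guesses, not measured values):

```python
import math

contrast_ratio = 1000            # overload level / noise level (illustrative)
for weber in (0.01, 0.02):       # assume a JND of 1% or 2% of intensity
    steps = math.log(contrast_ratio) / math.log(1 + weber)   # number of JND-sized steps
    bits = math.ceil(math.log2(steps))
    print(f"Weber {weber:.0%}: about {steps:.0f} steps, i.e. roughly {bits} bits if log-coded")
```

With these numbers it comes out at 9-10 bits; with a smaller contrast ratio it drops toward 8, which squares with the "about right, probably erring a little low" guess above.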
farss wrote on 8/17/2010, 7:43 AM
It's all a bit more complicated than that.
1) Generally good cameras employ 14 bit ADCs and signal processing. Even dropping down to 12 bit seems to produce substandard video (original F900).
2) We're talking about the luma values only and that might be a mistake.
3) The assumption is that if we exposed a grey card so it reads 50IRE then 100IRE is about the point of sensor overload. I'm not so certain. The XDCAM EX cameras can handle 450IRE. The Cinegamma curves crunch that up into the top 9% of the data values but in theory with 10bits to play with highlights could be better preserved.
4) JND values are fine if you're not doing anything with the vision between acquisition and display.

All in all I'm in the same boat as Serena, not convinced either way.

Bob.
PeterDuke wrote on 8/17/2010, 8:11 AM
Bob

Is the 14 bit ADC linear or logarithmic? If linear, is there a log transform (or other compression) between the sensor output and the ADC? If not, then you would need many more bits than strictly necessary.
farss wrote on 8/17/2010, 8:44 AM
Good question and I don't know the answer. Camera manufacturers don't give much info away.
I'd hazard a guess it's linear as in many cameras you can change the gamma curve and that'd be pretty hard to do if it was fixed in the ADC. Also the RGB to Y', Cr, Cb conversion I think has to be done on linear light values as only the luma component (hence the "'" on the Y) is gamma adjusted.

Bob.
GlennChan wrote on 8/17/2010, 1:35 PM
The Y'CbCr conversion is done on gamma corrected values.

So... linear light RGB --> gamma corrected R'G'B' --> Y'CbCr (according to Poynton, it's OK to leave out the ' elsewhere in Y'CbCr since it's implicit that the other two components are also based on gamma corrected values)
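A minimal sketch of that ordering, using the Rec. 709 luma coefficients (the pure power curve below is only a stand-in for the real transfer function, which also has a linear toe segment):

```python
def gamma_encode(v, exponent=1 / 2.2):
    # Stand-in transfer curve; real Rec. 709 also has a linear toe near black.
    return v ** exponent

def linear_rgb_to_ycbcr(r, g, b):
    # 1) linear light -> gamma-corrected R'G'B'
    rp, gp, bp = gamma_encode(r), gamma_encode(g), gamma_encode(b)
    # 2) R'G'B' -> Y'CbCr using the Rec. 709 coefficients
    y = 0.2126 * rp + 0.7152 * gp + 0.0722 * bp
    cb = (bp - y) / 1.8556
    cr = (rp - y) / 1.5748
    return y, cb, cr

print(linear_rgb_to_ycbcr(0.18, 0.18, 0.18))   # a grey card: Cb and Cr come out zero
```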

------------
There are probably weird situations where you can see problems with the 8-bit codec. What makes it worse is that you often have to convert from 8-bit Y'CbCr --> RGB (and maybe Y'CbCr and back), so you pick up rounding error in the process. For example, you will probably have to convert from 8-bit Y'CbCr to 8-bit R'G'B' for a LCD to display the picture (and most LCDs have less than 8-bits of performance because they have to correct for the panel and they may or may not have to implement white balance digitally).

So you might see problems if you have a great monitor (e.g. a 10-bit panel with great signal processing) and are shooting gradients.

2- In the grand scheme of things it's probably not a big deal.
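The rounding error is easy to see with a quick experiment: push a pile of 8-bit R'G'B' values through an 8-bit Y'CbCr conversion and back, and count how many come back changed. The relationships below are full-range Rec. 709 (real video also rescales into the 16-235/16-240 ranges, which costs a little more), so treat the exact percentage as indicative only:

```python
import numpy as np

rng = np.random.default_rng(0)
rgb = rng.integers(0, 256, size=(100_000, 3)).astype(np.float64)   # random R'G'B' triples

# Full-range Rec. 709 R'G'B' -> Y'CbCr, with Cb/Cr offset by 128 as in 8-bit video.
to_ycc = np.array([
    [0.2126, 0.7152, 0.0722],                       # Y'
    [-0.2126 / 1.8556, -0.7152 / 1.8556, 0.5],      # Cb
    [0.5, -0.7152 / 1.5748, -0.0722 / 1.5748],      # Cr
])
offset = np.array([0.0, 128.0, 128.0])

ycc = np.rint(rgb @ to_ycc.T + offset)                      # quantize to integers
back = np.rint((ycc - offset) @ np.linalg.inv(to_ycc).T)    # convert back, quantize again

changed = np.any(back != rgb, axis=1).mean()
print(f"{changed:.0%} of triples changed after a single 8-bit round trip")
```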

----
The engineering argument, based on sensor thermal noise level, seemed sound enough in supporting the adequacy of 8bits
You have noise from the sensor and quantization noise from the recording format. You (usually) want to keep quantization noise below sensor noise.

For most cameras 8 bits will achieve this.

However, you can still potentially have problems with 8-bit Y'CbCr not being good enough.


The math of this is...

*The sensor outputs an analog voltage that is linear to light.
*It is converted into a digital signal, into maybe 12 or 14 bits of precision. Still linear light.
*Signal processing like white balance, gamma correction, and color correction is applied. Usually the DSP will apply shortcuts; doing things 100% correctly would require higher bit depth and that is costly (and also requires more power --> less battery life).
*The gamma correction may be something like f(x) = x ^ 1/2.2 (Cameras do something other than this... but they do something like this.)
You can think of this as a form of compression (although video engineers don't call it that). You are taking something that is ~12-14+ bits and compressing it down into 8 bits (the x^1/2.2 above allocates more bits to the shadows which need more bits). And visually this is acceptable. You need 10-12 bits if you want to handle extreme situations perfectly (e.g. noise-free, the image covers a huge part of your viewing angle, and the image consists of two different shades that are 1 bit apart).

The sensor noise is linear to light, or close to it.
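A toy version of that "compression": take 12-bit linear-light codes, encode them with a pure x^(1/2.2) curve (a stand-in for whatever a real camera actually uses) versus a straight linear scaling, and count how many of the 256 output codes land in the darkest and brightest stops:

```python
import numpy as np

linear = np.linspace(0.0, 1.0, 4096)                  # 12-bit linear-light codes
enc_gamma = np.rint(255 * linear ** (1 / 2.2))        # stand-in x^(1/2.2) transfer curve
enc_linear = np.rint(255 * linear)                    # naive 8-bit linear for comparison

def codes(encoded, lo, hi):
    """How many distinct 8-bit codes cover the linear-light range [lo, hi)."""
    return len(np.unique(encoded[(linear >= lo) & (linear < hi)]))

for name, enc in (("x^(1/2.2)", enc_gamma), ("linear   ", enc_linear)):
    print(f"{name}: deep-shadow stop (0.5-1%) gets {codes(enc, 0.005, 0.01)} codes, "
          f"top stop (50-100%) gets {codes(enc, 0.5, 1.0)} codes")
```

The power curve hands codes from the highlights to the shadows, which is why 8 bits of gamma-corrected video goes a lot further than 8 bits of linear light would.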
Jeff9329 wrote on 8/17/2010, 1:55 PM
From the Panasonic brochure:

DSP with 14 bit A/D Conversion and 19 bit Processing
The digital signal processor developed for the AG-HMC150's video signals uses 14 bit A/D conversion and 19 bit inner processing to attain unprecedented accuracy. It is from this capture that all other signals are made.

Does anyone know what the 19 bit inner processing means? If you have a 10 bit camera, would it have to have a much higher DSP (24 bit) to produce a clean 10 bit signal?
GlennChan wrote on 8/17/2010, 2:41 PM
For the white balance, color correction, knee, and other signal processing... 19-bit numbers are being used.

- There is a small difference depending on whether the numbers are truncated or how they are rounded.
- Ideally you might want to use floating point numbers. But the only place that is done is on a desktop computer, like what Red does. Floating point math is weird and is its own beast. (It's complicated.)
- Sometimes marketing people figure out ways to twist specs around and make them meaningless because everybody is stretching the truth so much. (Though I don't think it's a problem in this case.)
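A tiny illustration of why the intermediate word length (and truncation versus rounding) matters: chain a few made-up gains, once with wide intermediates and a single rounding at the end, and once truncating back to integers after every stage:

```python
import numpy as np

x = np.arange(256, dtype=np.float64)       # an 8-bit ramp
gains = [1.30, 0.85, 1.10]                 # made-up white balance / grading gains

wide = np.rint(x * np.prod(gains))         # keep precision, round once at the end

narrow = x.copy()
for g in gains:                            # truncate to integers after every stage,
    narrow = np.floor(narrow * g)          # a stand-in for a too-short internal word length

diff = np.abs(np.clip(narrow, 0, 255) - np.clip(wide, 0, 255))
print(f"worst-case difference: {int(diff.max())} codes, mean: {diff.mean():.2f}")
```

The absolute errors are tiny here, but each extra stage (knee, matrix, gamma) adds its own, which is why camera DSPs carry more bits internally than they record.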

If you have a 10 bit camera, would it have to have a much higher DSP (24 bit) to produce a clean 10 bit signal?
You need a number of things like a very low noise sensor....

The simple approach is to look at the end results in a practical/normal scenario and maybe some rarer everyday scenarios. In those situations the bit depth is not going to matter much. You might see banding artifacts... it may be the camera that is causing them, or your post production software.
apit34356 wrote on 8/17/2010, 2:54 PM
Not to be too boring... but FFT is done in most DSPs in real time, and FFT does not like integers... sound and electrical noise are not identical in their FFT "fixes".
farss wrote on 8/17/2010, 3:03 PM
What I'm understanding is that for end to end delivery, with nothing done to a correctly exposed, correctly lit image between camera and display, 8 bpc is adequate. Based on that, the question seems to come down to what happens when that narrow confine is relaxed, for example:

1) Heavy color correction.
2) Higher dynamic range images e.g. explosions, fire, specular highlights. Especially if they're being composited in post.
3) RAW camera data is being recorded
4) A wider color gamut is being used.

Bob.
BrianAK wrote on 8/17/2010, 3:08 PM
Well, if they are sampling at 14 bits then they only have 14 bits of info. From the website they mention they are performing simultaneous HD/SD processing, so maybe the 19 bit internal is related to this dual processing. I don't think there's a bonus 5 bits of sensor info being added.

In terms of display, as others have mentioned, the response of our eyes is, like many things in nature, non-linear. We can identify finer gradations in bright light than we can in the shadows. Most of us have 8 bit displays, so at the end of the day we often need to deliver an 8 bit stream.

I believe two areas of use for higher bit depths are in the math related to post processing (where you get into round-off errors) and in adjusting the "exposure" of your image in the editor. If you capture 12 bits of data, there's more information within your image to adjust the exposure, keeping in mind that at some point you have to render out to the bit depth of your display device.
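A quick illustration of that exposure-adjustment point (everything here is invented: a linear scene, a two-stop push, 8-bit delivery, no gamma):

```python
import numpy as np

scene = np.linspace(0.0, 0.25, 10_000)          # the darkest quarter of the scene range

cap8 = np.rint(scene * 255) / 255               # what an 8-bit capture kept of it
cap12 = np.rint(scene * 4095) / 4095            # what a 12-bit capture kept of it

for name, cap in (("8-bit capture ", cap8), ("12-bit capture", cap12)):
    pushed = np.rint(np.clip(cap * 4.0, 0.0, 1.0) * 255)   # push +2 stops, deliver as 8-bit
    print(f"{name}: {len(np.unique(pushed))} distinct output codes out of 256")
```

The 8-bit source has already thrown away the in-between values, so the push just spreads the surviving codes apart (visible as banding), while the 12-bit source still fills in the gaps.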

There's always the question of how many bits you really have with respect to signal to noise ratio, but there's not much you can do about that, so I personally don't worry about it too much. I often wonder if all the time I spend thinking about adjusting exposure and color is somewhat wasted energy, as displays seem to vary so much. I somewhat doubt many people are seeing what I intended for them to see, but it's probably pretty close.

Brian

GlennChan wrote on 8/17/2010, 9:12 PM
1) Heavy color correction.
In that case it's more important that the camera has little noise.

2) Higher dynamic range images e.g. explosions, fire, specular highlights. Especially if they're being composited in post.
Again, noise is usually the limiting factor.


If you are looking at a camera, knowing its bit depth is not that important IMO. Usually the camera has enough noise that the noise acts as a form of dither and it hides any artifacts associated with the lower bit depth.
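That dithering effect is easy to demonstrate: quantize a gentle ramp coarsely, with and without a little noise added first, and see how well a local average (a crude stand-in for the eye smoothing over grain) tracks the original:

```python
import numpy as np

rng = np.random.default_rng(1)
ramp = np.linspace(0.30, 0.34, 2000)             # a very gentle gradient, e.g. sky

def quantize(x, bits=6):                         # 6 bits just to exaggerate the effect
    levels = 2 ** bits - 1
    return np.rint(np.clip(x, 0.0, 1.0) * levels) / levels

def local_average(x, n=50):                      # crude stand-in for the eye's smoothing
    return np.convolve(x, np.ones(n) / n, mode="same")

banded = quantize(ramp)                                          # no noise: hard steps
dithered = quantize(ramp + rng.normal(0.0, 0.008, ramp.size))    # sensor noise as dither

for name, q in (("no noise  ", banded), ("with noise", dithered)):
    err = np.abs(local_average(q)[100:-100] - ramp[100:-100]).mean()
    print(f"{name}: {len(np.unique(q))} distinct levels, "
          f"average error after smoothing {err:.5f}")
```

The noisy version uses more in-between levels and, once averaged, hugs the original ramp more closely; that is the sense in which sensor noise hides quantization banding.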



3) The assumption is that if we exposed a grey card so it reads 50IRE then 100IRE is about the point of sensor overload.
That makes no sense... almost all cameras will record superwhite values that would be over 100IRE if converted to analog. Almost all cameras have signal processing that implements a "knee" function that crushes highlights and maybe tries to retain saturation in the highlights; this varies from manufacturer to manufacturer. The simplest thing to do is to run a test on your camera... shoot the same scene at various exposures. In Vegas, bring the superwhites into legal range if that suits your normal workflow (i.e. if your normal workflow can handle that extra step).
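For reference, a generic knee looks something like the sketch below: linear up to a knee point, then a much gentler slope that folds the remaining highlight range into the little headroom left. The knee point, slope and clip level here are arbitrary; every manufacturer tunes its own curves:

```python
def knee(x, knee_point=0.85, slope=0.15, clip=1.09):
    """Toy knee: pass values below the knee point through untouched,
    compress everything above it, and clip at the recording ceiling."""
    if x <= knee_point:
        return x
    return min(knee_point + (x - knee_point) * slope, clip)

for pct in (50, 90, 100, 200, 450):          # input as % of nominal white
    print(f"{pct:3d}% scene level -> {100 * knee(pct / 100):5.1f}% recorded")
```

With these made-up numbers a 450% input ends up just above 100%, which is the flavour of what the cinegamma curves mentioned earlier are doing when they squeeze big highlight ranges into the top few percent of code values.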

Most of us have 8 bit displays, so at the end of the day we often need to deliver an 8 bit stream.
While that would be intuitive, that's not the case because 8-bit Y'CbCr is not the same as 8-bit RGB.

The legal range for Y' is 16-235 (for 8-bit)... the other code values are used for illegal values (because analog formats may drift in their calibration and end up too low or too high; the digital formats have headroom for that; you also get illegal values from sharpening) and sometimes for synchronization. 8-bit Y'CbCr devotes less than 8bits to the picture that the viewer will see, and you lose a little bit more from rounding error when it gets converted to RGB.

If you want really nice 8-bit RGB, you want at least 10-bit Y'CbCr. That is why the SDI interface that all broadcasters use throughout their facilities is 10-bit Y'CbCr.
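The code-counting behind that is short (levels as per Rec. 601/709: 8-bit Y' runs 16-235, 10-bit Y' runs 64-940):

```python
luma_codes_8 = 235 - 16 + 1         # 220 codes from black to white in 8-bit Y'
luma_codes_10 = 940 - 64 + 1        # 877 codes from black to white in 10-bit Y'
rgb_codes_8 = 256                   # full-range 8-bit R'G'B'

print(f"8-bit Y':  {luma_codes_8} codes, {luma_codes_8 / rgb_codes_8:.2f} per 8-bit RGB step")
print(f"10-bit Y': {luma_codes_10} codes, {luma_codes_10 / rgb_codes_8:.2f} per 8-bit RGB step")
```

With fewer than one luma code per RGB step, 8-bit Y'CbCr cannot hit every 8-bit RGB value exactly; with three-plus codes per step, 10-bit Y'CbCr can.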

*On a practical level, a lot of news organizations store the footage they shoot as DV (which is 8-bit Y'CbCr). Commercials may also be stored in the DV format.
Serena wrote on 8/17/2010, 9:55 PM
I think the capability of the eye to discriminate fine differences in luminance and colour is not really relevant at the camera end of things, for we want to record as much information as possible for subsequent manipulation. The human eye has a latitude of about 13 stops and negative film up to 16 stops (including toe and shoulder), so the EX's 11 stops is fairly respectable. Regarding the relationship between signal noise and read-out resolution, it seems to me that there is nothing to be gained by reading the information much finer than the level of noise (you'd just measure the noise a little better), or at any rate when that noise is dynamic. In a CCD the noise is largely a DC component of the signal, being contributed by thermal electrons and base current (dark noise). In astronomy, where S/N ratios are generally rather poor, much of the noise is removed by subtracting dark frames (taken at the same sensor temperature as the image frames). Somewhat similar techniques seem to be used in DSLRs, so I guess probably also in better video cameras. Of course this doesn't work well if the stored dark doesn't match the current sensor temperature.
If all the "dark noise" is subtracted from the signal, then the limiting readout resolution should be related to pixel "well depth" (if the well will hold 8192 electrons then 13 bits are needed for maximum signal resolution). Of course subsequent processing adds noise (electronic and rounding) and introduces non-linearity (e.g. hyper-gamma), so all this leads me towards thinking that the useful bit depth in the output recording is not very directly related to the camera S/N ratio. But darned if I can argue that convincingly (not even to myself).
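A toy version of the dark-frame idea, with invented numbers (the electron counts, noise levels and frame size are all made up):

```python
import numpy as np

rng = np.random.default_rng(2)
shape = (480, 640)

dark_signal = rng.normal(120.0, 5.0, shape)          # fixed-pattern dark current (e-)
scene = rng.uniform(0.0, 3000.0, shape)              # photoelectrons from the scene

light_frame = scene + dark_signal + rng.normal(0.0, 10.0, shape)   # exposure + read noise
dark_frame = dark_signal + rng.normal(0.0, 10.0, shape)            # lens cap on, same temp

# Subtracting cancels the fixed dark component; the random read noise remains
# (and in fact adds in quadrature), which is why this only removes the DC part.
corrected = light_frame - dark_frame
print(f"residual offset after subtraction: {(corrected - scene).mean():.2f} e-")

# And the well-depth arithmetic: a well holding 8192 electrons needs
# log2(8192) = 13 bits to give every electron count its own code.
print(f"bits to resolve an 8192 e- well: {int(np.ceil(np.log2(8192)))}")
```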
farss wrote on 8/17/2010, 11:35 PM
The thing that we don't know is what it is that limits the latitude of the EX cameras to 11 stops.
To really make any sense of this discussion and to resolve your quandary I think we need to go back to basics and work from there. I'd suggest starting with an almost ideal sensor that gives a noise free voltage from starlight and doesn't saturate when pointed at the sun. If that seems improbable then consider a good microphone. No one records 8 bit audio and 12 bit is barely usable; 16 bit is pretty good, but with the right gear and good mics in a quiet room 24 bits can make sense. If the room isn't quiet or your preamps are noisy then 24 bit is a waste.

Where I come unstuck with that simple analogy is that I have no idea what the transducer in a camera is doing compared to the one in a microphone. I understand the thermal noise floor in the photodiodes etc. It's the other end of the range I don't get. If I overexpose, is the sensor clipping or is that happening elsewhere in the signal chain in the camera?

Bob.
Grazie wrote on 8/17/2010, 11:49 PM
From "The Rock", to Sean: "I never saw you throw that gentleman off the balcony. All I care about is: are you happy with your haircut."


apit34356 wrote on 8/17/2010, 11:54 PM
"If I overexpose then is the sensor clipping or is that happening elsewhere in the signal chain in the camera." The simple is answer....... one gallon jar can only hold one gallon...... but too much volume and it starts to bleed over to associate pixels, but can it damage the sampling circuit if un-lucky ..... usually it's the gate control to go. Example, A laser into the sensor will destroy the pixels with too much energy!
Grazie wrote on 8/18/2010, 12:00 AM
> A laser into the sensor will destroy the pixels

Good advice: today I'm videoing some TIG welding. Well, not the point contact, but the shadows and smoke cast as a result... I ain't gonna point my sensors at that shebang!

Grazie
Serena wrote on 8/18/2010, 12:02 AM
<<<<If I overexpose then is the sensor clipping or is that happening elsewhere in the signal chain in the camera.>>>

The sensor has a capacity (well depth) and once that is reached more photons cannot generate more electrons; clipped. If we knew the well depth of the sensor then we could say something, but without that, and knowledge of the processing chain, we are just discussing principles. Interesting, but really only revealing what we (or rather, I) don't know (turns an unknown unknown into a known unknown).
farss wrote on 8/18/2010, 2:13 AM
Perhaps the simple answer is to turn the question on its head, i.e. is there any reason not to record 10bit? Would doing so produce a worse outcome? I think we can all agree the answer is no.
So then it comes down to a practical question: what is the cost / benefit ratio? If it costs nothing to record 10bpc then it'd be kind of silly not to. If it requires a recorder as expensive as the camera and the benefit is at best trivial, you'd be silly to do it.

Simple, practical example. When I first got my EX1 I mostly recorded in SP because SxS cards were expensive and scarce, and offloading was impossible. In fact on my first concert shoot with my EX1 I had the camera connected to my M15 VCR, recording to tape!

Today I've a clutch of cheap SDHC cards and adaptors and always record HQ. How much better it is than SP is moot; I very seriously doubt it's worse.

Same with my audio. TBH I seriously doubt recording in 24 bit makes any real difference, BUT why not; my recorder can record for days to its 40GB HDD, so there's no downside. Again it's kind of a pointless discussion.

Now, in a practical sense, given all the above, at the very best recording from the HD SDI port on my EX1 to a 10 bit recorder is NOT going to be cheap, and it's another box for me to lug around. If I had the money and the space for another bit of kit in the auditoriums, I know what I'd have: a friggin 24" HD SDI monitor so I can get the shots in friggin focus. No one is going to convince me having the shot in focus doesn't matter :)

Bob.