10 Bit Video in Vegas Pro 22 - What Project Settings to Use?

Comments

Wolfgang S. wrote on 9/7/2024, 12:52 AM

This discussion has become much too theoretical. Yes, Michael is right that you do not see a difference between 420 and 422. But 422 still has advantages for all kinds of grading and color correction. For that reason, if you intend to grade the footage, it is a good idea to stick to 422.

 

Desktop: PC AMD 3960X, 24x 3.8 GHz * RTX 3080 Ti (12 GB) * Blackmagic Extreme 4K 12G * QNAP Max8 10 Gb LAN * Resolve Studio 18 * Edius X * Blackmagic Pocket 6K/6K Pro, EVA1, FS7

Laptop: ProArt Studiobook 16 OLED * i9 12900H with iGPU Iris Xe * 32 GB RAM * GeForce RTX 3070 Ti 8 GB * internal HDR preview on the laptop monitor * Blackmagic Ultrastudio 4K mini

HDR monitor: ProArt Monitor PA32 UCG-K 1600 nits, Atomos Sumo

Others: Edius NX (Canopus NX)-card in an old XP-System. Edius 4.6 and other systems

Howard-Vigorita wrote on 9/7/2024, 11:12 AM

 

However, 4:2:0/4:2:2/4:4:4 has to do with YUV compression chroma sub-sampling of the image after capture, not physical pixel characteristics.

@fr0sty Howard won't ever change his mind about this, but for the education of others, read why it's possible to have real 422 color from a single sensor and why this isn't a camera conspiracy.

https://cinematography.com/index.php?/forums/topic/72071-is-422-from-a-4k-sensor-real-4k-422/

 

@Former user @fr0sty Ha, ha, no I won't. Here's why:

4:2:0 doesn't encode all the pixel-elements present in RGB... it encodes pixels in pairs of Blue&Green and Green&Red. This approximates the relative visual distinguishability of life-forms that evolved on our predominantly Green planet. Heightened sensitivity to more shades of Green generally maximizes life-form survival on Earth against predators hiding in the environment while enhancing the ability to distinguish between nutritious and harmful plants. The 4:2:0 encoding scheme capitalizes on human visual optimization by accommodating twice as many Green elements as Red or Blue. Single-sensor cameras are physically optimized for 4:2:0 encoding with color filter stripes painted across their sensors, with the Green stripes 2x as wide as Red or Blue. This results in an array of pixels that is significantly more compact than arranging the filter stripes to realize a true RGB array requiring 3 elements per pixel. The two-element design got adopted universally when 4K came along and it became apparent that folks could readily see the difference in resolution with more, less-costly pixels, as opposed to fewer pixels with more color elements than can be discerned. Space aliens might see things differently.

Fwiw, 4:4:4 encodes all 3 elements into each stored pixel while 4:2:2 splits the difference. In both cases, upscaling images from 2-element sensors requires that the missing elements be synthesized. Doing that upscale to 4:2:2 in the camera adds roughly 25% more data, requiring faster media to write what is still the same amount of information actually captured by the sensor. Doing the upscale just-in-time before transfer to an RGB monitor eases demands on the camera battery, media, and video editor, gets to the same place in the end, and creates the opportunity for better results with higher precision than is feasible inside a low-powered camera.
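To put numbers on that, here is a minimal Python sketch (an illustration only, not any camera's firmware) of how many samples each subsampling scheme stores per pixel, using an example UHD 10-bit frame:

```python
# Stored samples per pixel for common Y'CbCr subsampling schemes:
# every pixel keeps its luma (Y) sample; chroma (Cb, Cr) is shared
# across horizontal pairs (4:2:2) or 2x2 blocks (4:2:0).
samples_per_pixel = {
    "4:4:4": 1 + 1 + 1,        # full chroma: Y + Cb + Cr for every pixel
    "4:2:2": 1 + 0.5 + 0.5,    # chroma halved horizontally
    "4:2:0": 1 + 0.25 + 0.25,  # chroma halved horizontally and vertically
}

width, height, bit_depth = 3840, 2160, 10  # example: a UHD 10-bit frame

for scheme, spp in samples_per_pixel.items():
    mbits = width * height * spp * bit_depth / 1e6
    print(f"{scheme}: {spp} samples/pixel -> {mbits:.0f} Mbit per uncompressed frame")

# 4:2:2 stores 2.0 samples/pixel vs 1.5 for 4:2:0: a third more than
# the 4:2:0 stream, or 25% of the resulting 4:2:2 stream.
```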

Personally, I find it encouraging that Sony is providing more encoding options. Zcam has been making cameras for years with Sony-supplied Exmor sensors and firmware that provide the same encoding flexibility.

fr0sty wrote on 9/7/2024, 12:03 PM

Single-sensor cameras are physically optimized for 4:2:0 encoding

You're leaving out the key element... At the camera's native resolution.

Very few cameras shoot at 4K natively; the vast majority shoot at a much higher native resolution and then downscale, not upscale, to hit 4K. As I pointed out in my last post, the number of physical red and blue photosensors, even at only 25% of the total number of pixels on the sensor, is often still more than enough to capture a true 4:2:2, if not 4:4:4, image when downsampled to 4K.

Systems:

Desktop

AMD Ryzen 7 1800x 8 core 16 thread at stock speed

64GB 3000 MHz DDR4

Geforce RTX 3090

Windows 10

Laptop:

ASUS Zenbook Pro Duo 32GB (9980HK CPU, RTX 2060 GPU, dual 4K touch screens, main one OLED HDR)

Howard-Vigorita wrote on 9/7/2024, 1:28 PM

You're leaving out the key element... At the camera's native resolution.

Even if cameras did that, I don't see how it would help. If a 6k or 8k sensor only captures 2 elements per pixel, which they all do, the missing elements would still need to be synthesized, and the most you'd get is propagation of the error components through low-res math in-camera. Although if it's anything like downscaling 4k to HD, I'm sure it would look better anyway, regardless of the sub-sampling upscale. But I don't think cameras actually do that. All the ones I've seen downsize framing by cropping, not mathematical scaling or pixel averaging, probably to save power and process in real time. Cropping is obvious when it makes a given lens more telephoto, although a speed booster could be engaged to trade telephoto back for an f-stop or two.
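For anyone unfamiliar with the distinction, here is a minimal NumPy sketch of what "pixel averaging" means as opposed to cropping (purely illustrative; no camera is claimed to run this code):

```python
import numpy as np

def box_downscale_2x(plane: np.ndarray) -> np.ndarray:
    """Average each 2x2 block into one output pixel (a simple box filter)."""
    h, w = plane.shape
    trimmed = plane[:h - h % 2, :w - w % 2]  # make dimensions even
    return trimmed.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

plane = np.arange(64, dtype=np.float32).reshape(8, 8)  # toy 8x8 sensor plane
averaged = box_downscale_2x(plane)  # 4x4 result that uses every input value
cropped = plane[2:6, 2:6]           # 4x4 window; everything outside is discarded
print(averaged.shape, cropped.shape)  # (4, 4) (4, 4)
```

Averaging touches every photosite at the cost of extra arithmetic per frame; cropping just slices a window out and throws the rest away, which is why it changes the effective field of view.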

Downsizing in Vegas would be a different story, but frames would have to be delivered uncropped at native size. That would give Vegas the opportunity to employ more power-hungry processing. I know that when I shoot 4k, put it in a 4k project, and render HD, it looks better than shooting HD without any special help, like converting to 4:2:2 ahead of time.

I can also tell you that I struggled rendering 4:2:2 HD in Vegas for a decade from 3-CCD cameras that downscale from true RGB capture to record 8-bit 4:2:2 HD, and that 4k 4:2:0 HEVC not only renders faster all around, its color looks better. CPUs got faster over that period, so processing 4:2:2 HD isn't as bad now as it used to be... I still use my aging Canon xf305 in 4k HEVC multicam projects and still see renders noticeably slow down when they hit its footage.

alifftudm95 wrote on 9/7/2024, 1:52 PM

From my personal experience, I would suggest sticking to regular 8-bit full range :)

Color grading relies heavily on creative input & personal taste. I would not get too technical with VEGAS when it comes to color. It's just not there yet imo.

Here are some stills graded in VEGAS.

And here are some stills graded in Resolve.

Just hope VEGAS supports OpenTimelineIO (OTIO) or fixes the broken XML export so that we can easily collaborate with another colorist :D

(Psst, I had someone make me a working, fully functional EDL script for sending baked exports from VEGAS to another NLE or to Resolve for coloring purposes)

 


Editor and Colorist (Kinda) from Malaysia

MYPOST Member

Laptop

MacBook Pro M4 Max

16 Core CPU and 40 Core GPU

64GB Memory

2TB Internal SSD Storage

Anti-Glare 4K HDR Screen

 

PC DESKTOP

CPU: Ryzen 9 5900x

GPU: RTX3090 24GB

RAM: 64GB 3200 MHz

MOBO: X570-E

Storage:

C DRIVE NVME M.2 1TB SSD GEN 4

D DRIVE NVME M.2 2TB SSD GEN 4

E DRIVE SATA SSD 2TB

F DRIVE SATA SSD 2TB

G DRIVE HDD 1TB

Monitor: Asus ProArt PA279CV 4K HDR (Bought on 30 August 2023)

Monitor: BenQ PD2700U 4K HDR (RIP on 30 August 2023)


Wolfgang S. wrote on 9/7/2024, 3:14 PM

I really wonder what the thread starter will do with these theoretical comments. All he wants to know is how to work with 10-bit footage in Vegas. Hmm


Howard-Vigorita wrote on 9/7/2024, 4:29 PM

I would assume more review videos. Perhaps covering cameras and video editors.

fr0sty wrote on 9/7/2024, 5:44 PM

All the ones I've seen downsize framing by cropping, not mathematical scaling or pixel averaging.

I know for a fact my Panasonic S1s do not crop in on the sensor to hit 4K (other than cropping to a 16:9 frame); they use every pixel when recording 4K at 30fps or less. When recording at 60fps, they crop in to an APS-C-sized area. As far as I can tell, the Sony ZV-E10 does not crop 4K unless you record above 24p; it does crop at 30p and above.

Even if cameras did that, I don't see how it would help. If a 6k or 8k sensor only captures 2 elements per pixel, which they all do, the missing elements would still need to be synthesized

That's assuming you're dealing with a 6k or 8k native sensor, and your output is going to be 6k or 8k... not a 26mp sensor like we have here being downsampled to 4K. There are physical color elements there to sample from. If, at 24mp, you have 25% of those pixels as red and 25% of them as blue, then you have 6 million pixels of each to utilize when downsampling to 4K. Cropping that to 16:9 gives you 6000x3375 pixels, and 25% of that is 5,062,500 pixels, more than enough to sample 4K at the half resolution needed to pull off 4:2:2 (RGB), which would be 4,147,200 pixels each for red and blue.
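That arithmetic, spelled out as a short Python check (the sensor numbers come from the paragraph above; the 25% red/blue fractions are the standard Bayer mosaic ratios):

```python
# Does a ~24 MP Bayer sensor have enough red and blue photosites to feed
# the chroma planes of a 4:2:2 UHD frame? Numbers from the paragraph above.
sensor_w, sensor_h = 6000, 3375                    # sensor cropped to 16:9
red_sites = blue_sites = sensor_w * sensor_h // 4  # Bayer: 25% red, 25% blue

uhd_w, uhd_h = 3840, 2160
chroma_422 = (uhd_w // 2) * uhd_h                  # 4:2:2 chroma plane size
chroma_420 = (uhd_w // 2) * (uhd_h // 2)           # 4:2:0 chroma plane size

print(f"red (or blue) photosites:    {red_sites:,}")   # 5,062,500
print(f"4:2:2 chroma samples needed: {chroma_422:,}")  # 4,147,200
print(f"4:2:0 chroma samples needed: {chroma_420:,}")  # 2,073,600
print("enough for 4:2:2?", red_sites >= chroma_422)    # True
```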

The reason I put (RGB) in there is, again, that this is where it differs from the way YUV compression works. We're capturing it as RGB (so 10 bits for red, 10 for green, 10 for blue, per resulting pixel after debayering), not YUV, which consists of only two coordinates for color (because it plots the color space onto an XY graph) and one for brightness, so this is kinda apples to oranges from the get-go. 4:2:0, 4:2:2, and 4:4:4 refer to YUV chroma subsampling, which works completely differently from RGB. Camera sensors capture RGB values. The conversion to YUV, and the subsequent chroma sub-sampling, occurs after that point, during the encoding process.

I really wonder what the threadstarter will do with those theoretical comments. All, what he wants to know, was how to work with 10bit footage in Vegas. Hmm

We already answered their question about 10 bit, and this discussion does relate to whether 4:2:2 is useful or not, which is the other part of the OP's initial question.



Howard-Vigorita wrote on 9/7/2024, 7:05 PM

4:2:0, 4:2:2, and 4:4:4 refer to YUV chroma subsampling, which works completely differently from RGB. Camera sensors capture RGB values.

Let's not confuse things by getting into how to get from RGB to 4:2:0 Y'CbCr and back again, which is nothing more than clever encoding, adding and subtracting values to worm more significant decimal places of high-density G into low-density R&B space, as opposed to storing each RGB value separately in equal-sized containers. I suppose you could call it easy-to-encode/decode lossless compression.
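For the curious, the "adding and subtracting" looks like this in a minimal Python sketch using the published BT.709 coefficients (the constants are the standard ones, not specific to any camera in this thread):

```python
# BT.709 RGB -> Y'CbCr with normalized [0,1] components (full range).
# Y' is a weighted sum of R, G, B; Cb and Cr are scaled differences of
# B and R against Y' -- the "adding and subtracting" in question.
KR, KB = 0.2126, 0.0722  # BT.709 luma coefficients
KG = 1.0 - KR - KB       # 0.7152: green dominates luma

def rgb_to_ycbcr(r: float, g: float, b: float) -> tuple[float, float, float]:
    y = KR * r + KG * g + KB * b
    cb = (b - y) / (2.0 * (1.0 - KB))  # in [-0.5, 0.5]
    cr = (r - y) / (2.0 * (1.0 - KR))
    return y, cb, cr

print(rgb_to_ycbcr(1.0, 0.0, 0.0))  # pure red: low luma, strong +Cr
print(rgb_to_ycbcr(0.0, 1.0, 0.0))  # pure green: carries most of the luma
```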

I was just looking at the S1R sensor specs. It claims an actual 50.44 megapixels but effective 47.3 megapixels (8368 x 5584), yet cannot record 6k or 8k frame sizes. I think camera manufacturers are now hyping sensor sizes by counting what they sometimes call photo-sites, which are pixel-elements, rather than frame pixels. Or it's a coincidence that if you divide each dimension by 2 you get only a handful of pixels more than a dci 4k frame size. I'm guessing the extra pixels on the edges come into play for digital stabilization. They do claim to be doing pixel-binning, which is a kind of proximity averaging, but that won't help with missing elements in what appears to still be a 2-element-per-frame-pixel scheme. Missing elements are handled by synthetic debayering, as usual. Fwiw, however, the full-frame S1R DxOMark performance and quality analysis is quite good compared to other cameras playing the same numbers game.

fr0sty wrote on 9/7/2024, 8:17 PM

The S1R is not meant to be a video camera; it targets stills primarily, and its video features are far more limited than the S1's. The S1 is the video variant of the S series (along with the S1H, which does shoot 6k).

Or it's a coincidence that if you divide each dimension by 2 you get only a handful of pixels more than a dci 4k frame size.

47.3MP (8368x5584) = 46,726,912 pixels.

DCI 4K (4096x2160) = 8,847,360 pixels. 47.3MP is about 5.3x as many pixels.



Howard-Vigorita wrote on 9/8/2024, 2:32 PM

@fr0sty Curiously, the S1H specifies the same pixel count as the S1R. Wonder what's up with that. Marketing strikes again. However, the S1H can record 6k video. The User Manual on p. 100 specifies the maximum frame size without major cropping:

6k: 5888x3312 = 19,501,056 frame-pixels (16:9 aspect), again suggesting 2 elements per frame-pixel with minor cropping.

A white-paper on the S1H provides more detail on major & minor cropping with different framing.

Looking more closely at the Sony ZV-E10M2, it's a shame its HEVC requires long-GOP, but HEVC will capture the highest quality. There are many options for 4:2:2 and 4:2:0 and various bitrates. But the only GPU brand that can decode 4:2:2 is Intel, and only if it's HEVC. And there's nothing but the Hobson's choice of 4:2:2 if you want to avoid long-GOP. 10-bit HEVC 4:2:0 long-GOP can be decoded by most newer GPUs, and 8-bit AVC 4:2:0 long-GOP can be decoded by almost anything, but the lack of intra hurts for editing. None of the options are ideal, so I suggest the OP try a few formats on his own setup and see which works best, or consider editing with proxies or transcoding with ShutterEncoder.
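ShutterEncoder wraps FFmpeg under the hood; for anyone scripting it instead, here is a minimal sketch of an equivalent direct FFmpeg call that transcodes long-GOP 10-bit footage to DNxHR HQX, an all-intra 10-bit 4:2:2 editing codec (the file names are placeholders, and it assumes ffmpeg is on the PATH):

```python
import subprocess

# Transcode long-GOP 10-bit HEVC to DNxHR HQX, an all-intra 10-bit 4:2:2
# mezzanine codec that scrubs and edits far more smoothly than long-GOP
# source. "clip.mp4" and "clip_edit.mov" are placeholder names.
subprocess.run([
    "ffmpeg", "-i", "clip.mp4",
    "-c:v", "dnxhd", "-profile:v", "dnxhr_hqx",  # DNxHR HQX: 10-bit 4:2:2
    "-pix_fmt", "yuv422p10le",
    "-c:a", "pcm_s16le",                         # uncompressed audio for editing
    "clip_edit.mov",
], check=True)
```

The trade-off is file size: all-intra mezzanine files are much larger than the camera originals, which is why proxies are the other common route.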

fr0sty wrote on 9/8/2024, 5:59 PM

@fr0sty Curiously, the S1H specifies the same pixel count as the S1R. Wonder what's up with that. Marketing strikes again. However, the S1H can record 6k video. The User Manual on p. 100 specifies the maximum frame size without major cropping:

6k: 5888x3312 = 19,501,056 frame-pixels (16:9 aspect), again suggesting 2 elements per frame-pixel with minor cropping.


The Panasonic Lumix DC-S1H has 24.2 megapixels of effective pixels and 25.28 megapixels of total pixels.


Former user wrote on 9/8/2024, 6:17 PM

@Howard-Vigorita Howard, you might find this video interesting. It's very old, so you've probably watched it at some stage. He has great difficulty seeing differences between 420 and 422; would you conclude he should see more of a difference if it were real, high-quality 422?

You like testing and experimenting: are you able to create real 422 via downsampling methods, or from actual footage from a 3x CCD/CMOS camera, and show the difference between camera-processing-engine 422 and real 422?