HEVC 10-Bit 422 and 420: How's It Going, Magix? [VP19 update below]

Comments

Richvideo wrote on 8/31/2021, 8:25 AM

NLE professionals love Nvidia

 

@Howard-Vigorita Try this file: make some cuts and remove small sections of the video. Playback at Best (Half). Actually, playback is not great with GPU decode off or on, but it is better with decode off, because the decoder gets stuck for longer periods at edit points. 1080p60 files work better with GPU decode on; for me they don't have the same pauses seen here with 4K.

https://www.dropbox.com/s/yaakic68o9cz9io/Gopro%20Hero%208%20Black%20-%204K60%20Hypersmooth%202.0%20Sample-1_Cut.m4v?dl=0

 

Can you do a comparison with your Intel decoder? It would be interesting to see.

I am running an AMD Ryzen 3900, so I do not have access to an Intel GPU. I have the GoPro 8 as well, but I have not used the 60 fps option before, just 30p.

I am seeing 1 fps playback in any mode while using the RTX 2080 Super, but if I turn it off and just use the CPU I get 60 fps in preview; that is utterly ridiculous.

 

 

Former user wrote on 8/31/2021, 8:30 AM

Yeah, it's a bit random whether it will even start to play with Nvidia GPU decode on. What I do is play, then immediately pause and wait for the GPU decoder to go to zero, then press play again. It will at least start the playback correctly, but at edit points it will have problems again.

Dexcon wrote on 8/31/2021, 8:37 AM

@Richvideo  ... your entire comment just above is entirely in quotes. Quotes are usually meant to highlight a previous comment, to which you add your own comment without quote marks, so as to differentiate your comment from the quoted one.

It would be appreciated if you could edit your comment and remove the quotation marks from everything above the previous comment you are replying to. Otherwise, it's difficult to know where your comment starts.

Cameras: Sony FDR-AX100E; GoPro Hero 11 Black Creator Edition

Installed: Vegas Pro 16, 17, 18, 19, 20 & 21, HitFilm Pro 2021.3, DaVinci Resolve Studio 18.5, BCC 2023.5, Mocha Pro 2023, Ignite Pro, NBFX TotalFX 7, Neat NR, DVD Architect 6.0, MAGIX Travel Maps, Sound Forge Pro 16, SpectraLayers Pro 11, iZotope RX10 Advanced and many other iZ plugins, Vegasaur 4.0

Windows 11

Dell Alienware Aurora 11

10th Gen Intel i9 10900KF - 10 cores (20 threads) - 3.7 to 5.3 GHz

NVIDIA GeForce RTX 2080 SUPER 8GB GDDR6 - liquid cooled

64GB RAM - Dual Channel HyperX FURY DDR4 XMP at 3200MHz

C drive: 2TB Samsung 990 PCIe 4.0 NVMe M.2 PCIe SSD

D: drive: 4TB Samsung 870 SATA SSD (used for media for editing current projects)

E: drive: 2TB Samsung 870 SATA SSD

F: drive: 6TB WD 7200 rpm Black HDD 3.5"

Dell Ultrasharp 32" 4K Color Calibrated Monitor

Richvideo wrote on 8/31/2021, 8:43 AM

@Richvideo  ... your entire comment just above is entirely in quotes. Quotes are usually meant to highlight a previous comment, to which you add your own comment without quote marks, so as to differentiate your comment from the quoted one.

It would be appreciated if you could edit your comment and remove the quotation marks from everything above the previous comment you are replying to. Otherwise, it's difficult to know where your comment starts.

Seems to be a bug in Edge or one of the extensions I have installed that causes that. Sorry.

Dexcon wrote on 8/31/2021, 8:51 AM

@Richvideo  ... many thanks.


Howard-Vigorita wrote on 8/31/2021, 11:51 AM

NLE professionals love Nvidia

Certainly the open source community does. And they're liking Intel too, since Intel started following Nvidia's lead in providing open source SDKs. Apple and AMD are a bit deficient there. I've never found Intel iGPUs that good as a sole GPU, but the Iris line with the Max is apparently getting better. I can't test that myself because the 11900K's UHD 750 does not have a Max like some of their laptop and NUC compute-element CPUs.

@Howard-Vigorita Try this file: make some cuts and remove small sections of the video. Playback at Best (Half). Actually, playback is not great with GPU decode off or on, but it is better with decode off, because the decoder gets stuck for longer periods at edit points. 1080p60 files work better with GPU decode on; for me they don't have the same pauses seen here with 4K.

That's AVC 420, not HEVC 422. FWIW, I don't see that the UHD 750 is any different from the 630 at decoding 420 AVC, which does not benefit a whole lot from hardware decoding other than taking some of the load off the CPU. But it's generally higher performance if the decoding is done asynchronously and independently of other video processing.
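For anyone wanting to confirm what a clip actually is before comparing decoders, ffprobe reports the codec and chroma subsampling directly. This is a generic sketch, not something from the thread; "clip.m4v" is a placeholder for your own media:

```shell
# Report codec, pixel format (chroma subsampling + bit depth) and frame
# rate of the first video stream. "clip.m4v" is a placeholder file name.
ffprobe -v error -select_streams v:0 \
        -show_entries stream=codec_name,pix_fmt,r_frame_rate \
        -of default=noprint_wrappers=1 clip.m4v
# h264 / yuv420p     -> 8-bit AVC 4:2:0
# hevc / yuv422p10le -> 10-bit HEVC 4:2:2
```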

What makes that footage difficult to edit is that it's double frame-rate. If it were 422, that would be a double-double whammy. HEVC 422 would probably help with a hardware decoder if the enhanced HEVC compression were used to lower the bit rate. Personally, I wouldn't do 422-anything at double frame rate. But I can see you doing that frame rate with all the motion involved. What would be interesting to compare would be shooting single frame rate HEVC 420 with the same subject matter... HEVC being more optimized for motion. Btw, as a former motorcycle messenger for the NYC film/advertising industry in my youth, your footage takes me back.

AVsupport wrote on 8/31/2021, 5:33 PM

@Howard-Vigorita, please note that there is NO hardware acceleration from any manufacturer for HEVC 10-bit 422 at this stage, apart from the latest Apple M1 and what Intel has planned. However, there is for 420 and 444 from Nvidia. So one could edit 420 on the timeline without the need to proxy (as I am doing in Resolve), and perhaps transcode ('upscale') 422 to 444, which in turn could be GPU accelerated, for keying work.

The other thing slowly starting to become 'the other new normal' is 50/60p with HEVC; Sony has omitted 25p as a choice in the A7S3, and I guess the FX3 (which is basically the same camera). My guess is the intent is to phase out interlacing, as the world is heading towards progressive screen formats everywhere (check your smartphones and computer screens). From what I understand, interlaced capture happens at roughly 1/50th-1/60th of a second anyway, just not the full frame, only half (one field) at a time.

https://en.wikipedia.org/wiki/Interlaced_video

my current Win10/64 system (latest drivers, water cooled) :

Intel Coffee Lake i5 Hexacore (unlocked, but not overclocked) 4.0 GHz on Z370 chipset board,

32GB (4x8GB Corsair Dual Channel DDR4-2133) XMP-3000 RAM,

Intel 600series 512GB M.2 SSD system drive running Win10/64 home automatic driver updates,

Crucial BX500 1TB EDIT 3D NAND SATA 2.5-inch SSD

2x 4TB 7200RPM NAS HGST data drive,

Intel HD630 iGPU - currently disabled in Bios,

nVidia GTX1060 6GB, always on latest [creator] drivers. nVidia HW acceleration enabled.

main screen 4K/50p 1ms scaled @175%, second screen 1920x1080/50p 1ms.

fr0sty wrote on 8/31/2021, 5:52 PM

You can't upscale 422 to 444... upscaling only affects pixel resolution, not color.

Systems:

Desktop

AMD Ryzen 7 1800x 8 core 16 thread at stock speed

64GB 3000mhz DDR4

Geforce RTX 3090

Windows 10

Laptop:

ASUS Zenbook Pro Duo 32GB (9980HK CPU, RTX 2060 GPU, dual 4K touch screens, main one OLED HDR)

Former user wrote on 8/31/2021, 6:05 PM
 

That's avc 420, not hevc 422. Fwiw, I don't see that the uhd750 is any different from the 630 decoding 420 avc which does not benefit a whole lot from hardware decoding other than taking some of the load off the cpu. But it's generally higher performance if the decoding is done asynchronously and independently of other video processing.

The question is more about a potential bug with the Vegas GPU decoder on all systems; we can't do comparisons between decoders with 422 video because only Intel handles it currently. When you see the lags at edit points, can you look at your GPU decoder graph? Does it correspond to 100% (or extremely high) GPU decode activity, and when the decode goes back down to normal, does your video stop lagging? I don't see this GPU decode problem in Premiere.

What I'm trying to ascertain is whether the problem you have with 422 playback, compared to 420, is not actually in the main processing section but in the GPU decoding, just as in the extreme example I posted of the 4K60 AVC. The same GPU decoder overload bug will happen on 1080p60 and, to a lesser extent, 30 fps HD/UHD video. When you see lagging at edit points it's most likely this GPU decoder problem, not your CPU or GPU being overloaded; the lagging in this example happens because Vegas never gets to process the frames. It would be great if you could do a screen recording showing playback with Task Manager showing GPU decode. I believe the GPU decode bug exists across all systems, but you said you don't see it on your Intel systems. Now that I've explained things a little more, do you see the GPU decode overload on your Intels?
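One way to separate raw decoder throughput from everything else an NLE does is ffmpeg's null-sink decode test. This is a generic sketch, not something anyone in the thread ran; "clip.m4v" is a placeholder, and -hwaccel cuda assumes an Nvidia-enabled ffmpeg build (drop it to measure CPU decoding instead):

```shell
# Decode the clip as fast as possible and discard the frames.
# The final "speed=" figure is pure decode throughput with no timeline
# processing involved; well above 1x means the decoder itself keeps up
# and the stutter is coming from somewhere else.
ffmpeg -hide_banner -hwaccel cuda -i clip.m4v -f null - 2>&1 | tail -n 1
```

Running it once with -hwaccel cuda and once without gives a rough hardware-vs-software decode comparison for the same file.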

We can't expect Vegas to fix a problem where users can't even agree that a problem exists, @Howard-Vigorita.


AVsupport wrote on 8/31/2021, 10:37 PM

You can't upscale 422 to 444... upscaling only affects pixel resolution, not color.

Perhaps not the correct terminology; I'd be happy to learn the correct word to use. But the idea is a simple recode/transcode process, if that were possible, without having to go through another recompression stage.

/EDIT after some self-reeducation, perhaps it would be 'chroma subsampling upsampling' (or just chroma upsampling), hence I'm not surprised no one uses it, as it sounds somewhat silly ;-)
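For what it's worth, the recode described above can be sketched in ffmpeg; this is an assumption about the intended workflow, not a tested Vegas pipeline, and the file names are placeholders. fr0sty's caveat stands: the chroma planes are only interpolated up, no real color information is added, and a re-encode is unavoidable (x265's lossless mode at least keeps that re-encode mathematically transparent):

```shell
# Upsample 10-bit 4:2:2 chroma to 4:4:4 and re-encode losslessly.
# No new color detail is created; the chroma planes are interpolated.
# "in_422.mov" / "out_444.mov" are placeholder file names.
ffmpeg -i in_422.mov \
       -vf format=yuv444p10le \
       -c:v libx265 -x265-params lossless=1 \
       -c:a copy out_444.mov
```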

Last changed by AVsupport on 8/31/2021, 11:59 PM, changed a total of 1 times.


AVsupport wrote on 8/31/2021, 10:39 PM

@Former user This is perhaps a little off topic and better discussed in a separate thread; I was more interested in raising awareness about HEVC and GPUs, not iGPUs.

However, I have also seen those lags between cuts, and my belief is it has to do with poor read-ahead management, pipeline issues, as yet unfixed. What I can see is that the timeline playback fps becomes stable again once you're past a cut. Annoying.

Last changed by AVsupport on 8/31/2021, 10:52 PM, changed a total of 1 times.


Howard-Vigorita wrote on 8/31/2021, 10:56 PM

@Former user Sorry if that text-to-speech thing is like a new drug to me. This is strictly 420 60 fps 8-bit AVC vs 30 fps 10-bit HEVC, which has been my default shooting format for the past year or so.

I did the hevc transcode with this script:

ffmpeg -y -r 30 -i "Gopro Hero 8 Black - 4K60 Hypersmooth 2.0 Sample-1_Cut.m4v" \
       -c:v libx265 -vf "format=yuv420p10le, setpts=0.5*PTS" \
       -crf 22 -preset ultrafast -c:a copy 30fpsHevc420.mp4

It doesn't exhibit a decoding spike but preview rate gets crushed if I put in a crossfade... but that's easy enough to fix.  Will see how it goes next on an 11900k and see if I can illustrate hevc 422 decoding.
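A variant of that transcode, sketched here as an assumption rather than the exact workflow above (file names are placeholders): if the goal is simply a real-time 30p file from the 60p source, the fps filter drops every other frame without any retiming arithmetic:

```shell
# Halve 60p to 30p by discarding every other frame, then encode
# to 10-bit HEVC 4:2:0. Audio is passed through untouched.
# "in_4k60.mp4" / "out_4k30_hevc.mp4" are placeholder names.
ffmpeg -y -i in_4k60.mp4 \
       -vf "fps=30, format=yuv420p10le" \
       -c:v libx265 -crf 22 -preset fast \
       -c:a copy out_4k30_hevc.mp4
```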

EDIT: Here's the 11900k doing the same plus hevc 422. This worked better than I expected.

Former user wrote on 9/1/2021, 7:50 AM
easy enough to fix.  Will see how it goes next on an 11900k and see if I can illustrate hevc 422 decoding.

EDIT: Here's the 11900k doing the same plus hevc 422. This worked better than I expected.

@AVsupport have a look at this video. it's 422 HEVC decode

@Howard-Vigorita That was a great demonstration. It was good to see both generations of Intel decoders in action, and the English narrator is excellent. I would say both of the Intel GPU decoders are spiking, but not as dramatically as Nvidia's do. Your iGPU 630 spikes to 75% and drops the majority of frames, while your 750 spikes to about 35% while dropping frames. Not sure of the significance of those numbers; maybe the 750 is capable of double the maximum decode of the 630.

So I think we do see that decode overload problem with the Intels also, so it's something that most likely occurs with all the GPU decoders Vegas uses and is not specific to a particular brand or model. You go for an Intel iGPU for maximum compatibility of codecs, but the performance looks very similar; the GPU and CPU make the most difference between systems. It is good to see that the 750 decoder handles 422 decoding as well as it does 420, but not so good to see that I would still get the GPU decode stutter problem. However, you've shown how best to deal with that.

AVsupport wrote on 9/1/2021, 8:25 AM

Somewhat OK results for Intel, but no evidence of the Radeon doing any decode heavy lifting. In context, I wouldn't expect a built-in iGPU to be able to perform as well as a $x,xxx dedicated graphics card. That's just fine for watching YouTube etc., but not for video editing. And if the editing software doesn't support that, then there's trouble.

 

Last changed by AVsupport on 9/1/2021, 8:38 AM, changed a total of 2 times.


Former user wrote on 9/1/2021, 9:34 AM

Somewhat OK results for Intel, but no evidence of the Radeon doing any decode heavy lifting. In context, I wouldn't expect a built-in iGPU to be able to perform as well as a $x,xxx dedicated graphics card.

When you look at the iGPU graphs you see the decoder plus the weak GPU processing box; the graphs mirror each other because the GPU decoder also requires a small amount of GPU processing to work. But Vegas is not using the iGPU for any processing; the 5700 XT is doing that. The power of the iGPU's processor is of no interest to Vegas; its only job is to work synchronously with the decoder, and it is probably used to display the preview screen. It has all the power it needs to do that.

The 5700 XT is not doing much work because it's a powerful card and Vegas is doing the bare minimum: playing back a file with no effects or edits. But look at the edit point at 3:45; the 5700 XT's processing goes up because it has something extra to process due to the transition.


Howard-Vigorita wrote on 9/1/2021, 12:02 PM

Somewhat OK results for Intel, but no evidence of the Radeon doing any decode heavy lifting. In context, I wouldn't expect a built-in iGPU to be able to perform as well as a $x,xxx dedicated graphics card.

@AVsupport I've tested it every which way on 3 machines, and the Intel iGPU isn't very good at timeline processing on any of them... that is, mixing streams and applying FX. It's best at decoding and displaying. In this case I have the display plugged into the PCIe board, so it's only doing decoding. Which is fine.

Howard-Vigorita wrote on 9/1/2021, 1:01 PM
When you look at the iGPU graphs you see the decoder plus the weak GPU processing box; the graphs mirror each other because the GPU decoder also requires a small amount of GPU processing to work. But Vegas is not using the iGPU for any processing; the 5700 XT is doing that. The power of the iGPU's processor is of no interest to Vegas; its only job is to work synchronously with the decoder, and it is probably used to display the preview screen. It has all the power it needs to do that.

The 5700 XT is not doing much work because it's a powerful card and Vegas is doing the bare minimum: playing back a file with no effects or edits. But look at the edit point at 3:45; the 5700 XT's processing goes up because it has something extra to process due to the transition.

@Former user The displays are plugged into the 5700 XT and Radeon VII, so all the iGPUs are doing is decoding, as set in the Vegas I/O screens. On the 11900K, the 5700 XT is also doing the screen capture using the AMD ReLive facility, while I used Intel UHD 630 capture on the 9900K machine... neither of which impacted functionality, which I tested earlier with the same results.

I interpret Task Manager utilization numbers differently. I don't find the reported metrics very accurate, as they depend on what and how the driver reports. In the case of Intel, the CPU has a mind of its own and uses its iGPU as a processor extension whether asked to or not... CPU temperature is more accurately indicative of how hard it's working. In the case of AMD, I often see decoding and encoding reported on an unexpected internal processor, whose graph is only displayed if you manually set Task Manager to display the right 2 or 3 of the many processors in the GPU. IMHO, the utilization metrics are only dependably indicative of whether Vegas is seeing the GPU. Keep in mind that video processing is like a chain or train... no car can go any faster than those ahead of it. How hard any particular process in the chain might be working is less relevant than the overall performance getting the job done. If you see a saturated member, however, that's relevant in finding the weakest link.

Btw, video boards run asynchronously under their own clocks, limited only by the PCIe bus and chains of events which make them wait. Video boards also have multiple independent clocks that enable them to do even more internal parallel processing, unlike a CPU, whose cores all run under a single clock and are all time-shared in their operation.

AVsupport wrote on 9/1/2021, 6:52 PM

but Vegas is not using the IGPU for any processing, the 5700XT is doing that.

Not convinced that is true. In the video you can only see the Intel (iGPU) decode activity in Task Manager, but not the Radeon's. That would suggest the Radeon's activity is simply data throughput to the monitors, hence the low usage, but not timeline decoding (which is what we want VP to do).

Maybe @Howard-Vigorita can shed some light on this [I think the post above has already done that], perhaps with a Schwarzenegger-sounding voice-over ;-))

PS: I think when you put the cursor down on the timeline, VP will preload the clip. If you leave the cursor where it is, the activity will disappear after a while and playback 'should' be smooth. But yes, this doesn't work for cuts/crossfades.

 


Former user wrote on 9/1/2021, 7:28 PM
In the case of amd, I often see decoding and encoding reported on an unexpected internal processor. Whose graph is only displayed if you manually set Task Manager to display the right 2 or 3 of the many processors in the gpu

Vegas uses the 3D and CUDA engines on Nvidia; Intel and AMD would have their own versions.

IMHO, the utilization metrics are only dependably indicative of whether Vegas is seeing the gpu. Keep in mind that video processing is like a chain or train... no car can go any faster that those ahead of it. How hard any particular process in the chain might be working is less relevant than the overall performance getting the job done. If you see a saturated member, however, that's relevant in finding the weakest link.

 

When Vegas lags and the GPU decoder spikes, nothing is being processed; the frames just disappear, and that's the bug. In 1080p60 AVC you can get almost perfect playback over edit points with the GPU decoder off, but with the GPU decoder on, it overloads and frames disappear. It's a coding bug. It's not that Vegas can't process the frames due to being overloaded; it's not doing anything except playing back the original video. I don't think you have Premiere, but you have Resolve Studio, so you can see how it's supposed to work. A GPU decoder should never make the playback frame rate worse.

Making excuses for a bug is the worst possible thing you can do


Former user wrote on 9/1/2021, 7:35 PM

but Vegas is not using the IGPU for any processing, the 5700XT is doing that.

Not convinced if that is true. In the video you can only see the Intel's (iGPU) decode activity in the Task manager, but not the Radeon. It would suggest the Radeon's activity simply being a data throughput to the monitors, hence the low usage, but not timeline decoding (which we want VP to do).

In the video I quoted, you see the 5700 XT at the bottom; watch when playback goes over the transition, its processing goes up.


PS, I think when putting a cursor down the timeline, VP will preload a clip. If you leave the cursor where it is, activity will disappear after a while and playback 'should' be smooth. But yes, this doesn't work for cuts/crossfades.

 

Yes, that's exactly what it does, and that is where the decoder spiking works as it should. Move the play point around the timeline and watch the GPU decoder spike; this is entirely normal and what you want: a quick burst of a lot of frames for Vegas to cache, ready for (hopefully) a smooth playback beginning. The reason we use edit points in these examples is to remove this advantage; we want to look at Vegas rendering the timeline without cached video.

adis-a3097 wrote on 9/1/2021, 7:35 PM

Vegas uses CUDA, is that true?

Former user wrote on 9/1/2021, 7:58 PM

Nvidia cards use the CUDA engine in parallel with the 3D engine in Vegas. It could be doing a conversion.

Howard-Vigorita wrote on 9/1/2021, 9:01 PM
Not convinced if that is true. In the video you can only see the Intel's (iGPU) decode activity in the Task manager, but not the Radeon. It would suggest the Radeon's activity simply being a data throughput to the monitors, hence the low usage, but not timeline decoding (which we want VP to do).

@AVsupport Vegas was set to use the Intel iGPU for decoding in all cases. Ideally, decoding should happen once and only once, by the assigned processor, as the media is read off disk. I did have display optimization checked on both machines, but that only applies to the GPU HDMI port that the display cable is plugged into... which was the PCIe AMD video board on both machines. FWIW, I've noticed I get slightly faster QSV renders while plugged into motherboard HDMI, but faster VCE renders overall when plugged into PCIe HDMI, which has become my practice.

Howard-Vigorita wrote on 9/1/2021, 9:57 PM
Making excuses for a bug is the worst possible thing you can do

@Former user I think you misunderstand my intent. It's not to excuse the Vegas developers but to present workarounds and explore possibly better ways of getting the job done.