VRAM Bottleneck in Vegas Pro 22 – Considering GPU Upgrade to RX 7900 X

Comments

bitman wrote on 11/26/2024, 2:38 AM

It is obvious that dual encoders offer an advantage for encoding multiple streams in parallel, but alternating the dual encoders sequentially for a single stream also offers an advantage: it physically spreads the heat generated in the chip, avoiding hot spots...
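
A minimal sketch of the parallel-stream case, assuming an ffmpeg build with NVENC support on PATH; the clip names and bitrates are placeholders:

```python
# Hedged sketch: run two NVENC encodes in parallel so a dual-encoder GPU
# can service both streams at once. Assumes ffmpeg with hevc_nvenc on PATH;
# file names and bitrates are placeholders.
import subprocess
from concurrent.futures import ThreadPoolExecutor

def encode(src: str, dst: str) -> int:
    return subprocess.run([
        "ffmpeg", "-y", "-i", src,
        "-c:v", "hevc_nvenc",                    # NVENC HEVC encoder
        "-preset", "p5",                         # balanced quality/speed
        "-rc", "vbr", "-b:v", "28M", "-maxrate", "56M",
        "-c:a", "copy",
        dst,
    ]).returncode

jobs = [("clip_a.mp4", "out_a.mp4"), ("clip_b.mp4", "out_b.mp4")]
with ThreadPoolExecutor(max_workers=2) as pool:
    print(list(pool.map(lambda j: encode(*j), jobs)))  # [0, 0] on success
```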

APPS: VIDEO: VP 365 suite (VP 22 build 194) VP 21 build 315, VP 365 20, VP 19 post (latest build -651), (uninstalled VP 12,13,14,15,16 Suite,17, VP18 post), Vegasaur, a lot of NEWBLUE plugins, Mercalli 6.0, Respeedr, Vasco Da Gamma 17 HDpro XXL, Boris Continuum 2025, Davinci Resolve Studio 18, SOUND: RX 10 advanced Audio Editor, Sound Forge Pro 18, Spectral Layers Pro 10, Audacity, FOTO: Zoner studio X, DXO photolab (8), Luminar, Topaz...

  • OS: Windows 11 Pro 64, version 24H2 (since October 2024)
  • CPU: i9-13900K (upgraded my former CPU i9-12900K),
  • Air Cooler: Noctua NH-D15 G2 HBC (September 2024 upgrade from Noctua NH-D15s)
  • RAM: DDR5 Corsair 64GB (5600-40 Vengeance)
  • Graphics card: ASUS GeForce RTX 3090 TUF OC GAMING (24GB) 
  • Monitor: LG 38 inch ultra-wide (21x9) - Resolution: 3840x1600
  • C-drive: Corsair MP600 PRO XT NVMe SSD 4TB (PCIe Gen. 4)
  • Video drives: Samsung NVMe SSD 2TB (980 pro and 970 EVO plus) each 2TB
  • Mass Data storage & Backup: WD gold 6TB + WD Yellow 4TB
  • MOBO: Gigabyte Z690 AORUS MASTER
  • PSU: Corsair HX1500i, Case: Fractal Design Define 7 (PCGH edition)
  • Misc.: Logitech G915, Evoluent Vertical Mouse, shuttlePROv2

 

 

bitman wrote on 11/26/2024, 2:57 AM

But going back to the poster's subject of upgrading: I think that for video editing, having enough VRAM is key, and the need will only rise in the future. Even games need more VRAM year after year. Then there is stability: software will always suffer from memory-leak bugs at some point. If you have more VRAM than needed, you will crash less frequently, as you have some reserve to absorb leaks!

The next-generation NVIDIA 5000 series has more VRAM than the 4000 series (e.g. the 5090 will have 32 GB). Just to say: the need for VRAM is rising.
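
One way to put numbers on that headroom is to poll VRAM usage while a project renders. A minimal sketch, assuming nvidia-smi is on PATH (the one-second interval is arbitrary):

```python
# Hedged sketch: watch VRAM headroom via nvidia-smi while a render runs.
# Stop with Ctrl+C to see the peak. Assumes nvidia-smi is on PATH.
import subprocess, time

def vram_used_total_mib() -> tuple[int, int]:
    out = subprocess.check_output([
        "nvidia-smi",
        "--query-gpu=memory.used,memory.total",
        "--format=csv,noheader,nounits",
    ], text=True)
    used, total = out.strip().splitlines()[0].split(", ")  # first GPU
    return int(used), int(total)

peak = 0
try:
    while True:
        used, total = vram_used_total_mib()
        peak = max(peak, used)
        print(f"VRAM {used}/{total} MiB (peak {peak})", end="\r")
        time.sleep(1)
except KeyboardInterrupt:
    print(f"\npeak VRAM during session: {peak} MiB")
```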


johnny-s wrote on 11/26/2024, 9:25 AM

An update to my previous posts regarding fixed resolutions and the limitations on split-frame encoding.

An extract from the URL below ...
"With split encoding, an input frame is split and each part is processed in parallel by multiple NVENCs on the chip, which results in a significant speedup compared to sequential encoding. Split encoding was initially implemented by default on predetermined HQ and LL presets on 4K and 8K resolutions for HEVC and AV1.

In Video Codec SDK 12.1, you control the implementation of split encoding with a flag:

  • Disabling the flag results in the original default functionality, enabled on predetermined presets on 4K and 8K resolutions.
  • Enabling the flag implements split encoding across all resolutions and presets.

You can also control the split across two or three NVENCs depending on the number of NVENCs present on the GPU. For more information, see the GPU matrix.
"

https://developer.nvidia.com/blog/new-video-creation-and-streaming-features-accelerated-by-the-nvidia-video-codec-sdk/
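
For what it's worth, that flag lives in the Video Codec SDK's encoder configuration, not in anything Vegas or Voukoder currently exposes. As a purely hypothetical summary of the modes the blog describes (these are illustrative names, not the SDK's actual identifiers; check nvEncodeAPI.h for those):

```python
# Hypothetical summary of the split-encode modes described in the blog post.
# The names below are illustrative only; the real field and enum identifiers
# live in the Video Codec SDK header (nvEncodeAPI.h).
from enum import Enum, auto

class SplitEncodeMode(Enum):
    FLAG_DISABLED = auto()  # original default: split only on predetermined
                            # HQ/LL presets at 4K and 8K resolutions
    FLAG_ENABLED = auto()   # split encoding on all resolutions and presets
    TWO_NVENC = auto()      # constrain the split to two NVENCs
    THREE_NVENC = auto()    # constrain the split to three NVENCs, if present
```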

 

A request was made to implement split frame encoding in Voukoder last October ...

https://www.voukoder.org/forum/thread/1478-suggestions-for-av1-nvenc-and-ffmpeg-improvements/

PC 1: Intel i9-9900K, 32 GB RAM, AMD Radeon XFX RX 7900 XT, Intel UHD 630, Win 10

PC 2: AMD Ryzen 9 7950X3D 16-core CPU, 64 GB RAM, Nvidia 4090 GPU, Intel A770 GPU, Win 11

Laptop: Intel 11th-gen 8-core i9-11900K, 64 GB RAM, Nvidia RTX 3080 GPU, Win 10

Howard-Vigorita wrote on 11/26/2024, 10:56 AM

Interesting that Nvidia didn't start putting multiple encoders into its GPUs until the 4000 series. I was just looking at my really old AMD GPUs and I see multiple encoders going back as far as the Vega M embedded in the now-discontinued Intel i7-8809G CPU. Splitting frames is new to me, however. If it hurts quality and runs slower than AMD GPUs, I don't see the point, except maybe to render 8K frames with an encoder limited to 4K.

From the quotes, which are also new to me, it sounds like Nvidia might be pursuing the ability to team encoders across multiple GPUs, like Intel does with Arcs... which Vegas does not yet support. At the moment, Vegas users cannot even choose between two GPUs from the same manufacturer for renders. In fact, consumer motherboards are hard-pressed to provide two x16 slots when they only have 24 or so PCIe lanes to go around. Gotta wonder where Nvidia expects to go with that. Sounds like only expensive high-end Ryzen and Xeon workstations might benefit.

johnny-s wrote on 11/26/2024, 1:11 PM

Correction: I can render NVENC AV1 at 10-bit 4:2:0.


UltraVista wrote on 11/26/2024, 4:38 PM

but alternating the dual encoders sequentially for a single stream also offers an advantage: it physically spreads the heat generated in the chip, avoiding hot spots...

@bitman Yeah, load balancing is useful. I have been transcoding hours of footage while simultaneously recording with OBS and encoding with Vegas, all using NVENC, with no delays or stutters. The transcoder operates at full speed; on a single encoder that slows everything down due to near-100% use of NVENC, but it's only 50% use of dual encoders.

Nvidia talks about a way of using up to 100% of a dual encoder by encoding in parallel; the example given is encoding every alternate GOP, so 1, 3, 5 while simultaneously encoding 2, 4, 6. While that may work in Shutter Encoder or Handbrake, I can't see how it can work in an NLE. The other problem I already see with split encoding is running out of NVDEC: the bottleneck becomes the decoder. Nvidia would say you need a pro card that has two or three NVDECs. It will be interesting to see whether there's any change to decode/encode with the 50-series cards.
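
For a standalone transcode (not an NLE timeline), the alternate-GOP idea can be approximated with a keyframe-aligned split, two parallel NVENC sessions, and a lossless concat. A rough sketch, assuming ffmpeg with hevc_nvenc on PATH and placeholder file names:

```python
# Hedged sketch of alternate-GOP parallel transcoding: split the source on
# keyframes without re-encoding, encode the odd- and even-numbered segments
# in two concurrent NVENC sessions, then stitch the results back together.
import glob, subprocess
from concurrent.futures import ThreadPoolExecutor

SRC = "input.mp4"  # placeholder

# 1) Split via stream copy, so cuts land on keyframes (GOP boundaries).
subprocess.run(["ffmpeg", "-y", "-i", SRC, "-map", "0", "-c", "copy",
                "-f", "segment", "-segment_time", "10",
                "-reset_timestamps", "1", "seg_%04d.mp4"], check=True)

def encode(seg: str) -> str:
    out = seg.replace("seg_", "enc_")
    subprocess.run(["ffmpeg", "-y", "-i", seg, "-c:v", "hevc_nvenc",
                    "-preset", "p5", "-c:a", "copy", out], check=True)
    return out

segs = sorted(glob.glob("seg_*.mp4"))
odd, even = segs[0::2], segs[1::2]  # groups 1,3,5,... and 2,4,6,...

# 2) Two workers, roughly one per NVENC on a dual-encoder GPU.
with ThreadPoolExecutor(max_workers=2) as pool:
    a = pool.submit(lambda: [encode(s) for s in odd])
    b = pool.submit(lambda: [encode(s) for s in even])
    a.result(); b.result()

# 3) Reassemble the encoded segments in their original order.
with open("list.txt", "w") as f:
    for s in segs:
        f.write(f"file '{s.replace('seg_', 'enc_')}'\n")
subprocess.run(["ffmpeg", "-y", "-f", "concat", "-safe", "0",
                "-i", "list.txt", "-c", "copy", "output.mp4"], check=True)
```

This only works because the source's GOPs already exist and can be copied apart, which is exactly why an NLE rendering a timeline on the fly can't do it.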

Howard-Vigorita wrote on 11/26/2024, 5:22 PM

I usually upload song renders to YouTube made with MainConcept 4K HEVC VBR 28/56 Mbps 10-bit. I would really love to switch to NVENC AV1. Voukoder and Resolve seem stuck in a similar 4K quality rut... odd that they both use ffmpeg 7 libs yet come nowhere close to transcoding with ffmpeg 7 itself. The current Magix AV1 is CPU-only, which makes it pretty slow, but the overall quality is pretty good and will likely get even better if they get rid of the anomalies. A Magix-native NVENC AV1 would do it for me. Here's the comparative quality analysis I get:

UltraVista wrote on 11/26/2024, 9:28 PM

@Howard-Vigorita I did a comparison of HEVC 10-bit NVENC encodes from HEVC 10-bit media. They all look pretty similar, with the Vegas native encode the worst. I should have set VBR for Voukoder, but instead used the 'Good Quality (Recommended)' setting, which is a much higher bitrate. Vegas was set to 32-bit.

The split encoding doesn't look to be any worse. The first letter of the file names is cut off for some reason.

EDIT: the Vegas native encode comparison is not correct - it's 8-bit - so they're all very similar quality-wise.

Fixed:

 

bitman wrote on 11/27/2024, 3:05 AM

Interesting that Nvidia didn't start putting multiple encoders into its GPUs until the 4000 series.


@Howard-Vigorita That is absolutely not true. NVIDIA has put multiple encoders in the same chip in past generations, depending on the generation and model (some generations, like gen 7 - the 3000 series - have only one). I know because long ago I had one with two encoders. Here is an overview:

https://en.wikipedia.org/wiki/Nvidia_NVENC#:~:text=The%20GTX%201650%20Super%20uses,in%20the%20original%20GTX%201650.


Howard-Vigorita wrote on 11/27/2024, 10:07 AM

@bitman I have a 1660, a 1660ti, and a 3060 and they only show a single video encoder engine in Task Manager. My 4090 shows 2. My 3060 laptop shows this compared to my 4090:

What do you see on yours?

Howard-Vigorita wrote on 11/27/2024, 10:42 AM

@UltraVista If you use lossy compressed media as a reference, you will get deceptively high quality scores, because all the missing data elements raise the results by scoring those parts 100%... the more detail that is missing, the higher the resulting average score. And any effort by the encoder to synthesize, sharpen, or otherwise make up for missing detail will lower the score. Unfortunately, the ffmpeg metrics algorithm does not provide a way to ignore missing detail in the source media; it might not even be possible to accurately identify it. That is why I only rely on a lossless reference from a camera that captures unadulterated raw. I think an RGB lossless 4K screen capture of the output of a high-res media generator might also be valid, possibly even more detailed, but I've never tried that myself.
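
For anyone who wants to reproduce this kind of measurement, a minimal sketch using ffmpeg's libvmaf filter (requires an ffmpeg build with --enable-libvmaf; file names are placeholders, and reference and candidate must match in resolution and frame rate):

```python
# Hedged sketch: score a candidate encode against a lossless reference with
# ffmpeg's libvmaf filter, using the "neg" model variant discussed in this
# thread (it penalizes sharpening/enhancement). File names are placeholders.
import json, subprocess

REF = "reference_lossless.mov"   # e.g. camera raw or a MagicYUV transcode
DIST = "candidate_encode.mp4"    # the encode being evaluated

subprocess.run([
    "ffmpeg", "-i", DIST, "-i", REF,  # first input distorted, second reference
    "-lavfi",
    "libvmaf=model=version=vmaf_v0.6.1neg:log_fmt=json:log_path=vmaf.json",
    "-f", "null", "-",
], check=True)

with open("vmaf.json") as f:
    print("VMAF (neg):", json.load(f)["pooled_metrics"]["vmaf"]["mean"])
```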

johnny-s wrote on 11/27/2024, 11:12 AM

Meant to confirm earlier that all three AV1 encoders - Nvidia, Intel, and AMD - can output 10-bit 4:2:0, not just Nvidia.


Howard-Vigorita wrote on 11/27/2024, 11:40 AM

@johnny-s It's always bugged me that my 6900 XT and earlier can only render 8-bit... another plus for the 7000 series. Are AV1 and HEVC both 10-bit now?

johnny-s wrote on 11/27/2024, 11:48 AM

Nvidia and Intel HEVC are indeed 10-bit 4:2:0.

AMD HEVC appears to be only 8-bit 4:2:0.

I am going to check AMD and get back; it should be 10-bit also.

Last changed by johnny-s on 11/27/2024, 11:54 AM, changed a total of 1 times.


johnny-s wrote on 11/27/2024, 11:56 AM

AMD HEVC is also 10-bit 4:2:0.


UltraVista wrote on 11/27/2024, 4:37 PM

@UltraVista If you use lossy compressed media as a reference, you will get deceptively high quality scores, because all the missing data elements raise the results by scoring those parts 100%... the more detail that is missing, the higher the resulting average score. And any effort by the encoder to synthesize, sharpen, or otherwise make up for missing detail will lower the score.

@Howard-Vigorita I think that's the point of the negative offset VMAF model that we're both using: it filters out the encoding enhancements. Without the negative offset, the scores would probably all come out at 100 and not be useful at all, yet the neg results are all pretty similar; possibly Voukoder via Resolve would still give the best results if the bitrate were the same. So with the way you're doing this testing, you never expect a very high score, because you lose data encoding to 4:2:0 10-bit, but you can still rank how well they do relative to one another.

@bitman It's unlikely your 16-series GPU has two encoders, because the 20 series did not have two encoders. The last Nvidia GPU I remember having dual NVENC is the 1080 Ti; and although officially the 1080 did not have dual encoders, some did, because at one point 1080s were derived from flawed 1080 Ti silicon and the second encoder was not disabled.

UltraVista wrote on 11/27/2024, 6:34 PM

@Howard-Vigorita I used a ProRes RAW file converted to ProRes 4444, and the results are the same: they're all very similar. Maybe it's a Resolve bug to do with decoding your specific file, but your results are not indicative of the average results a person would get using Resolve and the hardware encoder, and increasingly this doesn't look to be related to encoding at all.

(AV1 NVENC)

It seems I don't know how to use the VBR setting in Voukoder: I set 60 Mbit as the encoding bitrate and 120 Mbit as the maximum, but Voukoder doesn't adhere to that; you had no problem. The Resolve NVENC AV1 encoder is missing metadata, such as the stream being 10-bit; I wonder how FFMetrics handles that.
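
One way to check what an encoder actually wrote, rather than trusting container metadata, is to ask ffprobe for the pixel format. A minimal sketch, assuming ffprobe is on PATH (the file name is a placeholder):

```python
# Hedged sketch: confirm bit depth and chroma subsampling of an encode.
# yuv420p10le / p010le indicate 10-bit 4:2:0; yuv420p indicates 8-bit.
import subprocess

out = subprocess.check_output([
    "ffprobe", "-v", "error", "-select_streams", "v:0",
    "-show_entries", "stream=codec_name,profile,pix_fmt",
    "-of", "default=noprint_wrappers=1",
    "candidate_encode.mp4",  # placeholder
], text=True)
print(out)
```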

UltraVista wrote on 11/27/2024, 7:51 PM

I tried the file you're testing with, and it gave the expected results, so it's not a decoding problem; it must be a problem with your settings or your system. If you're using the Intel decoder, try changing to the Nvidia decoder.

johnny-s wrote on 11/27/2024, 9:57 PM

"If you use lossy compressed media as a reference, you will get deceptively high quality results because all the missing data elements raise the results by scoring those parts 100%"

What I do is run the RQM using the lossy source as the reference.

I then transcode the source to a lossless file, say MagicYUV, and use this as the "ceiling".

I then put the lossless file into the spreadsheet with the files to compare, not the lossy source.

That way I can display the % difference of the rendered-out files against a 100% ceiling.

For example: this is the result of a discussion I had in another thread, where I used the "Kimberly" lossy Voukoder-converted clip as the reference to compare CPU/NVENC etc., with ProRes HQ thrown in.

Not saying to do this, but it gives me what I want.
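
The spreadsheet step amounts to simple arithmetic; a sketch with made-up scores just to show the "ceiling" normalization:

```python
# Hedged sketch of the ceiling normalization: express each encode's score
# as a percentage of the lossless transcode's score against the same lossy
# reference. All numbers below are invented for illustration.
scores = {
    "MagicYUV (lossless ceiling)": 99.2,
    "ProRes HQ": 97.8,
    "CPU x265": 96.4,
    "NVENC HEVC": 95.1,
}
ceiling = scores["MagicYUV (lossless ceiling)"]
for name, s in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name:28s} {s:6.2f}  ({100 * s / ceiling:6.2f}% of ceiling)")
```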

Last changed by johnny-s on 11/27/2024, 10:10 PM, changed a total of 3 times.


bitman wrote on 11/28/2024, 8:08 AM

@bitman I have a 1660, a 1660ti, and a 3060 and they only show a single video encoder engine in Task Manager. My 4090 shows 2. My 3060 laptop shows this compared to my 4090:

What do you see on yours?


@Howard-Vigorita Obviously only one encoder; as I said, the 3000 series only has one.

I am eyeing a 4090 for when prices may drop at the 5000-series release, but I do not want to spend over 2K Euro on a 4090.

But I did have a video card with two encoders a few years back, around 2017: as @UltraVista correctly mentioned, it was a GTX 1080 Ti (Pascal GP102, gen 4).

FYI, I have owned, in descending order, an RTX 2080 Ti (Turing, gen 6), a GTX 1080 Ti (Pascal, gen 4), a GTX TITAN (Kepler, gen 1), and before that a dual-Radeon setup; before that I can't remember off the top of my head...


johnny-s wrote on 11/29/2024, 1:21 PM

@bitman Once the 5090 is released there should be a few bargains.


Howard-Vigorita wrote on 11/30/2024, 12:54 AM

... So with the way you're doing this testing, you never expect a very high score, because you lose data encoding to 4:2:0 10-bit, but you can still rank how well they do relative to one another.

@UltraVista Ha, ha, big negative on that. All my 4K cameras have sensors stripped for 4:2:0 and can only create 4:2:2 or 4:4:4 images by adding data that was not present in the original raw capture, which will likely reduce an accuracy measurement. I think ffmpeg does an internal upscale with double-precision math and 64-bit storage before making the comparison calculations. I don't think Vegas can equal that precision, so any sub-sampling upscale it does will lower scores. Resolve's math and internal storage widths seem to be lower than Vegas', which is consistent with its lower scores and better performance.

Btw, 4K VMAF scores come out the same with or without the neg JSONs; it only seems to make a difference with HD. The 4K JSONs aren't normally distributed; I had to hunt them down to figure out why.

If you want to see the difference between hardware decoders, check out some of the older charts in my signature. I shoot mostly HEVC and found Vegas' older legacy-hevc decoder the highest scoring... I believe it is a hybrid decoder combining CPU-based Intel libs with the GPU for the math. VP22's default HEVC decoding yields the same high-quality output as the old legacy-hevc hybrid decoder with better performance, while the new experimental-hevc seems to be the old default of handing decoding off completely to the GPU, yielding lower quality and better performance.

UltraVista wrote on 11/30/2024, 1:16 AM

@Howard-Vigorita Are you using the free version of Resolve? I think I always assumed you had Studio, like a few others here. I wonder if your lower-quality encoding results are somehow caused by software decoding?

But interestingly, I did see that AV1 10-bit NVENC is superior to HEVC 10-bit NVENC, just as you had said. I only used AV1 for screen capture because I thought that's where it excels at the lower bitrates I was using, but where possible I may do final encodes with AV1 now. Vegas decodes AV1 really well.

That's interesting about the hybrid decoding of the old HEVC decoder and the similar/same experimental decoder. Its problem, as I recall, is the lack of multicore CPU use in association with GPU decode. If it could be made multicore, the experimental decoder might evolve into something more useful than just a legacy fallback.

Nizar-AbuZayyad wrote on 11/30/2024, 1:43 AM

While VEGAS Pro's official system requirements outline the specifications needed for optimal performance, many of us with hardware configurations far beyond what is listed on the Vegas website still experience issues such as lag, VRAM limitations, etc. This indicates that, despite exceeding the recommended specs, certain performance challenges persist.