Vegas 18 GPU render - what are the performance limiters?

dxdy wrote on 8/17/2020, 3:13 PM

I know this topic has been flogged again and again, but I would like to understand what I am seeing in Vegas 18 render performance.


I built a new machine with an AMD R9 3900 (12 cores, 24 threads) with 16GB RAM, and NVIDIA 2060 Super (8 GB RAM), after my five year old i7-5960 died.


I found a copy of the "Red Car" press release .veg and media files, dated in 2014.  The programs and OS are on an SSD. Source files on a 7200 RPM HDD, output to another SSD.


Rendering with the MAGIX AVC/AAC MP4 - INTERNET HD 1080P 29.97  (with no GPU), it completed in 46 seconds. On the Windows performance monitor I saw what I expected, mid-90 percent CPU usage, little GPU usage, very low SSD and HDD usage. It hadn't let me set a bit rate, it was using the MainConcept AVC encode mode, and the bit rate came out to 3.0Mbit. Playing it back, it looks very nice.

Rendering with the MAGIX AVC/AAC MP4 - INTERNET HD 1080P 29.97 NVIDIA NVENC @ 12Mbit/sec with GPU, it completed in 19 seconds. On the performance monitor I did not see what I expected: CPU usage in the mid 40s (40 - 50 percent), and GPU in the low 20% range. I'd set the variable rate to 12M - 24M, it came out 11.3M bitrate (per Mediainfo).

I have searched through the Vegas forum for the Red Car results (someone had a table of values for various CPUs), and it seems to me that both these results are pretty quick. Is it still out there? Anyone have a copy?

What I don't understand is why in the GPU render, one or the other of the resources isn't near 100%. The HDD and SSDs did not appear to be working very hard. AFAIK the only remaining constraint could be DRAM speed? or the channel between DRAM and CPU? I think this setup is speedy, and I should be looking just at the elapsed rendering times (and quality), but I am curious. Perhaps a memory upgrade?

I was using all the factory settings in the preferences, including 200 for Dynamic RAM preview.

Comments

j-v wrote on 8/17/2020, 3:40 PM

The limiters are partly what you describe here: you are probably using FHD sources and rendering to FHD.
If you used heavy 4K sources and rendered to 4K AVC or HEVC, you might see a difference in performance, but you did not tell us that.

Kind regards,
Marten

Camera : Pan X900, GoPro Hero7 Hero Black, DJI Osmo Pocket, Samsung Galaxy A8
Desktop :MB Gigabyte Z390M, W11 home version 24H2, i7 9700 4.7GHz, 16 GB DDR4 RAM, GeForce GTX 1660 Ti with Studio driver 566.14 and Intel HD graphics 630 with driver 31.0.101.2130
Laptop  :Asus ROG Str G712L, W11 home version 23H2, CPU i7-10875H, 16 GB RAM, NVIDIA GeForce RTX 2070 with Studiodriver 576.02 and Intel UHD Graphics 630 with driver 31.0.101.2130
Vegas software: VP 10 to 22 and VMS(pl) 10,12 to 17.
TV      :LG 4K 55EG960V

My slogan is: BE OR BECOME A STEM CELL DONOR!!! (because it saved my life in 2016)

 

Former user wrote on 8/17/2020, 5:47 PM


Rendering with the MAGIX AVC/AAC MP4 - INTERNET HD 1080P 29.97  (with no GPU), it completed in 46 seconds. On the Windows performance monitor I saw what I expected, mid-90 percent CPU usage, little GPU usage, very low SSD and HDD usage. It hadn't let me set a bit rate, it was using the MainConcept AVC encode mode, and the bit rate came out to 3.0Mbit. Playing it back, it looks very nice.

Rendering with the MAGIX AVC/AAC MP4 - INTERNET HD 1080P 29.97 NVIDIA NVENC @ 12Mbit/sec with GPU, it completed in 19 seconds. On the performance monitor I did not see what I expected: CPU usage in the mid 40s (40 - 50 percent), and GPU in the low 20% range. I'd set the variable rate to 12M - 24M, it came out 11.3M bitrate (per Mediainfo).

What I don't understand is why in the GPU render, one or the other of the resources isn't near 100%.

GPU encoders are good for benchmarking NLEs because the software is no longer limited by its encoding speed (in the case of NVENC). The slowness is in rendering (creating) the actual frames; NVENC can encode at hundreds of fps at 1080p, so it's not the limiting factor.

What you see is an inefficiency in the Vegas rendering engine. It's only capable of rendering at that speed, and has no use for the remaining 55% of your CPU and 80% of your GPU. Your GPU can only process at the rate it is sent the frames, both for processing and encoding. What could explain this? Latency can, if the CPU is doing nothing for much of the time. You can see what I mean when you use Voukoder. It removes some of the latency, your encoding is faster, and your CPU and GPU usage goes up.
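One way to picture the latency argument is a toy model in which each frame costs some CPU time, some GPU time, and some hand-off time during which neither does useful work. The per-frame times below are made-up numbers chosen only to resemble the 40-50% CPU and roughly 20% GPU figures reported above; they are not measurements of Vegas, and the strictly serial hand-off is an assumption.

```python
def serial_pipeline(cpu_ms, gpu_ms, stall_ms):
    """Toy model: per frame, the CPU works, then the GPU works, then
    both wait stall_ms on hand-off latency.
    Returns (frames per second, CPU busy share, GPU busy share)."""
    frame_ms = cpu_ms + gpu_ms + stall_ms
    fps = 1000.0 / frame_ms
    return fps, cpu_ms / frame_ms, gpu_ms / frame_ms

# Hypothetical per-frame costs in milliseconds (assumptions, not measurements).
fps, cpu, gpu = serial_pipeline(cpu_ms=4.5, gpu_ms=2.0, stall_ms=3.5)
print(f"with stalls:  {fps:.0f} fps, CPU {cpu:.0%}, GPU {gpu:.0%}")

# Same work per frame, less time lost to hand-offs.
fps2, cpu2, gpu2 = serial_pipeline(cpu_ms=4.5, gpu_ms=2.0, stall_ms=1.0)
print(f"fewer stalls: {fps2:.0f} fps, CPU {cpu2:.0%}, GPU {gpu2:.0%}")
```

In this model, cutting the stall time raises the frame rate and both utilization figures at once, which matches the pattern described for Voukoder, without either resource ever reaching 100%.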

So you would want to decrease latency. Setting GPU preview to 0 increases latency by a small amount, so if possible keep that at the default. I'm not sure there's much else you can do. It could be an inefficiency in using all those threads. Are they being used equally when you check Task Manager during GPU encoding?

dxdy wrote on 8/17/2020, 7:17 PM

The limiters are partly what you describe here: you are probably using FHD sources and rendering to FHD.
If you used heavy 4K sources and rendered to 4K AVC or HEVC, you might see a difference in performance, but you did not tell us that.

 

peter-d wrote on 8/17/2020, 7:19 PM

Your gpu can only process at the rate it is sent the frames both for processing and encoding.

I was planning to ask an unrelated question, and found your comment!

Is this why plug-ins that need more CPU or GPU resources than a PC can provide will sometimes render successfully to a TIFF image?

Extending on this: would a TIFF sequence saved to a TIFF sequence have more time/resources for some plug-ins?

Extending further: would a video with a divided fps also benefit from more time/resources for some plug-ins?

I ask after finding that video saved to a TIFF sequence rendered with plug-ins that normally fail due to PC specs.

Hope this makes sense.

dxdy wrote on 8/17/2020, 7:23 PM

Thank you for the information. I have tried a busy 4K clip, rendering to 4K as suggested, and the performance is similar. I noticed both the CPU and GPU loading had sawtooth profiles; perhaps the bottom of the sawtooth on the loading graph shows one waiting for the other hardware. This would imply that passing data from the GPU to the CPU and then back from the CPU to the GPU is taking a lot of time. My motherboard is an ASUS TUF GAMING X570-PLUS (WIFI); the box says PCIe 4.0 ready.

fifonik wrote on 8/17/2020, 8:23 PM

Rendering with the MAGIX AVC/AAC MP4 - INTERNET HD 1080P 29.97 NVIDIA NVENC @ 12Mbit/sec with GPU, it completed in 19 seconds. On the performance monitor I did not see what I expected: CPU usage in the mid 40s (40 - 50 percent), and GPU in the low 20% range. I'd set the variable rate to 12M - 24M, it came out 11.3M bitrate (per Mediainfo).

It is possible that you are not looking in the right place.
In your case the limiting factor could be the GPU encoder, which is not displayed by default in the W10 Performance monitor.

If you are checking "Video Processing", it only shows some load used by VP's internal processing.

If you are checking 'GPU Activity' (on the left), W10 is averaging the GPU load somehow. It might show the load as not very high while the GPU parts that are in use (the HW encoder, for example) are actually 100% loaded.
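A small illustration of this point: if a monitor blends several GPU engines into one number, a saturated encoder can hide behind a modest figure. The engine names and loads below are hypothetical, and the plain average is only an assumption about how such blending might work, not a description of what Windows 10 actually computes.

```python
# Hypothetical per-engine loads in percent (assumed values, not measured).
engine_load = {
    "3D": 22.0,
    "Copy": 6.0,
    "Video Decode": 12.0,
    "Video Encode": 100.0,
}

# A single blended figure (assumed here to be a plain average)...
blended = sum(engine_load.values()) / len(engine_load)
# ...versus the engine that is actually doing the most work.
busiest = max(engine_load, key=engine_load.get)

print(f"blended figure: {blended:.0f}%")
print(f"busiest engine: {busiest} at {engine_load[busiest]:.0f}%")
```

The practical takeaway is to look at the individual engine graphs rather than one summary number when judging whether the encoder is the bottleneck.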

For some footage the bottleneck could be in the GPU decoder or the CPU (if the video stream cannot be decoded using the GPU decoder).

Data transfer is usually not the limiting factor.
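A rough back-of-the-envelope check supports this. The sketch assumes uncompressed 8-bit RGBA frames and the theoretical peak of a PCIe 3.0 x16 link (about 15.75 GB/s in one direction); the pixel format Vegas actually uses internally and the real-world link throughput are not stated in this thread, so treat the result as an order-of-magnitude estimate only.

```python
WIDTH, HEIGHT = 1920, 1080
BYTES_PER_PIXEL = 4                # assumption: 8-bit RGBA, uncompressed
PCIE3_X16_BYTES_PER_S = 15.75e9    # assumption: theoretical peak, one direction

frame_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL
transfer_ms = frame_bytes / PCIE3_X16_BYTES_PER_S * 1000
max_fps_one_way = PCIE3_X16_BYTES_PER_S / frame_bytes

print(f"frame size:       {frame_bytes / 1e6:.1f} MB")
print(f"one-way transfer: {transfer_ms:.2f} ms")
print(f"link ceiling:     {max_fps_one_way:.0f} frames/s")
```

A 3840x2160 frame is four times larger, which under the same assumptions still leaves a ceiling in the hundreds of frames per second, well above the render speeds discussed here.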

Last changed by fifonik on 8/17/2020, 8:25 PM, changed a total of 1 times.

Camcorder: Panasonic X1500 + Panasonic X920 + GoPro Hero 11 Black

Desktop: MB: MSI B650P, CPU: AMD Ryzen 9700X, RAM: G'Skill 32 GB DDR5@6000, Graphics card: MSI RX6600 8GB, SSD: Samsung 970 Evo+ 1TB (NVMe, OS), HDD WD 4TB, HDD Toshiba 4TB, OS: Windows 10 Pro 22H2

NLE: Vegas Pro [Edit] 11, 12, 13, 15, 17, 18, 19, 22

Author of FFMetrics and FFBitrateViewer

fr0sty wrote on 8/18/2020, 1:24 AM

^What he said. There are 2 components to your GPU that do the processing here. One is using the GPU's compute cores... the main muscle they have, to help render the frames on the timeline. From there, there's a special chip on your GPU that does nothing but encode video, so the compute part of the GPU helps finish rendering the frame, then once the CPU and GPU have a frame, they pass it on to the encoder chip to finish processing. It may not require 100% of your GPU power to render the frame (there may only be so many effects used that can even be GPU accelerated), but once the frame is finished rendering, and is sent over to the encoder chip, it then runs at full capacity.
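The two-stage picture described here can be sketched as a simple producer/consumer model: frames are rendered at one rate and encoded at another, and the slower stage sets the pace. The rates below are hypothetical; the only figure taken from this thread is the claim that NVENC can encode 1080p at hundreds of frames per second.

```python
def two_stage(render_fps, encode_fps):
    """Steady-state throughput of a render -> encode pipeline, plus the
    share of time each stage is busy. Rates are in frames per second."""
    throughput = min(render_fps, encode_fps)
    return throughput, throughput / render_fps, throughput / encode_fps

# Hypothetical rates: the timeline renderer is the slow stage.
fps, render_busy, encode_busy = two_stage(render_fps=80.0, encode_fps=400.0)
print(f"{fps:.0f} fps, renderer busy {render_busy:.0%}, encoder busy {encode_busy:.0%}")
```

In this model the encoder works flat out on each frame it receives but sits idle between frames, so its time-averaged load stays low even though it is never the part that is waiting.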

However, I can't say I agree with the above statement about using voukoder, as it too must wait for the frame to be rendered before it is able to do anything with it, so any performance gains there would simply be more efficient use of the encoder chip (or the latency holding VEGAS back from rendering frames faster actually happens after the frame is rendered, somehow).

Systems:

Desktop

AMD Ryzen 7 1800x 8 core 16 thread at stock speed

64GB 3000mhz DDR4

Geforce RTX 3090

Windows 10

Laptop:

ASUS Zenbook Pro Duo 32GB (9980HK CPU, RTX 2060 GPU, dual 4K touch screens, main one OLED HDR)

Howard-Vigorita wrote on 8/18/2020, 11:26 AM


I found a copy of the "Red Car" press release .veg and media files, dated in 2014.  The programs and OS are on an SSD. Source files on a 7200 RPM HDD, output to another SSD.

...

I have searched through the Vegas forum for the Red Car results (someone had a table of values for various CPUs), and it seems to me that both these results are pretty quick. Is it still out there? Anyone have a copy?

Links to the latest online chart and a Google Drive with the bench zips can be found in this thread.