Some FX tests on Arc, GeForce and Radeon

Deathspawner wrote on 3/13/2023, 11:52 AM

Hi all:

I spent part of the weekend researching how some of the FX in VP20 work on different GPU vendors, and while I will eventually tie this in with an article at Techgage, I wanted to share the info here first, since it does raise a number of interesting angles. I tested AMD's Radeon RX 6600, Intel's Arc A770, and NVIDIA's GeForce RTX 3060 - all roughly the same SRP - with the latest available drivers. Sorry for the subpar formatting here:

(GPU Usage % / CPU Usage %)

4K/60 AVC to AVC (VCE/QuickSync/NVENC Transcode)
    AMD 6600: 1m 40s (39% / 15%)
    Intel 770: 1m 33s (92% / 34%)
    NVIDIA 3060: 3m 14s (30% / 17%)

4K/60 AVC to HEVC (VCE/QuickSync/NVENC Transcode)
    AMD 6600: 1m 50s (39% / 15%)
    Intel 770: 1m 25s (91% / 35%)
    NVIDIA 3060: 4m 27s (23% / 15%)

Add Noise FX
    AMD 6600: 28s (52% / 17%)
    Intel 770: 33s (83% / 21%)
    NVIDIA 3060: 41s (31% / 22%)

Black Bar Fill FX
    AMD 6600: 1m 16s (74% / 8%)
    Intel 770: 1m 6s (84% / 34%)
    NVIDIA 3060: 3m 32s (83% / 28%)

Bump Map FX
    AMD 6600: 28s (41% / 23%)
    Intel 770: 35s (84% / 23%)
    NVIDIA 3060: 1m 6s (20% / 8%)

Gaussian Blur FX
    AMD 6600: 36s (62% / 11%)
    Intel 770: 33s (82% / 16%)
    NVIDIA 3060: 53s (33% / 18%)

Linear Blur FX
    AMD 6600: 29s (61% / 13%)
    Intel 770: 33s (81% / 25%)
    NVIDIA 3060: 54s (43% / 15%)

Median FX
    AMD 6600: 1m 21s (91% / 5%)
    Intel 770: 1m 20s (96% / 17%)
    NVIDIA 3060: 2m 41s (75% / 30%)

Min and Max FX
    AMD 6600: 2m 27s (92% / 6%)
    Intel 770: 2m 40s (95% / 21%)
    NVIDIA 3060: 3m 28s (93% / 47%)

Pixelate FX
    AMD 6600: 25s (58% / 18%)
    Intel 770: 30s (81% / 20%)
    NVIDIA 3060: 50s (44% / 18%)

Spherize FX
    AMD 6600: 24s (51% / 18%)
    Intel 770: 32s (79% / 15%)
    NVIDIA 3060: 46s (26% / 12%)

PROBLEMATIC:

Defocus FX
    AMD 6600: 7m 51s (4% / 87%)
    Intel 770: 1m 16s (91% / 32%)
    NVIDIA 3060: 2m 18s (73% / 22%)

Newsprint FX
    AMD 6600: 37s (56% / 10%)
    Intel 770: Causes PC to become unusable
    NVIDIA 3060: 1m 8s (35% / 11%)

Starburst FX
    AMD 6600: 30m 27s (2% / 90%)
    Intel 770: 1m 30s (85% / 42%)
    NVIDIA 3060: 3m 0s (77% / 21%)

The key takeaway I immediately see is that AMD and Intel are almost as performant as one another - both trade blows depending on the test. Unfortunately, Intel couldn't run the Newsprint FX without lagging the PC to the point that it needed a hard reboot, and AMD's GPU was barely utilized in Defocus or Starburst (4% and 2% GPU usage), which means they took ages to run because they were relegated to the CPU.

Meanwhile, NVIDIA was the slowest of the bunch, but it seems that's primarily due to the fact that encoding even without FX is so much slower than the others (the first two results highlight that). I plan to test encoding with Voukoder when I can, but for now just wanted to stick to the built-in encoder. It could be that Voukoder may be much kinder to GeForce.
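One way to put numbers on the gap is to express each time as a multiple of the fastest card in the same test. The following is only a minimal sketch: the helper names `to_seconds` and `relative_to_fastest` are made up for illustration, and the three rows are copied from the results above.

```python
import re

def to_seconds(text):
    """Convert a duration such as '3m 14s' or '28s' into whole seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?\s*(?:(\d+)s)?", text.strip())
    if match is None or not any(match.groups()):
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes = int(match.group(1) or 0)
    seconds = int(match.group(2) or 0)
    return minutes * 60 + seconds

def relative_to_fastest(times):
    """Map each GPU to its render time divided by the fastest time in the test."""
    secs = {gpu: to_seconds(t) for gpu, t in times.items()}
    fastest = min(secs.values())
    return {gpu: s / fastest for gpu, s in secs.items()}

# A few rows copied from the results in this post (not the full table).
RESULTS = {
    "AVC to AVC": {"AMD 6600": "1m 40s", "Intel 770": "1m 33s", "NVIDIA 3060": "3m 14s"},
    "Gaussian Blur": {"AMD 6600": "36s", "Intel 770": "33s", "NVIDIA 3060": "53s"},
    "Min and Max": {"AMD 6600": "2m 27s", "Intel 770": "2m 40s", "NVIDIA 3060": "3m 28s"},
}

for test, times in RESULTS.items():
    ratios = relative_to_fastest(times)
    summary = ", ".join(f"{gpu} {ratio:.2f}x" for gpu, ratio in ratios.items())
    print(f"{test}: {summary}")
```

On these three rows the GeForce lands at roughly 2.1x, 1.6x and 1.4x the fastest time, respectively.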

When I compared the output quality from each encode, none of them stood out as being any different from the others. There have been times in the past where one vendor's output would be broken somewhere, but I couldn't spot issues here.

While all of these tested FX perform differently from vendor to vendor when using equal GPUs, it's really hard to tell right now which ones would actually benefit from faster models, in the same way faster GPUs definitively speed up 3D rendering. There are few FX that use all three GPU vendors to great effect. Median and Min and Max seem to come closest.

 

Comments

Howard-Vigorita wrote on 3/13/2023, 12:08 PM

Any chance we could get the sample clips used? Or maybe the mediainfo.

Deathspawner wrote on 3/13/2023, 12:12 PM

Any chance we could get the sample clips used? Or maybe the mediainfo.

All of the sources used for these FX tests are similar to this:

General
Complete name                            : \\tg-ds1019\TGBench\Windows\Development\Sources\Assets\Shared\Encoding\MAGIX VEGAS Pro\Sources\Cologne - Lit Bridge At Night.mp4
Format                                   : MPEG-4
Format profile                           : Base Media / Version 2
Codec ID                                 : mp42 (isom/mp42)
File size                                : 355 MiB
Duration                                 : 24 s 649 ms
Overall bit rate                         : 121 Mb/s
Encoded date                             : UTC 2018-08-21 19:33:32
Tagged date                              : UTC 2018-08-21 19:33:32

Video
ID                                       : 1
Format                                   : AVC
Format/Info                              : Advanced Video Codec
Format profile                           : High@L5.2
Format settings                          : CABAC / 1 Ref Frames
Format settings, CABAC                   : Yes
Format settings, Reference frames        : 1 frame
Format settings, GOP                     : M=1, N=30
Codec ID                                 : avc1
Codec ID/Info                            : Advanced Video Coding
Duration                                 : 24 s 649 ms
Source duration                          : 24 s 647 ms
Bit rate                                 : 120 Mb/s
Width                                    : 3 840 pixels
Height                                   : 2 160 pixels
Display aspect ratio                     : 16:9
Frame rate mode                          : Variable
Frame rate                               : 59.560 FPS
Minimum frame rate                       : 29.821 FPS
Maximum frame rate                       : 59.682 FPS
Real frame rate                          : 60.000 FPS
Standard                                 : NTSC
Color space                              : YUV
Chroma subsampling                       : 4:2:0
Bit depth                                : 8 bits
Scan type                                : Progressive
Bits/(Pixel*Frame)                       : 0.244
Stream size                              : 354 MiB (100%)
Source stream size                       : 354 MiB (100%)
Title                                    : VideoHandle
Language                                 : English
Encoded date                             : UTC 2018-08-21 19:33:32
Tagged date                              : UTC 2018-08-21 19:33:32
Color range                              : Limited
colour_range_Original                    : Full
Color primaries                          : BT.2020
colour_primaries_Original                : BT.601 PAL
Transfer characteristics                 : BT.709
transfer_characteristics_Original        : BT.601
Matrix coefficients                      : BT.2020 non-constant
matrix_coefficients_Original             : BT.470 System B/G
mdhd_Duration                            : 24649
Codec configuration box                  : avcC
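
As a quick sanity check on the report above, the overall bit rate and the Bits/(Pixel*Frame) figure can be approximated from the other fields. This is a rough sketch using the rounded values MediaInfo prints, so small differences from the reported numbers are expected; the function names are made up for illustration.

```python
MIB = 1024 * 1024  # bytes in one mebibyte

def overall_bit_rate_mbps(file_size_mib, duration_s):
    """Approximate overall bit rate in Mb/s from file size and duration."""
    return file_size_mib * MIB * 8 / duration_s / 1_000_000

def bits_per_pixel_frame(bit_rate_mbps, width, height, fps):
    """Approximate bits spent per pixel per frame."""
    return bit_rate_mbps * 1_000_000 / (width * height * fps)

# Rounded values taken from the report above.
overall = overall_bit_rate_mbps(355, 24.649)
density = bits_per_pixel_frame(120, 3840, 2160, 59.560)

print(f"Overall bit rate: {overall:.1f} Mb/s")
print(f"Bits/(Pixel*Frame): {density:.3f}")
```

Both values come out close to what MediaInfo reports (121 Mb/s and 0.244); the small gap is expected because the inputs here are already rounded.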

Edit: Actually, I should ask... which source material would be _best_ for any of these encode tests? I figure straight from a smartphone will be a common usage, so I've been using that footage a while. If I can get some playback tests going again, I'll expand beyond those.

Todd-A0 wrote on 3/13/2023, 5:21 PM

Edit: Actually, I should ask... which source material would be _best_ for any of these encode tests? I figure straight from a smartphone will be a common usage, so I've been using that footage a while

@Deathspawner VFR can cause a slowdown in Vegas, so it is best to use CFR AVC video if you're using Vegas as a GPU benchmark tool rather than reviewing Vegas itself. In the examples I've seen, the slowdown is not an inefficient use of GPU/CPU, where the resource values stay high and the encode speed is low; instead, Vegas uses less CPU and GPU, and so less of the hardware encoder. As best I can tell, I don't see an obvious slowdown in your benchmarks.
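
For anyone wanting to screen sources for VFR before benchmarking, here is a minimal sketch that reads the frame-rate fields out of a MediaInfo text report like the one posted above. It only parses pasted text and does not call MediaInfo itself; the function names are hypothetical, and the sample lines are copied from the report in this thread.

```python
def parse_fields(report):
    """Parse 'Key : Value' lines from a MediaInfo text report into a dict.

    If a key appears more than once, the first occurrence is kept.
    """
    fields = {}
    for line in report.splitlines():
        key, sep, value = line.partition(" : ")
        if sep:
            fields.setdefault(key.strip(), value.strip())
    return fields

def is_variable_frame_rate(fields):
    """Return True when the report flags the video stream as VFR."""
    return fields.get("Frame rate mode", "").lower() == "variable"

def frame_rate_range(fields):
    """Return (minimum, maximum) frame rate in FPS as floats."""
    low = float(fields["Minimum frame rate"].split()[0])
    high = float(fields["Maximum frame rate"].split()[0])
    return low, high

# Lines copied from the MediaInfo report earlier in the thread.
SAMPLE = """\
Frame rate mode                          : Variable
Frame rate                               : 59.560 FPS
Minimum frame rate                       : 29.821 FPS
Maximum frame rate                       : 59.682 FPS
"""

fields = parse_fields(SAMPLE)
if is_variable_frame_rate(fields):
    low, high = frame_rate_range(fields)
    print(f"VFR source: frame rate varies between {low} and {high} FPS")
```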

Meanwhile, NVIDIA was the slowest of the bunch, but it seems that's primarily due to the fact that encoding even without FX is so much slower than the others

You could try encoding to a low-resource CPU codec such as MPEG-2 or maybe ProRes at 1080p or even 720p. The idea is to reduce the CPU encoder overhead as much as possible. That should give a more accurate comparison of GPU processing, assuming you're swapping out the GPU in the same computer. This is starting to get convoluted, but even if you never published those figures, it would help you understand which GPU is fastest at video processing and how much the NVENC pause (every 60 frames) skews results.

RogerS wrote on 3/13/2023, 7:20 PM

I wonder if this is revealing decoding issues with NVIDIA cards in Vegas more than anything else. Could you try CFR AVC from a camera?

Custom PC (2022) Intel i5-13600K with UHD 770 iGPU with 31.0.101.4091 driver, MSI z690 Tomahawk motherboard, 64GB Corsair DDR5 5200 ram, NVIDIA 2080 Super (8GB) with latest studio driver, 2TB Hynix P41 SSD, Windows 11 Pro 64 bit

Dell XPS 15 laptop (2017) 32GB ram, NVIDIA 1050 (4GB) with latest studio driver, Intel i7-7700HQ with Intel 630 iGPU (driver 31.0.101.2115), dual internal SSD (256GB; 1TB), Windows 10 64 bit

Vegas 19.648
Vegas 20.270

VEGAS 4K "sample project" benchmark: https://forms.gle/ypyrrbUghEiaf2aC7
VEGAS Pro 20 "Ad" benchmark: https://forms.gle/eErJTR87K2bbJc4Q7

Howard-Vigorita wrote on 3/14/2023, 10:59 AM

Edit: Actually, I should ask... which source material would be _best_ for any of these encode tests? I figure straight from a smartphone will be a common usage, so I've been using that footage a while. If I can get some playback tests going again, I'll expand beyond those.

8-bit 4:2:0 avc with n=30 is fairly easy to decode, so it shouldn't overshadow encoding or fx processing performance. Intra avc might be a little easier to decode and would do less to distinguish Intel cpus with igpu decoding from systems without it. ProRes 422 is also pretty easy to decode but is strictly cpu on typical Windows systems.

If you want to focus on decoding performance, more difficult but compact accelerated 10-bit 4:2:0 hevc is common to most gpus. Whereas 4:2:2 hevc is only available on late-model Intel igpus and Arc. Really relevant only to users like myself who shoot longer sessions on space-limited media and select their cameras and workstation components accordingly.

Wolfgang S. wrote on 3/14/2023, 11:13 AM

 

If you want to focus on decoding performance, more difficult but compact accelerated 10-bit 4:2:0 hevc is common to most gpus. Whereas 4:2:2 hevc is only available on late-model Intel igpus and Arc. Really relevant only to users like myself who shoot longer sessions on space-limited media and select their cameras and workstation components accordingly.

That is a fundamental decision. Frankly speaking, I never would be happy to shoot in HEVC, since I know that this footage is so hard to encode. If you know that, you could easily decide to use H.264 with long-GOP or even all-I. This will bring you significantly more reserves in terms of playback behaviour, and also for tough operations like the 32-bit floating point mode together with ACES.

So from that side, using different types of footage combined with different project settings would be the best approach.

Render time is only one part, and gets attention maybe because it is so easy to measure. But playback behaviour is the more important part, since you have to edit the footage, which requires playback reserves that are high enough. But that is not so simple to measure.

Howard-Vigorita wrote on 3/14/2023, 1:33 PM

I never would be happy to shoot in HEVC, since I know that this footage is so hard to encode. If you know that, you could easily decide to use H.264 with long-GOP or even all-I.

@Wolfgang S. I could not have made the transition to a 4k workflow without hevc. Going from a 50 Mbps HD workflow to 4k avc would have increased my media size and speed requirements 4x, which would not have been practical for my 3 to 5-hour shoots. Hevc let me cut that in half while increasing edit/render performance and quality. But it did require carefully testing camera footage and hardware before making the move.
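
To make the storage side of that concrete, here is a back-of-the-envelope sketch. The 50 Mbps figure and the 3 to 5-hour range come from the post; the 4k avc and hevc bit rates are assumptions derived only from the "4x" and "cut that in half" statements, not actual camera settings, and audio is ignored.

```python
def media_size_gb(bit_rate_mbps, hours):
    """Storage in decimal gigabytes for a constant video bit rate over a duration."""
    return bit_rate_mbps * hours * 3600 / 8 / 1000

HD_AVC_MBPS = 50                   # stated in the post
UHD_AVC_MBPS = HD_AVC_MBPS * 4     # assumption: the "4x" increase applied to bit rate
UHD_HEVC_MBPS = UHD_AVC_MBPS / 2   # assumption: hevc "cut that in half"

for hours in (3, 5):
    print(
        f"{hours} h shoot: "
        f"HD avc {media_size_gb(HD_AVC_MBPS, hours):.1f} GB, "
        f"4k avc {media_size_gb(UHD_AVC_MBPS, hours):.1f} GB, "
        f"4k hevc {media_size_gb(UHD_HEVC_MBPS, hours):.1f} GB"
    )
```

Under those assumptions, a 3-hour shoot goes from 67.5 GB in HD avc to 270 GB in 4k avc, or 135 GB in 4k hevc.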

I've never had any luck editing long-GOP. None of the hardware I've tested likes it. My 4k cameras shoot hevc at n=15, which is good. When I work with others, their all-I (intra, n=1) is the easiest going, but clip sizes are a little bit larger. ProRes, btw, is all-I and its clips are a lot larger.