Cores and threads?

vicmilt wrote on 9/1/2007, 5:46 AM
Hello to all my technical buddies...

I have been having troubles with long renders as detailed in another thread.

My initial render time for one hour ten minutes of HDV m2t, plus some Cineform 1080i intermediates, was 6+ hours to NTSC widescreen DV MPEG2 for DVD Architect, using the defaults in Vegas 7d.

In researching this forum I found two changes which I applied:
1 - changed the render from reading and writing on one SATA RAID to using two SATA drives, one for source and one for render
2 - changed "threads" from 4 to 2

My question is about those threads.
I'm using a Q6600 quad core with 2 GB of RAM.
When I changed the threads from 4 to 2 the CPU utilization DROPPED from 87% to less than 50% - yet the render time also DROPPED from 6+ hours to 94 minutes.

Now "Logic" would lead me to assume that More Cores would utilize More Threads and further that Higher CPU utilization would yield Faster Renders.

Could somebody explain this whole thing to me (and hundreds of others), please?

v

Comments

DJPadre wrote on 9/1/2007, 7:11 AM
Yes, I raised this question in the rendertest post a little earlier.

I'm yet to hear a good justification for how 2 cores can meet or even exceed the speed of 4.


RBartlett wrote on 9/1/2007, 9:07 AM
This will need further clarification from senior developers and those in contact with Sony engineering. Rendering a Vegas project appears to be very similar to how the migration of wildebeest occurs across the Serengeti.

The migration only finishes when the last beast has stopped running and grazes in its final area. The size of the migration is affected by the type of input and output codecs in operation (same for audio), the MByte/sec requirements of the I/O formats, timeline filters, the number (i.e. depth) of filters and the RAM preview settings. Those are something akin to the delay caused by crocodile-infested rivers en route.

So watch the Dynamic RAM preview settings. Also, 50% CPU on a quad core is still two cores running flat out. That isn't so bad. So with 4 threads running, the excessive number of hours for your type of project is probably the result of the render (the beasts in migration) taking too wide a path and running into too many backoffs (croc-infested rivers), so the processing advantage takes a tumble.

In my book there is nothing wrong with a quad core system that reaches 25% full load during an A/V render. Of course it would be nicer to see 98% and reduce the times by a factor of 4.

Try swapping your input and output files to DV. If the problem goes away, it may be the current interfacing between Sony and MainConcept that is a bit sticky.

Vegas 7 has certainly been seen to work well with more cores, so the app is certainly threaded. However, it may not be threaded to the right granularity in certain instances. Whatever this project consists of, it must be one of the mixes that doesn't scale with more cores. Although it could just be the impact of the Dynamic RAM preview setting, rendering in 'BEST' or the mixture of Sony and 3rd party treatments.

Sorry if the analogy didn't work for you. It must have been the smell of a neighbor's barbecue that got me thinking about wildebeest.... :)


P.S. I was once told by a friend that drinking alcohol was good for you, even in quantity. Due to the fact that the herd only moved as fast as the weakest members. So by the same argument, the brain would be free from the shoddy braincells that would surely be taken by the consumption sooner rather than later during that indulgence. Akin to nature's way of culling the weakest. If you put the pair of us together, the result wasn't larger than the sum of the two parts either!

PPS, my next thought would be to run up two copies of Vegas in 2-thread mode on this quad core system and see if you approach 100% CPU then. This way you know it is not a system bottleneck or a major codec inability. Although the same codec will probably be loaded into Vegas' private memory sets twice. Depends on how system-wide the resource being accessed is.
JohnnyRoy wrote on 9/1/2007, 9:43 AM
I am not seeing the same results, but I’m not playing with different hard drives either. First you have to make sure that you are measuring what you "think" you are measuring. Did you render to completion, or was that a Vegas estimate, after which you canceled? I ask because many of our projects start with titles and other generated media, which take longer to render, and since Vegas doesn't know how much of the project is like this, it reports extremely long render times. Then it corrects them as it gets to the HDV media alone. However, it may have had the beginning cached the second time you ran it, and so it reported a much lower time. To really measure consistently, it helps to keep the test simple and render to completion. You should also change only one thing at a time (i.e., change the threads OR the hard drives but NOT both!).

There is also a correlation between RAM preview and threads going on here. This may be why people are seeing different results. It’s not only the number of threads that affects the timing. Finally, threads do not equal CPUs/cores. So there are a lot of parameters that affect each other.

I just tried this little test in Vegas 7.0e. I selected one minute of HDV from the timeline that had no generated media but did have color correction so that each frame needs to be processed. Here is what I measured:

One Minute of HDV:

Threads=1 RAM Preview=0 Time=1:22
Threads=1 RAM Preview=255 Time=1:01 <== sweet spot
Threads=1 RAM Preview=511 Time=1:05
Threads=1 RAM Preview=1024 Time=1:14

Threads=2 RAM Preview=0 Time=1:22
Threads=2 RAM Preview=256 Time=0:44 <== sweet spot
Threads=2 RAM Preview=511 Time=0:46
Threads=2 RAM Preview=1024 Time=0:51

Threads=4 RAM Preview=0 Time=1:22
Threads=4 RAM Preview=256 Time=0:46
Threads=4 RAM Preview=511 Time=0:45 <== sweet spot
Threads=4 RAM Preview=1024 Time=0:49

What you can see is that more isn’t always better (but zero is definitely not good!). There is a point where throwing more resource at the problem doesn’t increase anything and may actually degrade things. Something interesting that I observed was that when I changed the RAM preview to a setting that was too high to make a difference Vegas warned me. Also the point at which Vegas warns about allocating too much memory to RAM preview changed as the number of threads changed! When I had 1 thread it warned me at 256. When I had 2 or more threads it warned me at 512. It’s like Vegas is saying that based on the number of threads, it can only use so much preview memory.
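For anyone who wants to repeat this kind of comparison, here is a minimal Python sketch that takes the timings from the table above and picks the fastest RAM preview setting for each thread count. The numbers are copied from the test as posted; the function names are just illustrative and nothing here talks to Vegas itself.

```python
# Timings copied from the one-minute HDV test above:
# (threads, RAM preview in MB, render time as "m:ss").
RESULTS = [
    (1, 0, "1:22"), (1, 255, "1:01"), (1, 511, "1:05"), (1, 1024, "1:14"),
    (2, 0, "1:22"), (2, 256, "0:44"), (2, 511, "0:46"), (2, 1024, "0:51"),
    (4, 0, "1:22"), (4, 256, "0:46"), (4, 511, "0:45"), (4, 1024, "0:49"),
]


def to_seconds(mmss):
    """Convert a 'm:ss' string into a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def best_per_thread_count(results):
    """Return {threads: (ram_preview_mb, seconds)} for the fastest run of each thread count."""
    best = {}
    for threads, ram_mb, mmss in results:
        secs = to_seconds(mmss)
        if threads not in best or secs < best[threads][1]:
            best[threads] = (ram_mb, secs)
    return best


for threads, (ram_mb, secs) in sorted(best_per_thread_count(RESULTS).items()):
    print(f"threads={threads}: fastest at RAM preview={ram_mb} MB ({secs} s)")
```

It reproduces the "sweet spot" markers in the table (255, 256 and 511 MB). Swapping in your own measurements only means editing the RESULTS list.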

I also noticed that CPU usage at 1 thread was around 39% while CPU usage at 2 threads was 88%. So all 4 CPUs were being used with only 2 threads. So threads do not equal CPUs used. (btw, 4 threads brought the CPU up to 90%) What might be happening is that a setting of 4 threads gives Vegas permission to use up to 4 threads, but it might not if it doesn’t have 4 things to do at once. Also, some FX are not multi-threaded, which would stop Vegas from using more threads. This may account for more threads not equaling a faster render.
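One textbook way to picture why non-threaded FX limit the gain from extra threads is Amdahl's law. The sketch below is only that general model, not a description of how Vegas actually schedules its work, and the parallel fractions in it are made-up values for illustration.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Ideal speedup when only part of the work can be spread over several workers.

    parallel_fraction: share of the work (0.0 to 1.0) that can run in parallel.
    workers: number of threads/cores working on the parallel share.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Hypothetical fractions, chosen only to show the shape of the curve.
for fraction in (0.5, 0.9, 1.0):
    line = ", ".join(
        f"{workers} workers: {amdahl_speedup(fraction, workers):.2f}x"
        for workers in (1, 2, 4)
    )
    print(f"parallel share {fraction:.0%} -> {line}")
```

Note that this model can only explain gains that flatten out. A render that gets slower with more threads needs something extra, such as contention for memory, disk or a codec, which the model does not cover.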

Finally, I monitored Page File usage and found it directly proportionate to the RAM Preview amount. When it was set to 0 the PF hardly increased, when set to 512 the PF jumped about a half gig and at 1024 it took an additional 1GB of page file. So RAM preview affects page file usage which affects performance as well depending on how fast your page file drive is.

Are you totally confused yet? This may not answer your question, but it shows that it is not as simple as threads = CPUs used.

~jr
RBartlett wrote on 9/1/2007, 10:02 AM
Very interesting, and thanks JohnnyRoy. Some aspects of the CPU utilization are surely down to what Vegas does with its render engine. A programmer rarely does things all in-line, so I'd imagine certain situations are handled where the codec is shunted on to the least recently used CPU. Yet it is tethered by what it is being fed by Vegas's supervising rendering engine itself.

When it comes down to it, this is a plumbing exercise. You either increase the pressure or put in more pipes. Intel tell us that we can't go over 4GHz for now. IBM tell us that we could if Intel licensed IBM's silicon on insulator patent. Until then, PS3 and XBOX 360 type platforms are the only beneficiaries.

What you've pointed out here is the opposite argument. That more cores are handled well and that utilization overall benefits and arguably the number of cores becomes more important than the sheer clock speed.

This is where I was coming from on another thread when I suggested that one could have both nearly 4GHz CPUs and quad-core (possibly 8 core on Xeon, but overclocking handles are usually removed from premium computers - just one of those weirdnesses like having a limiter/governor on a performance car at 155MPH).

Sorry for the sliding off-topic.

The thread parameter clearly has an impact on performance. That is good enough to know. The trouble we have at the moment is that some projects are better with small numbers of threads and others are better with high numbers. Testers have shown how much results are affected, sometimes drastically, and some get results that differ from yours on the published rendertest that we're drawn towards using after visiting these forums.


So the recommendation is probably to start your render, but if you go many times over the length of the program itself then perhaps you ought to tweak the settings for next time. I typically render when I'm not at my machine, so this matters very little to my way of working. I'm always interested in ways to keep the FPS at 25/30 on the preview monitor.

I'm not sure how much caching ever goes on with Vegas. It isn't quite its rendering paradigm to pre-calculate items. Although I'm sure that there is some of this going on in there somewhere.
vicmilt wrote on 9/1/2007, 6:49 PM
I "knew" it was something...

|:>)

v
blink3times wrote on 9/1/2007, 8:38 PM
I'm yet to hear a good justification for how 2 cores can meet or even exceed the speed of 4..
========================================================
This whole thing is quite interesting, because if you look at the render test thread, you will see that the Q6600 quad is pretty much twice as fast as the E6600 dual core.... same CPU... one just has twice the cores.
farss wrote on 9/1/2007, 11:22 PM
No one seems to have mentioned that you're rendering from CF, not HDV. Several people complained some time ago that CF performance declined in V7. Regardless, I'd be checking the CF website to ensure you have the latest build, and if that doesn't help, raise it on one of their fora.
For a simple test compare the times rendering from the original HDV and rendering from the CF version.

Bob.
tfc wrote on 9/3/2007, 5:13 PM
O.K., now I'm totally confused. Can someone give some guidelines on how many threads to use for rendering, something that does not involve mathematics? Here is what I have been able to gather from this:

1) Number of threads has no relation to number of CPU cores.
2) Increasing the number of threads, even with a multi-core machine, may actually increase rendering times in some instances. (What are they?)
3) There is absolutely no way to know, no guidelines whatsoever, on how many threads to put in the preferences for your particular machine, other than doing an empirical analysis of many different variables.

Are all my assumptions correct?


Another question - does the number of threads necessarily equal the number of f/x's one places on the timeline? Is there any correlation of the number of threads to ANYTHING?

Can someone please give some guidance on the threads issue?
JohnnyRoy wrote on 9/3/2007, 6:46 PM
> Can someone please give some guidance on the threads issue?

The best general guidance that I can give is to use the Vegas defaults. In all of the testing that I did, the defaults always gave me the best rendering times. On the Options / Preferences / Video tab press the Default All button at the bottom and stick with that. Vegas will take into account the number of cores and amount of memory you have and make an optimal selection.

~jr
xberk wrote on 9/3/2007, 7:29 PM
Don't know the technical stuff on this issue -- but I do think that CPU cores and threads are related .. at least on my machine .. .. I just ran John Cline's rendertest.veg with my Q6600 quad core set to 2 threads. It ran in about 4 min whereas it runs in 2 min 8 seconds when the preferences are set for 4 threads.

Paul B .. PCI Express Video Card: EVGA VCX 10G-P5-3885-KL GeForce RTX 3080 XC3 ULTRA ,,  Intel Core i9-11900K Desktop Processor ,,  MSI Z590-A PRO Desktop Motherboard LGA-1200 ,, 64GB (2X32GB) XPG GAMMIX D45 DDR4 3200MHz 288-Pin SDRAM PC4-25600 Memory .. Seasonic Power Supply SSR-1000FX Focus Plus 1000W ,, Arctic Liquid Freezer II – 360MM .. Fractal Design case ,, Samsung Solid State Drive MZ-V8P1T0B/AM 980 PRO 1TB PCI Express 4 NVMe M.2 ,, Wundiws 10 .. Vegas Pro 19 Edit

xberk wrote on 9/3/2007, 7:39 PM
Threads=2 RAM Preview=256 Time=0:44 <== sweet spot
------------------------------------------------------------------------------------
I just tried this "sweet spot" setting and got a 4 min time on the rendertest.veg.
I agree that a RAM preview of zero significantly slows rendering, but it seems that
anything greater than zero ( even 1 ) works just as well as any other higher number.
At least on my Q6600, my times when using 4 threads are nearly twice as fast as 2 threads as long as RAM preview is greater than zero.

4eyes wrote on 9/3/2007, 9:27 PM
How does this relate to quality? I could probably understand rendering to Cineform/DV (lossless) codecs.

But I've read in a few encoder programs that too many threads (not CPUs) for rendering can decrease quality, especially when encoding compressed formats.
megabit wrote on 9/4/2007, 5:03 AM
JohnnyRoy,
May I know how much RAM you had on your system when you ran your test?

AMD TR 2990WX CPU | MSI X399 CARBON AC | 64GB RAM@XMP2933  | 2x RTX 2080Ti GPU | 4x 3TB WD Black RAID0 media drive | 3x 1TB NVMe RAID0 cache drive | SSD SATA system drive | AX1600i PSU | Decklink 12G Extreme | Samsung UHD reference monitor (calibrated)

megabit wrote on 9/18/2007, 3:52 AM
I'd like to bump this up a little, because all the benchmark results contained herein are only valid for Vegas 7 - how about VP8?

One thing I noticed is that "default all" in the Video preferences tab behaves differently than in V7 - I have 4GB of RAM (OK; some 3.2GB only visible under XP x86, but the same story applies to my Vista x64 installation). Vegas 7 defaults to 511 MB RAM buffer; VP8 defaults to 128 MB only!

So, did anyone find a "sweet spot" for VP8? Also, what are your Rendertest-hdv.veg benchmark results, in various RAM buffer, threads number and 32bit/8bit video handling scenarios?

xberk wrote on 9/18/2007, 9:57 PM
Reran the render test on my Q6600 with 4 gig of Ram using VP8.
Getting 2 min 14 sec with Dynamic Ram set at anything above zero.
(About the same as under V7).
Zero Dynamic Ram shoots the times up to about 8 plus minutes!!!
Setting threads to less than 4 moves times up about as you would expect,
3 threads comes in at about 3 min and setting to 2 threads gives me about 4 mins.
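To put a number on "about as you would expect", here is a small Python sketch that compares each thread count against the 2-thread run. The times are the rounded figures quoted above (2 min 14 sec, about 3 min, about 4 min), so the results are rough, and the function name is only illustrative.

```python
# Approximate render times in seconds, taken from the figures quoted above.
TIMES = {2: 240, 3: 180, 4: 134}


def relative_scaling(times, baseline_threads):
    """Return {threads: (speedup, efficiency)} relative to the baseline thread count.

    speedup    = baseline time / time at this thread count
    efficiency = speedup / (threads / baseline threads), where 1.0 is perfect scaling
    """
    base = times[baseline_threads]
    scaling = {}
    for threads, secs in sorted(times.items()):
        speedup = base / secs
        ideal = threads / baseline_threads
        scaling[threads] = (speedup, speedup / ideal)
    return scaling


for threads, (speedup, efficiency) in relative_scaling(TIMES, 2).items():
    print(f"{threads} threads: {speedup:.2f}x vs 2 threads, efficiency {efficiency:.0%}")
```

With these rounded inputs, both 3 and 4 threads land close to 90% of ideal scaling, which fits the description of this test scaling nearly linearly on the Q6600.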

I conclude that VP8 is not rendering this test faster, but I've tried a few other real projects V7 vs VP8 on the same Q6600 machine and times have improved about 10% under VP8 .. I think it depends on the material and FX etc etc..
Of course, the render test is set to 8 bit...Setting to 32 bit slows things down too.


JohnnyRoy wrote on 9/19/2007, 3:29 AM
> Threads=2 RAM Preview=256 Time=0:44 <== sweet spot
> ------------------------------------------------------------------------------------
> I just tried this "sweet spot" setting and got a 4 min time on the rendertest.veg.

You have to understand what I was measuring. What this says is that 256 preview is the sweet spot for 2 threads on my Quad Core. It is not the sweet spot for rendering in general. Using 4 threads will always render faster, especially if you have a Quad Core.

~jr
JohnnyRoy wrote on 9/19/2007, 3:32 AM
> JohnnyRoy, May I know how much RAM you had on your system when you run your test?

I have 4GB of RAM (of which XP Pro only sees 3.25GB). :( The specs for my system are on the PC Equipment page of my web site.

~jr
Sunflux wrote on 9/19/2007, 4:29 AM
When I first got V7 I did a bunch of dynamic memory tests and came up with the following for a (at the time) recommended render test:

0mb: 4:15
8mb: 4:45
16mb: 4:41
64mb: 4:38
96mb: 4:34
128mb: 3:51
192mb: 4:08
256mb: 2:33
384mb: 2:33
511mb: 2:31 (default)
512mb: 2:32
1024mb: 2:34

Don't know how this translates to VP8 yet.
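A quick way to read a list like this is to ask for the smallest dynamic RAM setting that gets within a couple of percent of the best time. The Python sketch below does that with the V7 timings listed above; the 2% tolerance is an arbitrary choice for illustration, not a recommendation.

```python
# Dynamic RAM setting in MB -> render time as "m:ss", copied from the list above.
TIMINGS = {
    0: "4:15", 8: "4:45", 16: "4:41", 64: "4:38", 96: "4:34", 128: "3:51",
    192: "4:08", 256: "2:33", 384: "2:33", 511: "2:31", 512: "2:32", 1024: "2:34",
}


def to_seconds(mmss):
    """Convert a 'm:ss' string into a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def smallest_setting_near_best(timings, tolerance=0.02):
    """Smallest RAM setting whose time is within `tolerance` of the fastest time."""
    seconds = {mb: to_seconds(t) for mb, t in timings.items()}
    best = min(seconds.values())
    limit = best * (1 + tolerance)
    return min(mb for mb, secs in seconds.items() if secs <= limit)


print("Smallest setting within 2% of the best time:",
      smallest_setting_near_best(TIMINGS), "MB")
```

On these numbers the answer is 256 MB: everything from 256 MB upward sits within a few seconds of the 511 MB default, while every setting below 256 MB is much slower.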