Quad-core processor usage during editing

RHKFilm wrote on 6/30/2008, 11:38 PM
I have an Intel Q6600 and have noticed that my processors never get fully utilized. The CPU maxes out at 50% on all 4 cores when playing back full raster 1080p Cineform files, and playback does not reach realtime when set to Preview>Full.

I also tried dumping lots of color-correction effects on a video file and could not get Vegas 8 to use more than 75% of my processing power, even when that alone should have been the clear bottleneck and playback slowed to a crawl.

Why won't Vegas use all of my processor power?

-Robert

Comments

John_Cline wrote on 7/1/2008, 1:34 AM
JohnnyRoy explained this quite eloquently last April. Here's what he said:

"What people who don't understand how computers do multi-tasking don't realize is that multiple cores can only be utilized when the task has steps that can be done in parallel. Work doesn't magically use all of the cores at 100%.
Terje wrote on 7/1/2008, 4:32 AM
However, rendering is one of those things that should "easily" lend itself to parallel work. You can have four threads rendering four different parts of the movie simultaneously, for example. Let's say you are doing MPEG encoding: give each thread one picture group to render. Once a thread is finished, its group is stored in memory until it can be written to disk in the right sequence, and the thread moves on to the next group. A separate thread, which can be a 5th one since it needs fewer resources, writes the rendered memory chunks to disk.

All of this requires a lot of memory though.
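
The scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Vegas or any real encoder works: `encode_gop` and `render` are made-up names, the "encoder" just builds a byte string, and the main thread stands in for the separate writer thread.

```python
from concurrent.futures import ThreadPoolExecutor
import io


def encode_gop(gop):
    """Hypothetical stand-in for a real encoder: one picture group in, bytes out."""
    index, frames = gop
    return index, ("GOP%d:%s;" % (index, ",".join(frames))).encode()


def render(gops, out, workers=4):
    """Encode picture groups in parallel and write them to `out` in sequence."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map() yields results in submission order, so groups that
        # finish early simply wait in memory until it is their turn.
        for index, data in pool.map(encode_gop, enumerate(gops)):
            out.write(data)


buf = io.BytesIO()
render([["f0", "f1"], ["f2", "f3"], ["f4"]], buf)
print(buf.getvalue())
```

Finished groups that are waiting for their turn are what eat the memory. Note also that CPython threads only run in parallel when the work inside them releases the interpreter lock, so a real version would rely on a native codec library or separate processes.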
John_Cline wrote on 7/1/2008, 4:59 AM
Yes, rendering can be multithreaded quite easily. The original question was about playing the file, which is real-time and sequential, and that doesn't lend itself to multithreading.
TheHappyFriar wrote on 7/1/2008, 6:18 AM
Also, you're using HD. I've noticed that on MY rig, HD doesn't use 100% of all cores at all times when rendering. Looks like it's a throughput issue with my external drive. So if you can't keep data coming at the rate the CPU can crunch it, you still won't get 100% CPU usage (seems impossible to get 100% too, more like 99% is the max).
Robert W wrote on 7/1/2008, 7:16 AM
You know, I think people are tending to treat multithreaded code very lightly. This idea that it is inevitable that there will be redundancy is completely ridiculous. The reason there is redundancy comes down to the way they organise the instructions and procedures to run on multiple cores. While Vegas was early on the uptake of the potential of multithreading, its performance shows that it deals with it in a relatively primitive way.

If Vegas was actually held back by the "builders standing around a hole" theory, you would find one processor continuously maxing out while the others were below maximum. On a quad core system each processor has access to the same main memory. It is the degree to which you distribute the instruction load that dictates the amount of redundancy in each core.

Even if handled on a procedural basis, where each processor handles one procedure at a time and then moves onto the next, there should be no point where a processor is waiting idle for the next processor to pass it information it is waiting for, otherwise it makes multi-core processing completely pointless.

What I think happens with Vegas at present is that it allocates certain procedures to cores in a rather arbitrary and uncoordinated fashion. I am sure that with careful planning and optimizing for particular numbers of cores, the Vegas engines could yield much better performance.

However, it does not even need to take that level of thinking to make it that much more efficient. The original poster has a system that does not play back in full rate realtime, yet processor usage is only 50% in each core. I am simplifying the challenge here slightly, but there is a rather obvious way to get around this.

Say you are working with 1080i 50Hz Z1E streams, and with one core enabled you can play back at 14 frames per second. Sharing the load on that stream across the three other cores, you may achieve 18 frames and get better multitasking performance across the system. However, if you were to allocate the processing like this (each chunk is keyframe to keyframe):

Core 1: 1st chunk Core 2: 2nd Chunk
Core 1: 3rd chunk Core 2: 4th Chunk
Core 1: 5th chunk Core 2: 6th Chunk and so on.

That way you could achieve 25 frame playback comfortably, with two cores remaining to handle audio and the operating system.
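
As a rough sketch of the allocation above and the arithmetic behind it (the function names are made up for illustration, and the frame rate figure is an ideal upper bound that assumes chunks decode independently with no overhead):

```python
def assign_chunks(num_chunks, decode_cores):
    """Round-robin: chunk 1 to core 1, chunk 2 to core 2, then wrap around."""
    schedule = {core: [] for core in range(1, decode_cores + 1)}
    for chunk in range(1, num_chunks + 1):
        core = (chunk - 1) % decode_cores + 1
        schedule[core].append(chunk)
    return schedule


def ideal_fps(single_core_fps, decode_cores):
    """Best case only: every core decodes its own chunks at full speed."""
    return single_core_fps * decode_cores


print(assign_chunks(6, 2))  # {1: [1, 3, 5], 2: [2, 4, 6]}
print(ideal_fps(14, 2))     # 28, above the 25 needed for realtime
```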

The situation is of course more complicated than this (key frames do not all land in sync and it does not take into account compositing etc.), but the basic principles are valid in most applications. This is the kind of jump in thinking they need to make multi core processing work properly. Ideally they will start developing in-program options to allow power users to tailor the product to their requirements, and even specify how particular tasks are distributed to different processors.
JJKizak wrote on 7/1/2008, 7:24 AM
All I know with mine (default settings) (HDV) is when you push it with complexity it goes to 100% and the frames go slower. Non complex it rattles along at 50% but the frames go by in a blur. V8.0b, Q6600, Vista 64, 8 gig RAM.
JJK
rmack350 wrote on 7/1/2008, 7:37 AM
Well, on the face of it RHK is describing a disc throughput problem, or to put it another way, if he had a disc throughput problem then that's how it would manifest. On the other hand, others have reported the same thing regarding playback and core usage so I wouldn't blame it on throughput automatically.

Bluproject recently said that Dan at Cineform told him that vegas uses cineform better with high core speeds than it does with multiple cores - a high MHz Core2Duo performs better than a lower MHz Quad.

RHK, What is the file size and running time of one of your full raster 1080p Cineform files? Should be simple to figure out how much disk throughput you need and then HDTach could tell you how much you've got.
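
The calculation suggested here is just file size divided by running time, compared against the benchmark figure. A minimal sketch, where the function names, the example numbers and the 2x safety margin are all assumptions for illustration:

```python
def required_mb_per_s(file_size_bytes, running_time_s):
    """Average read rate needed to play the file in realtime (decimal MB/s)."""
    return file_size_bytes / running_time_s / 1_000_000


def has_headroom(required, measured, margin=2.0):
    """True if the benchmarked rate exceeds the need by the chosen margin."""
    return measured >= required * margin


# Made-up example: a 600 MB clip that runs for 60 seconds.
need = required_mb_per_s(600_000_000, 60)
print(need)                       # 10.0
print(has_headroom(need, 250.0))  # True
```

This is an average rate, so a timeline that reads several clips at once needs the sum of their rates, and seeking between files costs extra.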

Rob Mack
farss wrote on 7/1/2008, 7:48 AM
Unfortunately it's not that simple. I know you said that, but I think both you and Terje have overlooked one issue: temporal FX. Assuming a 15 frame GOP, you cannot begin to calculate frame 16 in core 2 until core 1 has calculated frame 15 and has any number of prior frames available. Probably the worst case is using motion blur.
As Terje did point out you also need a fair amount of RAM. Once you're in HD that starts to get significant and doubles once you go beyond 8 bits.

Bob.
Robert W wrote on 7/1/2008, 8:11 AM
The example I gave was just a microcosm of the overall approach that would apply for any process or any set of processors. Multicore programming is all about management. It is a symptom of the brute force single core legacy of PC and even Mac coding that is making it so difficult for them to catch up. They need to start looking back to systems like the Amiga, which were effectively the first multicore platforms, and code accordingly. Instead of having blokes standing around a hole, get them doing the many various tasks that need to be done in the right order, making the most efficient use of them. It is a change in mindset of the coders that is required.
RHKFilm wrote on 7/1/2008, 8:47 AM
Hi Rob,
I am playing back Cineform compressed AVIs that are 1080p / 23.976fps from a RAID that clocks in at above 250MB/s in HDTach. I believe the Cineform data rates are <10MB/s. This leads me to believe it isn't a disk throughput problem. That said, I did notice a small but significant increase in framerate when playing the files back from the RAID as opposed to a 3Gbps SATA drive.

I am able to play back Sony YUV (basically uncompressed 4:2:2) files from my RAID at 133MB/s in realtime without dropped frames.
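
For reference, the 133MB/s figure is about what the raw arithmetic gives for 1080p at 23.976fps if the files are 10-bit 4:2:2. That bit depth is an assumption, since the post does not state it; the sketch below uses the common 10-bit packing of 16 bytes per 6 pixels, and 2 bytes per pixel for 8-bit.

```python
from fractions import Fraction


def data_rate_mb_per_s(width, height, fps, bytes_per_pixel):
    """Uncompressed video data rate in decimal MB/s."""
    return float(width * height * bytes_per_pixel * fps) / 1_000_000


FPS = Fraction(24000, 1001)  # 23.976fps

# 8-bit 4:2:2 is 2 bytes per pixel; 10-bit 4:2:2 packed at 16 bytes
# per 6 pixels (assumed packing) is 8/3 bytes per pixel.
rate_8bit = data_rate_mb_per_s(1920, 1080, FPS, Fraction(2))
rate_10bit = data_rate_mb_per_s(1920, 1080, FPS, Fraction(8, 3))

print(round(rate_8bit, 1))   # 99.4
print(round(rate_10bit, 1))  # 132.6
```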

Also, even when I add effects and such I cannot get my processors to go above 75% usage which leads me to believe that the bottleneck is Vegas as opposed to my hardware.

Here is my system:
Intel Core 2 Quad Q6600 Kentsfield 2.4GHz LGA 775 Quad-Core Processor
Intel BLKD975XBX2KR LGA 775 Intel 975X ATX Intel Motherboard
CORSAIR XMS2 2GB (2 x 1GB) 240-Pin DDR2 SDRAM DDR2 800 (PC2 6400)
Areca 1220ML RAID controller
8x Seagate 7200.10 320GB HD's in RAID 6
1x Samsung 300GB system drive
PNY Quadro FX1500 w/ performance drivers
BenQ 24" 1920x1200 FP241WZ
Windows Vista SP1 32-bit
Vegas Pro 8.0b

Am I wrong in believing that since this computer can ENCODE these Cineform files in realtime with no issues it should definitely be able to play them back on the timeline? By the way, playing back the Cineform files with WMP results in about 30% processor usage with no apparent dropped frames.

I built a new computer to get realtime full res playback and it STILL doesn't work.. it's gotta be Vegas, right?

Thanks,
Robert
TheHappyFriar wrote on 7/1/2008, 9:09 AM
I believe it is a disc throughput issue. Generated media @ HD res with HD preview runs MUCH better than using any type of video. Yeah, the CPUs aren't anywhere near 100%, but the exact same stuff done to a project of mine with just generated media is much faster than video, DV or HD.

With playing with my quad core many things run at 30fps @ auto preview quality but stutter @ full size quality. Even simple cuts can pause slightly. I'm betting it's related to the disc transfer rate because 1 or 2 FX alone on a single clip don't slow down the machine, but once I add multiple clips & what not it happens.
Robert W wrote on 7/1/2008, 9:39 AM
If it runs ok in preview mode then it indicates that it is not a disc throughput issue, as the throughput will be consistent no matter what preview mode you run in. The only thing that changes is the processor load. So if your processor remains significantly below 100% and plays back poorly in higher res modes, it indicates the tasks are not being distributed correctly.

People really need to keep in mind that all four cores have access to the same main memory space. It is not like any one plugin or task has to run exclusively on any one processor. And even if the architecture dictates that individual procedures have to be run on one core, there will be thousands of procedures being performed concurrently in any one operation.

It is easy to slap "Multithreading" labels on software, but it is not something you can switch on or off or automatically optimize for in a compiler. It requires a whole shift in thinking. I think at the moment the Vegas implementation of multi core threading is not an appalling start, and indeed works fairly well on the early hyperthreading Prescott platforms, but it needs to be progressed to take advantage of the current high end machines. You only have to look at the hidden menu and see that you need to specify by hand that it is to make use of Dual Quad processor arrangements to see something is not quite up to speed in Vegas 8.0.

Incidentally, all this talk of Vegas 9, I don't know about you folks but I think they should not think about that until they have ironed out all of the bugs. I feel a Vegas 8.5 is due as a free upgrade because this platform has never been solid for me.
rmack350 wrote on 7/1/2008, 10:21 AM
Yep, seems like it's got to be Vegas. Can Vegas play back the cineform footage in real time in the trimmer? Trimmer has very little overhead compared to the timeline.

Rob
rmack350 wrote on 7/1/2008, 10:34 AM
Yes but Robert says he's got lots of throughput.

It's not always a sure thing. We went through a few arrays with our 844x system and many of them would benchmark fine but 844x wouldn't get the same throughput. These ranged from something like 8-16 disk arrays. Eventually we settled on an array that would work, and then a year later we stopped banging our heads against 844x and started banging our heads against PPro/Axio.

Don't know what to say about this. The topic has come up so many times that all I could possibly do is repeat ideas that have already bombed out.

I'd probably go into task manager and change Vegas processor affinity to just one, then two, and see if that changes anything.
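
Task Manager does this with checkboxes, but the same restriction can be written as a bitmask, which is what the Windows `start /affinity` switch takes in hexadecimal. A small sketch for working out the mask for one, two or four cores (`affinity_mask` is a made-up helper name):

```python
def affinity_mask(cores):
    """Bitmask with bit n set for each zero-based core index n."""
    mask = 0
    for core in cores:
        if core < 0:
            raise ValueError("core index must be non-negative")
        mask |= 1 << core
    return mask


for count in (1, 2, 4):
    mask = affinity_mask(range(count))
    print(count, "core(s): mask", format(mask, "X"))
```

For the first two cores the mask is 3, so the test could be started from a command prompt with `start /affinity 3` followed by the program to run.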

Rob
TheHappyFriar wrote on 7/1/2008, 10:35 AM
If it runs ok in preview mode then it indicates that it is not a disc throughput issue as the throughput will be consistent no matter what preview mode you run in. The only thing that changes is the processor load

not true. like I said, I can run things just fine on preview - auto but then bump up to full & it chugs. My disc light starts blinking like mad & the preview keeps slowing down.

try it yourself: http://sterlingshield.net/home/steve/vegassite/templates/3d track motion.veg

just replace the missing media with various HD & DV clips on your drive (6 total). I used 3 off my external drive & 3 off my SATA drive. I had the generated media part running 29.97fps when on preview - full setting. The part with the clips starts @ 10 and levels off at ~6 or so fps. Using preview - half my fps was always above 10.
Robert W wrote on 7/1/2008, 10:48 AM
I'm not sure that would work out to be a very controlled test.

Again, this seems to indicate to me a processor issue rather than a disc throughput issue. Whatever happens, Vegas has to base its playback on the same data on the disc. It is (to the best of my knowledge) not like it loads less data in preview mode. There could be an impact if the processor load of drawing data from the external drive causes contention with the extra load of playback at a higher resolution.

The other vague possibility is that the higher res playback has higher memory requirements and that may reduce the amount Vegas has available to cache data from all drives, particularly external drives.
TheHappyFriar wrote on 7/1/2008, 11:35 AM
I'm not sure that would work out to be a very controlled test.

It's a real world test I used. I wanted to do that to some footage, that's why I used it.