Comments

jwcarney wrote on 8/23/2010, 10:59 AM
Octane Render is another purpose-built app that uses CUDA exclusively.
It is an unbiased renderer that can do in minutes what used to take several hours.
rmack350 wrote on 8/23/2010, 2:31 PM
Bob said: "Of the 16 PCIe bus lanes, 15 are write and only one is read."

Hey Bob, that's new to me. I thought one of the big deals about that 16x PCIe interface was that it was a lot more bidirectional than AGP. Do you have some sources? I'm not challenging you on it, it's just the sort of background knowledge I need for work anyway so I'd love to read something.

I suspect what's really going on is that, while the 16x slot is capable of a lot more bidirectional traffic, the cards were (and maybe still are) based on AGP designs which really didn't provide that much bandwidth back to the CPU.

When you think about it, there's still not much reason for a consumer graphics card to support much return-trip bandwidth. For most consumer needs a single lane would be enough. Video-wise, if you can run an HD display from a PCIe x1 ION board then you obviously could get by with just one return lane from the GFX card. It's still coming back faster than your hard drive can record it.
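A rough back-of-envelope check of the single-lane claim, as a sketch rather than a measurement. The per-lane figures are the usual theoretical numbers after 8b/10b encoding (250 MB/s per direction for PCIe 1.x, 500 MB/s for PCIe 2.0); the pixel formats and the 100 MB/s hard drive speed are assumptions picked for illustration.

```python
# Back-of-envelope check: does uncompressed HD video fit in one PCIe lane?

# Theoretical bandwidth per lane, per direction, after 8b/10b encoding overhead.
PCIE_LANE_MB_S = {"PCIe 1.x": 250.0, "PCIe 2.0": 500.0}


def video_rate_mb_s(width, height, bytes_per_pixel, fps):
    """Uncompressed video data rate in MB/s (1 MB = 10**6 bytes)."""
    return width * height * bytes_per_pixel * fps / 1e6


# Assumed formats: 8-bit 4:2:2 is 2 bytes per pixel, 8-bit RGBA is 4.
formats = {
    "1080p30 4:2:2": video_rate_mb_s(1920, 1080, 2, 30),
    "1080p30 RGBA": video_rate_mb_s(1920, 1080, 4, 30),
}

ASSUMED_DISK_MB_S = 100.0  # rough guess for a single 2010-era hard drive

for name, rate in formats.items():
    for gen, lane in PCIE_LANE_MB_S.items():
        print(f"{name}: {rate:.1f} MB/s = {rate / lane:.0%} of one {gen} lane")
    print(f"  vs. assumed disk: {rate / ASSUMED_DISK_MB_S:.1f}x its write speed")
```

Under these assumptions even uncompressed 1080p30 RGBA squeezes into one PCIe 1.x lane on paper, though real-world throughput is lower than the theoretical figure.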

Rob
megabit wrote on 8/23/2010, 3:12 PM
Speaking of consumer vs. professional needs: one very important aspect of how the GPU cores are used by a CUDA-supporting application, related to the card's "read" and "write" capability, is how much of the job can be done by the GPU "in place", i.e. using the graphics card's own memory.

I tried several video applications using CUDA, and the amount of memory on the card didn't seem to be of much importance. On the other hand, one of the first commercial applications to use the power of the GPU (the FEA-based Moldflow system) will only harness its cores if they have access to enough on-card memory. Depending on the model size, gigabytes of memory may be needed!
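To put rough numbers on that difference, here is a small sketch. The FEA model size, the nonzeros-per-row figure, and the CSR sparse layout are all hypothetical values chosen for illustration; nothing here describes how Moldflow actually stores its data.

```python
# Rough memory footprints: one video frame vs. a large sparse FEA matrix.

MIB = 1024 ** 2
GIB = 1024 ** 3


def frame_bytes(width, height, channels=4, bytes_per_channel=4):
    """One uncompressed frame; default is RGBA at 32-bit float per channel."""
    return width * height * channels * bytes_per_channel


def sparse_matrix_bytes(rows, nonzeros_per_row, value_bytes=8, index_bytes=4):
    """CSR storage: a value and a column index per nonzero, plus row pointers."""
    nnz = rows * nonzeros_per_row
    return nnz * (value_bytes + index_bytes) + (rows + 1) * index_bytes


frame = frame_bytes(1920, 1080)
# Hypothetical model: 5 million nodes, 3 unknowns each, ~80 nonzeros per row.
fea = sparse_matrix_bytes(rows=5_000_000 * 3, nonzeros_per_row=80)

print(f"one 1080p float RGBA frame: {frame / MIB:.1f} MiB")
print(f"hypothetical FEA matrix:    {fea / GIB:.1f} GiB")
```

A handful of frames fits easily on any card, while a model of this assumed size would not fit on a consumer card at all, which would be consistent with the behaviour described above.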

I don't think hardware H.264 encoding uses graphics card memory at all, but I may be mistaken.

Piotr

AMD TR 2990WX CPU | MSI X399 CARBON AC | 64GB RAM@XMP2933  | 2x RTX 2080Ti GPU | 4x 3TB WD Black RAID0 media drive | 3x 1TB NVMe RAID0 cache drive | SSD SATA system drive | AX1600i PSU | Decklink 12G Extreme | Samsung UHD reference monitor (calibrated)

Rob Franks wrote on 8/23/2010, 5:07 PM
"Not surprising that we now have Rob Franks having a go at 'farss'."

And we now have Malcolm D having a go at Rob Franks.
If you can't control your aggression Malcolm... then... well, you know the rest.
Geoff_Wood wrote on 8/23/2010, 8:18 PM
Isn't CUDA just another 'abstraction layer' between the app and hardware, to slow things down? Sorry - apart from the basic principle, I really don't know ;-)

geoff
farss wrote on 8/24/2010, 1:28 AM
You are right, the PCIe bus lanes are all full duplex. It is how they are used on the video card itself that I was referring to.

Bob.
jabloomf1230 wrote on 8/24/2010, 11:33 AM
"Isn't CUDA just another 'abstraction layer' between the app and hardware, to slow things down? Sorry - apart from the basic principle, I really don't know ;-)"

In general, that would be true, but in the case of CUDA it usually is not. GPUs were originally designed to send pixels to the display device. Then some bright people figured out that some software did not take full advantage of the GPU, and it was sitting around twiddling its silicon thumbs while the CPU cores were number crunching. Since the GPU cores were doing very little anyway, why not write a programming interface (CUDA) that uses the GPU's limited instruction set to do CPU-like tasks?

kplo wrote on 8/24/2010, 12:17 PM
Wouldn't Vegas require a major rewrite to take true advantage of hardware acceleration and move from the ancient VFW model to DirectShow?
I see posts regarding this from time to time. Still true?
Ken
farss wrote on 8/24/2010, 3:25 PM
"Isn't CUDA just another 'abstraction layer' between the app and hardware, to slow things down?"

I believe that's correct apart from the "to slow things down" part. Yes, it is slower going through an abstraction layer than accessing the hardware directly; however, having a standard interface makes programming much easier. The guys who develop the hardware also write the abstraction code, and the guys who write the application don't need to know the specifics of the hardware design. Without that abstraction layer, every new video card or sound card could require a rewrite of some of the application code.
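A toy sketch of that idea, with made-up names throughout: `VideoDevice`, `VendorADevice`, `VendorBDevice` and `apply_effect` are hypothetical and stand in for a driver interface, two vendors' drivers, and the application. It is not how CUDA or any real driver is structured, just the general shape of an abstraction layer.

```python
from abc import ABC, abstractmethod


class VideoDevice(ABC):
    """Hypothetical standard interface that the application codes against."""

    @abstractmethod
    def brighten(self, pixels, amount):
        """Return a new list of 8-bit pixel values brightened by `amount`."""


class VendorADevice(VideoDevice):
    # Stand-in for one vendor's driver.
    def brighten(self, pixels, amount):
        return [min(255, p + amount) for p in pixels]


class VendorBDevice(VideoDevice):
    # Stand-in for another vendor: different internals, same results.
    def brighten(self, pixels, amount):
        out = list(pixels)
        for i, p in enumerate(out):
            out[i] = 255 if p + amount > 255 else p + amount
        return out


def apply_effect(device, pixels):
    """Application code: knows only the interface, not the hardware."""
    return device.brighten(pixels, 40)


frame = [0, 100, 230]
for device in (VendorADevice(), VendorBDevice()):
    print(type(device).__name__, apply_effect(device, frame))
```

The application function never changes when a new "card" is added; only a new class implementing the interface is needed, at the cost of one extra level of indirection per call.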

That's my understanding anyway :)

Bob.
Fotis_Greece wrote on 8/26/2010, 4:07 PM
Anyway, the topic went really off, but this is from me.
No CUDA in the next version, no buy.

P.S. Just got a GTX 470 to test with PP CS5. Rendered an hour-long HDV timeline to MPEG-2 Blu-ray, CBR 25 Mbps, with heavy GPU-accelerated filters and effects (fully changed the RGB curves) for testing.
Couldn't believe my eyes, it took 26 minutes!!
(i7 920, Windows 7 64-bit, 9 GB RAM)
zstevek wrote on 8/26/2010, 5:54 PM
That's a fast card!

I just got a GTX 465 and am pleased with the speed. It does run very hot though. What temps does your 470 reach when your GPU is number crunching?

I run folding@home (24/7) which is heavy on the GPU and my temp ranges from 70C to 77C.
Fotis_Greece wrote on 8/27/2010, 9:46 AM
Up to 85 or even 91 Celsius when rendering with Mercury Playback Engine.