I’d like to ask a 1-minute favor of someone on the forum who has a new MacBook Pro with the dual GPUs.
Could you run the SDK’s BandwidthTest application and post the results?
I am very interested in the transfer speeds for both the 9400M and 9600M. I wonder if the 9400M may actually be faster for transfers because of its embedded design!
Thanks very much if you could help me out!
Edit: Sigh, I would have to make a typo in the topic title, and it doesn’t look like you can edit those. So please enjoy teasing me about my misplaced interest in Machbooks and not MacBooks.
OK, I’ve attached a couple of files giving the bandwidth for both devices (shmoo mode) with pinned memory. I haven’t had a chance to study them, and I’d be interested in seeing some graphs (and any variations). FYI, this is a 17" MacBook Pro with the new firmware update applied and CUDA 2.0 for OS X; it will be interesting to see if 2.2 brings any speedups. BTW, with double the number of cores and nearly double the clock rate, my money is on the 9600M for doing work.
Attachments: test1.txt (4.77 KB), test0.txt (4.84 KB)
The limiter is the PCIe bus, not the CPU or RAM. Remember, the trick with a GPU is that once you have the job on the device, you keep it there. You can upload new kernels, but keep the data on the device: it’s the internals of the GPU that do the fast stuff. And that’s just where the story starts. Check out the CUDA lectures online; the device itself has a fairly complex memory architecture that you really must understand to do any kernel development. You’ll soon forget about that PCIe bus. Check out the device-to-device bandwidth for comparison. If you intend to rely on host-device round trips, you’ll be missing the trick here. Have fun! :)
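If anyone wants to see that comparison without digging through the SDK sample, here is a rough sketch of the idea. This is not the SDK's BandwidthTest source, just a minimal stand-in that uses standard CUDA runtime calls. The 16 MB buffer size, the repeat count, and the helper name `timed_copy_ms` are my own assumptions; pick whatever suits your card. It times pinned host-to-device, device-to-host, and device-to-device copies on the device given on the command line.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                              \
    do {                                                         \
        cudaError_t err_ = (call);                               \
        if (err_ != cudaSuccess) {                               \
            fprintf(stderr, "%s failed: %s\n", #call,            \
                    cudaGetErrorString(err_));                   \
            exit(1);                                             \
        }                                                        \
    } while (0)

// Hypothetical helper: average time in ms for one cudaMemcpy of `bytes`.
static float timed_copy_ms(void* dst, const void* src, size_t bytes,
                           cudaMemcpyKind kind, int reps)
{
    cudaEvent_t start, stop;
    CHECK(cudaEventCreate(&start));
    CHECK(cudaEventCreate(&stop));

    CHECK(cudaEventRecord(start, 0));
    for (int i = 0; i < reps; ++i)
        CHECK(cudaMemcpy(dst, src, bytes, kind));
    CHECK(cudaEventRecord(stop, 0));
    CHECK(cudaEventSynchronize(stop));   // wait until all copies finished

    float ms = 0.0f;
    CHECK(cudaEventElapsedTime(&ms, start, stop));
    CHECK(cudaEventDestroy(start));
    CHECK(cudaEventDestroy(stop));
    return ms / reps;
}

// Bytes copied per second, expressed in MB/s (1 MB = 2^20 bytes).
static double mb_per_s(size_t bytes, float ms)
{
    return (bytes / (1024.0 * 1024.0)) / (ms * 1e-3);
}

int main(int argc, char** argv)
{
    const int    device = (argc > 1) ? atoi(argv[1]) : 0;
    const size_t bytes  = 16u << 20;   // 16 MB, an arbitrary choice
    const int    reps   = 10;          // arbitrary repeat count

    CHECK(cudaSetDevice(device));
    cudaDeviceProp prop;
    CHECK(cudaGetDeviceProperties(&prop, device));
    printf("Device %d: %s\n", device, prop.name);

    void *h_buf, *d_a, *d_b;
    CHECK(cudaMallocHost(&h_buf, bytes));   // pinned (page-locked) host memory
    CHECK(cudaMalloc(&d_a, bytes));
    CHECK(cudaMalloc(&d_b, bytes));
    memset(h_buf, 0, bytes);

    float h2d = timed_copy_ms(d_a, h_buf, bytes, cudaMemcpyHostToDevice, reps);
    float d2h = timed_copy_ms(h_buf, d_a, bytes, cudaMemcpyDeviceToHost, reps);
    float d2d = timed_copy_ms(d_b, d_a, bytes, cudaMemcpyDeviceToDevice, reps);

    printf("Host to device:   %.1f MB/s\n", mb_per_s(bytes, h2d));
    printf("Device to host:   %.1f MB/s\n", mb_per_s(bytes, d2h));
    printf("Device to device: %.1f MB/s\n", mb_per_s(bytes, d2d));

    CHECK(cudaFree(d_a));
    CHECK(cudaFree(d_b));
    CHECK(cudaFreeHost(h_buf));
    return 0;
}
```

Build it with nvcc and pass 0 or 1 as the argument to pick the GPU. Which index maps to the 9400M and which to the 9600M depends on the machine, so go by the printed device name. The device-to-device figure here counts each byte once, so it may not line up exactly with what the SDK tool reports.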