CUDA 3.2 RC2

CUDA 3.2 RC2 is now available:

http://developer.nvidia.com/object/cuda_3_2_toolkit_rc.html

Any improvement in “H.264 Encoding Performance”? (Especially in compression rather than speed.)

Cool, this certainly proves that this is not going to be a rushed release (the 3.0 and 3.1 releases each had just one beta version prior to release)

Any word on why the version number of the CUDA 3.2 Linux dev driver has actually decreased between RC and RC2? Usually you’d expect the opposite ;)

RC:

devdriver_3.2_linux_32_260.24.run

RC2:

devdriver_3.2_linux_32_260.19.14.run

With RC2 on OS X 10.6.4, I’m seeing the error about the Driver and Runtime version mismatch that other people reported previously. (Didn’t try RC1 on my laptop…) The driver version in the CUDA preference panel is 3.2.13, GPU driver version 1.6.18.18 (19.5.9f02).

Edit: I’m using the current model MacBook Pro 13" with a GeForce 320M GPU. (This is the model that only has the external GPU.)
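For anyone trying to narrow down this kind of mismatch: `cudaDriverGetVersion` and `cudaRuntimeGetVersion` report versions as integers of the form 1000 * major + 10 * minor, so 3020 means 3.2. The sketch below only shows how those integers decode and compare. The two values in it are hypothetical, not readings from the machine described above, and the helper names are made up for illustration.

```python
def decode_cuda_version(value: int) -> tuple:
    """Decode the integer form 1000 * major + 10 * minor into (major, minor)."""
    return value // 1000, (value % 1000) // 10


def driver_supports_runtime(driver_version: int, runtime_version: int) -> bool:
    """True if the driver reports the same or a newer CUDA version than the runtime."""
    return decode_cuda_version(driver_version) >= decode_cuda_version(runtime_version)


# Hypothetical values for illustration only (assumption, not measured data).
driver_version = 3010   # a driver reporting CUDA 3.1
runtime_version = 3020  # a runtime built for CUDA 3.2

print("driver: ", decode_cuda_version(driver_version))
print("runtime:", decode_cuda_version(runtime_version))
print("driver new enough:", driver_supports_runtime(driver_version, runtime_version))
```

If the driver side decodes to something older than the runtime side, a version mismatch error is the expected symptom; if both decode to 3.2, the cause is probably elsewhere.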

My problem was fixed by regenerating the ptx files.

I received some linking errors at first, which were fixed by adding the -m64 flag to nvcc. According to the nvcc documentation, that should only be necessary on Linux, but it seems it is now also necessary on Windows 7.

I’m disappointed to see that this update hasn’t done anything to improve performance (of texture sampling in particular) on the GTX 460. Has anyone managed to get over 18 GTexels/s yet?

Glad to hear you’re liking this new process; it does seem to be helping. Feedback welcome.

It’s simply because of the way our multiple driver code branches are structured. In this case, the new driver was cut from a sub-branch that forked off sometime prior to the RC1 driver build. Version numbers generally increase over time, but in the short term, a higher version number does not necessarily make it newer. (260.24 was built on 2010/09/09; 260.19.14 was built on 2010/10/18.)

–Cliff
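The point about version numbers versus build dates can be checked with a small sketch. It uses only the two version strings and build dates quoted above; the `version_key` helper is a made-up name, and comparing dotted versions as integer tuples is just one reasonable reading of "higher version number".

```python
from datetime import date

# Version strings and build dates as quoted in the post above.
builds = {
    "260.24": date(2010, 9, 9),        # RC driver
    "260.19.14": date(2010, 10, 18),   # RC2 driver
}


def version_key(version: str) -> tuple:
    """Split a dotted version string into a tuple of ints for comparison."""
    return tuple(int(part) for part in version.split("."))


# Ordering by version number and ordering by build date disagree here.
newest_by_number = max(builds, key=version_key)
newest_by_date = max(builds, key=builds.get)

print("Highest version number:", newest_by_number)
print("Most recent build:     ", newest_by_date)
```

Sorting by number picks the RC driver, while sorting by build date picks the RC2 driver, which matches the branch explanation.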

Thanks for the update. You had me a bit worried there at first, since MATLAB R2010b actually has its own copy of the CUDA Runtime libraries built into it – and they’re version 3.1, not 3.2.anything. :) Your explanation does make sense, though. I’ll check on the -m64 documentation.

–Cliff

I was also worried :) The need to make a new ptx file was unexpected for me, but hey, it’s the first version of the toolbox.

Just out of curiosity (I can also check with MathWorks): as far as I understand, the toolbox requires the 3.1 toolkit or later to be installed, so I would be surprised if they have their own copy built into it (it would be a strange requirement if they did). Also, as far as I understand it, they use ptx JIT compilation, so it has to be the driver API they use. Or do the runtime libraries include the driver API? (I thought that using the driver API only required a supporting driver to be available.)

I see I removed my other comment, so here it is again: calling a kernel from a .ptx file in MATLAB is very nice, and can also be very useful for benchmarking a kernel as a function of grid and block size, apart from the nice testing against MATLAB.

Is there somewhere I can find a change list for 3.2 RC2 compared to 3.2 RC1? I looked at the release notes, but they look basically the same as the RC1 release notes. We have done product testing on RC1 and I need to know what has changed.

-Thanks, Derek
