Does anyone know the compute capability of the GTX260M?

Confusingly (for its name), it is a compute capability 1.1 part.

Even more confusingly, the newer GTS260M is a compute capability 1.2 part. It’s significantly superior: it draws much less power, and the 1.2 capability gives it the doubled register count, better coalescing, and zero-copy benefits of G200, missing only the double-precision support of capability 1.3.

A similar part is the non-mobile GT240, which turns out to be a pretty nice low-power (again in watts) board… it’s great for filling a single width slot, and needs no external power connectors.

The GTS260M also has GDDR5 memory.

I thought it was 1.1 as well, but I got one a couple of weeks ago in a new ASUS laptop and it seems to be running as compute capability 1.0. I printed the __CUDA_ARCH__ define from C code and it is 100. I also tried to use the atomicAdd operation, and it would not even compile (because __CUDA_ARCH__ was too low). Perhaps there are some new drivers on the way, but when I set up my new machine I thought I had pulled all the latest drivers. If you have any hints on how to use the 1.1 capability, let us all know!

This whole thing started when I was looking for a new laptop as a “Nexus Companion” to my CC1.3 desktop.

Since I am using "double"s in my kernels, I wanted to make sure that this would not be an issue when debugging via Nexus. Unfortunately I could not find any Mobile chips that are CC1.3. I am not even sure that this is an issue but do not want to take any chances.

Your problem is unrelated to your GPU… it’s just about your toolkit and compiler.

You can compile CUDA apps for any compute level with any CUDA installation… the GPU is not involved in compilation.

It’s likely just a makefile/define/include file issue… make sure you’re passing nvcc the right flags, such as -arch sm_13.
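To make that concrete, here is a minimal sketch (an illustration only, not taken from the toolkit samples) of how __CUDA_ARCH__ reflects the compile target rather than the installed GPU. The names HOST_DEVICE, compiled_arch and has_global_atomics are made up for this example. It is written so that it also builds as plain C++, in which case only the host path is compiled.

```cpp
// Under nvcc, __CUDACC__ is defined and functions can be marked for
// both host and device; a plain C++ compiler sees no qualifiers.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Returns the architecture this code path was compiled for.
// __CUDA_ARCH__ is only defined while nvcc compiles device code
// (100 for sm_10, 110 for sm_11, 130 for sm_13); host code gets 0 here.
HOST_DEVICE inline int compiled_arch() {
#ifdef __CUDA_ARCH__
    return __CUDA_ARCH__;
#else
    return 0;
#endif
}

// Hypothetical helper: true when this code path targets compute
// capability 1.1 or higher, where 32-bit global-memory atomics exist.
HOST_DEVICE inline bool has_global_atomics() {
    return compiled_arch() >= 110;
}
```

Called from a kernel built with -arch sm_10, compiled_arch() should return 100 no matter which GPU is in the machine; rebuilding with -arch sm_11 should give 110, which is the level at which atomicAdd on global memory becomes available.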

Compute capability is 1.3 - I have two of them in the PC I am typing this from.

– edited: Sorry, I missed the M at the end. I have GTX260 - without the M.

Actually, after I thought about the prior comments I realized the same thing and tried it today: the default project files just compile to sm_10. I flipped to sm_11 and my code runs on the GTX260M. I am now using one of the 1.1 features I wanted to use (atomic functions) and it seems to run okay.

Now, sm_13 and double precision, I don’t think that will work on a GTX260M.

Has anyone thought about just making (or knows of?) a software implementation of double precision arithmetic for the non-DP chips? I know it would be slow, but I think a lot of regular CPUs did DP in microcode “back in the day”. It would be good for testing. You wouldn’t necessarily need the full complement of fp operations, maybe just the simplest IEEE ones to start: add, mult, div … I think I will make this a separate post query as well.

Yes, in the pre-double precision days, a number of people used the dsfun90 library as a template for emulating double precision operations with single precision instructions. (If I remember correctly, double precision addition takes something like 17 instructions. Other operations are far slower.)
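For anyone who has not seen the technique, here is a rough sketch of the idea for addition only. It is a simplified variant written for illustration, not the dsfun90 routine itself, and the names FloatPair, ff_from_double, ff_to_double, two_sum and ff_add are made up here. A value is stored as an unevaluated sum of two floats (hi + lo), and the rounding error of each float addition is recovered and carried in lo. It assumes the compiler does not reassociate floating-point operations (no fast-math) and that float expressions are evaluated in float precision.

```cpp
// Under nvcc the functions can run on both host and device;
// a plain C++ compiler sees no qualifiers.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// A number represented as hi + lo, with |lo| much smaller than |hi|.
struct FloatPair {
    float hi;
    float lo;
};

// Split a double into a leading float and a float remainder.
HOST_DEVICE inline FloatPair ff_from_double(double d) {
    FloatPair r;
    r.hi = static_cast<float>(d);
    r.lo = static_cast<float>(d - static_cast<double>(r.hi));
    return r;
}

HOST_DEVICE inline double ff_to_double(FloatPair a) {
    return static_cast<double>(a.hi) + static_cast<double>(a.lo);
}

// Knuth's two-sum: s is the rounded float sum of a and b,
// e is the rounding error, so that s + e equals a + b exactly.
HOST_DEVICE inline void two_sum(float a, float b, float &s, float &e) {
    s = a + b;
    float bb = s - a;
    e = (a - (s - bb)) + (b - bb);
}

// Add two float pairs: sum the high parts exactly, fold the low
// parts into the error term, then renormalize into (hi, lo).
HOST_DEVICE inline FloatPair ff_add(FloatPair a, FloatPair b) {
    float s, e;
    two_sum(a.hi, b.hi, s, e);
    e += a.lo + b.lo;
    FloatPair r;
    r.hi = s + e;              // fast two-sum, valid because |s| >= |e|
    r.lo = e - (r.hi - s);
    return r;
}
```

For example, adding ff_from_double(1.0) and ff_from_double(1e-8) keeps the small term in lo, whereas a plain float addition rounds it away. Note that this gives roughly twice the precision of a float but still only the exponent range of a float, so it is not a drop-in replacement for IEEE double.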

Here is a header that was posted in the forum a few years ago implementing the basic arithmetic operations (+, -, *, /):