Bug on Windows

I’ve received a bug report from a user. He’s running my code on 64-bit Windows 7 with CUDA 4.0, driver 275.33, and a Quadro FX 380M (which supports compute capability 1.2). The kernel compiler fails with the following log:

ptxas application ptx input, line 128; error   : Instruction 'cvt' requires SM 1.3 or higher, or map_f64_to_f32 directive

ptxas application ptx input, line 129; error   : Instruction 'mov' requires SM 1.3 or higher, or map_f64_to_f32 directive

ptxas application ptx input, line 130; error   : Instruction 'div' requires SM 1.3 or higher, or map_f64_to_f32 directive

ptxas application ptx input, line 163; error   : Instruction 'cvt' requires SM 1.3 or higher, or map_f64_to_f32 directive

ptxas application ptx input, line 164; error   : Instruction 'cvt' requires SM 1.3 or higher, or map_f64_to_f32 directive

ptxas application ptx input, line 165; error   : Instruction 'cvt' requires SM 1.3 or higher, or map_f64_to_f32 directive

ptxas application ptx input, line 166; error   : Instruction 'cvt' requires SM 1.3 or higher, or map_f64_to_f32 directive

ptxas application ptx input, line 167; error   : Instruction 'mul' requires SM 1.3 or higher, or map_f64_to_f32 directive

ptxas application ptx input, line 168; error   : Instruction 'mul' requires SM 1.3 or higher, or map_f64_to_f32 directive

ptxas application ptx input, line 169; error   : Instruction 'mul' requires SM 1.3 or higher, or map_f64_to_f32 directive

ptxas application ptx input, line 170; error   : Instruction 'mul' requires SM 1.3 or higher, or map_f64_to_f32 directive

ptxas application ptx input, line 178; error   : Instruction 'cvt' requires SM 1.3 or higher, or map_f64_to_f32 directive

ptxas application ptx input, line 179; error   : Instruction 'cvt' requires SM 1.3 or higher, or map_f64_to_f32 directive

ptxas application ptx input, line 180; error   : Instruction 'cvt' requires SM 1.3 or higher, or map_f64_to_f32 directive

ptxas fatal   : Ptx assembly aborted due to errors

ptxas application ptx input, line 128; warning : Double is not supported. Demoting to float

It looks like the compiler is getting confused and generating SM 1.3 instructions for a device that only supports 1.2. It clearly realizes it shouldn’t do that, as seen from the last line: “Double is not supported. Demoting to float.” But it does it anyway.

Has anyone seen anything like this before? Any idea what I can do about it?

Peter

This will probably not help much, but doesn’t SM stand for Shader Model and aren’t you confusing that with the Compute Capability? You’re right that the Quadro FX 380M has Compute Capability 1.2, though.

I don’t believe that’s correct. CUDA generally uses the terms “compute capability” and “SM” as equivalent. For example, from section 3.1.2 of the CUDA C Programming Guide: “A cubin object is generated using the compiler option -code that specifies the targeted architecture: For example, compiling with -code=sm_13 produces binary code for devices of compute capability 1.3.” The reference to a “map_f64_to_f32 directive” also shows pretty clearly that it’s a single vs. double precision issue.
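To make the mapping concrete, here is a minimal host-side sketch of how a compute capability relates to an “sm_XY” name and to double precision support. The function names supports_double and sm_name are hypothetical helpers written for illustration, not part of the CUDA API, and the 1.3 threshold is taken from the “requires SM 1.3 or higher” wording in the ptxas log above.

```cpp
#include <string>

// Hypothetical helper: double precision (f64) instructions need
// compute capability 1.3 or higher, per the ptxas errors in the log.
bool supports_double(int major, int minor) {
    return major > 1 || (major == 1 && minor >= 3);
}

// Hypothetical helper: builds the "sm_XY" architecture name that
// corresponds to a compute capability, e.g. (1, 3) gives "sm_13".
std::string sm_name(int major, int minor) {
    return "sm_" + std::to_string(major) + std::to_string(minor);
}
```

For the Quadro FX 380M (compute capability 1.2), supports_double(1, 2) is false and sm_name(1, 2) is "sm_12", which is why any f64 instruction in the generated PTX gets rejected for that target.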

Peter

Does anyone at Nvidia actually read this forum?

Peter

The Quadro FX 380M does not support double-precision floating-point operations: http://en.wikipedia.org/wiki/CUDA

Yes, I’m well aware of that. Yet the compiler appears to be generating double precision instructions for it anyway.

Peter