Compilation problem: CUDA is broken?


I hadn’t touched my project for some months… yesterday, when I tried to rebuild it, I found that CUDA is “broken”.

I get tons of macro errors from the CUDA 2.3 compiler for things like __constant__ or __shared__ or any other CUDA-specific macro!

Some examples, tons of these:

They happen on the same lines as these:

and other random stub.c errors:

The project worked perfectly with CUDA 2.2; now I’m compiling with CUDA 2.3.

I reinstalled every SDK and even rebuilt the project from the ground up… but I can’t figure out what is wrong with the CUDA macros.

Any ideas??

Up… can’t solve this!

Can you post some self-contained code that reproduces these errors?

Did you switch your NVIDIA driver to version 190.*?
CUDA 2.3 doesn’t work with driver 185.* and earlier.

I would like to, but unfortunately ANY variable defined as __shared__ or __constant__ produces the exact same errors… I tried to exclude as much code as possible, but I noticed that if there is even ONE “normal” error (like symbol not found, etc.), these errors won’t show up.

So I would have to build a working but stripped-down version of this, which is really annoying.

Yeah, I run 190.62 WHQL.

Maybe there are some conflicts with the old CUDA 2.2? I uninstalled it, but I was previously using CUDA x86 on my x64 Windows… I noticed that while CUDA 2.2 had a version-specific SDK, CUDA 2.3 contains both x86 and x64.

Could this be a problem, given that I am trying to use the x86 one on x64?


This code:

struct __align__(16) cudaMatrix4
{
	float	// element type was lost in the post; float is assumed here
			a11, a12, a13, a14,
			a21, a22, a23, a24,
			a31, a32, a33, a34,
			a41, a42, a43, a44;
};

__constant__ cudaMatrix4 worldViewProj;
__constant__ cudaMatrix4 worldViewProjIT;
__constant__ cudaMatrix4 world;
__constant__ cudaMatrix4 worldIT;
__constant__ cudaMatrix4 worldView;
__constant__ cudaMatrix4 worldViewIT;

__global__ void vertexKernel

causes errors at the __constant__ definitions… and there’s almost nothing else in the .cu file.

Still no clue! :wacko:

#include <cutil_inline.h>
#include <cutil.h>

On recompiling a kernel that works with CUDA 3.1 using a newly installed CUDA 4.0, I get: error: macro "__loc_sc__" passed 3 arguments, but takes just 2


extern __shared__ unsigned int shared_array[];



unsigned int Constant[15*1024]; //leave 1kwords free for other constants

I think this is because I had told nvcc (via -I) to use the CUDA 3.1 include files.

The compilation errors go away when -I/usr/local/cuda/SDK/C/common/inc/ is used instead.

After that, the first call to cudaMalloc failed with

main.cpp(440) : cudaSafeCall() Runtime API error 10200: unspecified driver error.

until the linker (g++) was told (via -L) to build the image against the CUDA 4.0 libraries.


PS: with the new driver, the C2050 is approx. 25 ms faster.

You didn’t mention upgrading the driver. Did you do that?

I’d be really surprised if mixing include files from different CUDA versions worked. I certainly wouldn’t call CUDA broken if it didn’t.

I think the nvidia driver was updated to 2.6.18-164.15.1.el5

(ie /sbin/modinfo nvidia


vermagic:       2.6.18-164.15.1.el5 SMP mod_unload gcc-4.1

We are running 64-bit CentOS.)


Sure. The point of my posting was to record my error, in case anyone else
hits something similar.