Good old "-lcudart not found": can't build examples

When I try to build one of the SDK examples, I get:

$ ~/NVIDIA_CUDA_SDK/projects/deviceQuery$ make

/usr/bin/ld: cannot find -lcudart

collect2: ld returned 1 exit status

make: *** […/…/bin/linux/release/deviceQuery] Error 1

I’m on a 64-bit Ubuntu (Lucid) install. I think I’ve got the environment variables configured correctly:

$ echo $LD_LIBRARY_PATH

$ echo $PATH

lib64 has what I expect in it:

$ ls /usr/local/cuda/lib64

libcublas.so libcudart.so libcudart.so.3.2.12 libcufft.so.3.2.12 libcurand.so.3.2.12 libcusparse.so.3.2.12

libcublas.so.3 libcudart.so.2 libcufft.so libcurand.so libcusparse.so

libcublas.so.3.2.12 libcudart.so.3 libcufft.so.3 libcurand.so.3 libcusparse.so.3

hope the solution is obvious…

LD_LIBRARY_PATH has no effect on how the linker resolves -lcudart at build time; it only controls how shared libraries are located at runtime. Check the value of CUDA_INSTALL_PATH defined in C/common/common.mk. You can change it in that file or just do

CUDA_INSTALL_PATH=/usr/local/cuda make

and it should probably work.

No success:

:~/NVIDIA_CUDA_SDK/projects/deviceQuery$ CUDA_INSTALL_PATH=/usr/local/cuda make

/usr/bin/ld: cannot find -lcudart

collect2: ld returned 1 exit status

make: *** […/…/bin/linux/release/deviceQuery] Error 1

No ideas?

OK. Well, I’m still stumped on this; it’s a little bit beyond my debugging and Linux capabilities. Even a suggestion as to which man pages to read would be helpful at this point. Really, anything.

Try compiling directly from the command line inside the deviceQuery source directory like this:

nvcc -I../../../shared/inc -L../../../shared/lib/ deviceQuery.cpp -lshrutil_x86_64 -o deviceQuery

You should see something like this (using the 3.1 SDK on 64-bit Ubuntu 9.04):

avidday@cuda:/opt/cuda-3.1/sdk/C/src/deviceQuery$ nvcc -I../../../shared/inc -L../../../shared/lib/ deviceQuery.cpp -lshrutil_x86_64 -o deviceQuery

deviceQuery.cpp:126:11: warning: extra tokens at end of #else directive

deviceQuery.cpp:135:11: warning: extra tokens at end of #else directive

avidday@cuda:/opt/cuda-3.1/sdk/C/src/deviceQuery$ ./deviceQuery 

./deviceQuery Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

There are 2 devices supporting CUDA

Device 0: "GeForce GTX 470"

  CUDA Driver Version:                           3.20

  CUDA Runtime Version:                          3.10

  CUDA Capability Major revision number:         2

  CUDA Capability Minor revision number:         0

  Total amount of global memory:                 1341849600 bytes

  Number of multiprocessors:                     14

  Number of cores:                               448

  Total amount of constant memory:               65536 bytes

  Total amount of shared memory per block:       49152 bytes

  Total number of registers available per block: 32768

  Warp size:                                     32

  Maximum number of threads per block:           1024

  Maximum sizes of each dimension of a block:    1024 x 1024 x 64

  Maximum sizes of each dimension of a grid:     65535 x 65535 x 1

  Maximum memory pitch:                          2147483647 bytes

  Texture alignment:                             512 bytes

  Clock rate:                                    1.22 GHz

  Concurrent copy and execution:                 Yes

  Run time limit on kernels:                     No

  Integrated:                                    No

  Support host page-locked memory mapping:       Yes

  Compute mode:                                  Default (multiple host threads can use this device simultaneously)

  Concurrent kernel execution:                   Yes

  Device has ECC support enabled:                No

Device 1: "GeForce GTX 275"

  CUDA Driver Version:                           3.20

  CUDA Runtime Version:                          3.10

  CUDA Capability Major revision number:         1

  CUDA Capability Minor revision number:         3

  Total amount of global memory:                 938803200 bytes

  Number of multiprocessors:                     30

  Number of cores:                               240

  Total amount of constant memory:               65536 bytes

  Total amount of shared memory per block:       16384 bytes

  Total number of registers available per block: 16384

  Warp size:                                     32

  Maximum number of threads per block:           512

  Maximum sizes of each dimension of a block:    512 x 512 x 64

  Maximum sizes of each dimension of a grid:     65535 x 65535 x 1

  Maximum memory pitch:                          2147483647 bytes

  Texture alignment:                             256 bytes

  Clock rate:                                    1.46 GHz

  Concurrent copy and execution:                 Yes

  Run time limit on kernels:                     Yes

  Integrated:                                    No

  Support host page-locked memory mapping:       Yes

  Compute mode:                                  Prohibited (no host thread can use this device)

  Concurrent kernel execution:                   No

  Device has ECC support enabled:                No

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 3.20, CUDA Runtime Version = 3.10, NumDevs = 2, Device = GeForce GTX 470, Device = GeForce GTX 275

PASSED

If that doesn’t work, then something is broken with your SDK installation. If it does, then the common.mk file in your SDK needs some changes.
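
If the direct nvcc command fails with a "cannot find -l..." error, it may help to check whether the library file actually exists in the directories passed with -L. The following is only a sketch: find_lib is a hypothetical helper, not part of the SDK or the toolkit. It mimics the basic rule that -lNAME looks for libNAME.so or libNAME.a, and it ignores the linker's default search directories.

```shell
# Hypothetical helper: report where "-lNAME" would be satisfied among the
# given directories, using the usual libNAME.so / libNAME.a naming rule.
find_lib() {
    name=$1
    shift
    for dir in "$@"; do
        for ext in so a; do
            if [ -e "$dir/lib$name.$ext" ]; then
                echo "$dir/lib$name.$ext"
                return 0
            fi
        done
    done
    echo "lib$name.so / lib$name.a not found in: $*" >&2
    return 1
}

# Example calls, using directories mentioned in this thread:
#   find_lib cudart /usr/local/cuda/lib64
#   find_lib shrutil_x86_64 ../../../shared/lib
```

If the helper finds nothing, either the -L path does not match your SDK layout or the library was never built.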

Thanks!

That is helpful. In attempting to run that command, I realized that my SDK directory is laid out differently: the source isn’t in a separate “C” directory, and it’s “common”, not “shared”. I realize now that I’m using an older project directory, while this machine must be set up with a more recent installation. I therefore assume that all the makefiles in my projects folder are slightly wrong for the current version.

So, thanks again for the help. By the way, this is what happens when I run a variation of your suggestion:

$ nvcc -I../../common/inc -L../../common/lib/ deviceQuery.cpp -lshrutil_x86_64 -o deviceQuery

deviceQuery.cpp: In function ‘int main(int, char**)’:

deviceQuery.cpp:109: warning: format ‘%u’ expects type ‘unsigned int’, but argument 2 has type ‘size_t’

deviceQuery.cpp:114: warning: format ‘%u’ expects type ‘unsigned int’, but argument 2 has type ‘size_t’

deviceQuery.cpp:115: warning: format ‘%u’ expects type ‘unsigned int’, but argument 2 has type ‘size_t’

deviceQuery.cpp:127: warning: format ‘%u’ expects type ‘unsigned int’, but argument 2 has type ‘size_t’

deviceQuery.cpp:128: warning: format ‘%u’ expects type ‘unsigned int’, but argument 2 has type ‘size_t’

/usr/bin/ld: cannot find -lshrutil_x86_64

collect2: ld returned 1 exit status

Let me know if there is anything else obviously wrong. For now, I will try to track down which version of CUDA is installed and build an example from that version to see if things are working.
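
One possible way to track down the installed version, sketched under assumptions: the fully versioned runtime library name in lib64 (libcudart.so.3.2.12 in the listing above) carries a version suffix, which suggests, but does not prove, which toolkit release is installed. cudart_version is a hypothetical helper written for this post.

```shell
# Hypothetical helper: print the version suffix of libcudart.so.X.Y.Z
# files found in a directory (e.g. "3.2.12" for libcudart.so.3.2.12).
cudart_version() {
    dir=$1
    for f in "$dir"/libcudart.so.*.*.*; do
        [ -e "$f" ] || continue       # glob matched nothing
        base=${f##*/}                 # strip the directory part
        echo "${base#libcudart.so.}"  # strip the fixed prefix
    done
}

# Example call, using the directory from the first post:
#   cudart_version /usr/local/cuda/lib64
# If nvcc is on PATH, "nvcc --version" reports the toolkit release directly.
```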
