CUDA code not working on Linux

Hi! I am completely new to CUDA and I’m working on Debian Linux (wheezy/sid). I am not the administrator of the machine and cannot install my own applications. The administrator installed everything related to CUDA from the Debian repository:

p   boinc-nvidia-cuda    - metapackage for CUDA-savvy BOINC client and manager
p   lib32cudart4         - NVIDIA CUDA runtime library (32-bit)
i   libcuda1             - NVIDIA CUDA runtime library
i   libcuda1-ia32        - NVIDIA CUDA runtime library (32-bit)
idA libcudart3           - NVIDIA CUDA runtime library
i A libcudart4           - NVIDIA CUDA runtime library
i   nvidia-cuda-dev      - NVIDIA CUDA development files
i A nvidia-cuda-doc      - NVIDIA CUDA and OpenCL documentation
i   nvidia-cuda-gdb      - NVIDIA CUDA GDB
i   nvidia-cuda-toolkit  - NVIDIA CUDA toolkit

and did nothing more about it.

When we download precompiled code from the internet, it works great on the machine and uses the CUDA processors. But when I compile (nvcc -lcudart) a simple program of my own, such as this one (modified from the book “CUDA by Example”):

#include <stdio.h>
#include <cuda.h>
#include <cuda_runtime.h>
//#include "../common/book.h"

__global__ void add( int a, int b, int *c ) {
    *c = a + b;
}

int main( void ) {
    int c;
    int *dev_c;
    int dev_no;

    // query the number of CUDA-capable devices
    cudaGetDeviceCount( &dev_no );
    printf("Devices: %d\n", dev_no);

    cudaMalloc( (void**)&dev_c, sizeof(int) ) ;
    add<<<1,1>>>( 2, 7, dev_c );
    cudaMemcpy( &c, dev_c, sizeof(int), cudaMemcpyDeviceToHost ) ;
    printf( "2 + 7 = %d\n", c );
    cudaFree( dev_c ) ;

    return 0;
}

and try to run it, I get this result:

Devices: 0

2 + 7 = 0

There should be one device, and the result should be 9.

Can anybody help me? Should I ask my administrator to change something in the Linux setup, or am I doing something wrong?

I am also trying to run the same code, and I get a different answer depending on what I use to compile it. On my own computer I get 0, but with the Udacity online CUDA compiler I get 9?! HELP

Add proper CUDA error checking to your code. (If you don’t know what that is, google “proper cuda error checking” and take the first hit.)
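For illustration, here is a minimal sketch of what error checking could look like when applied to the program from the question. The macro name CUDA_CHECK is a hypothetical helper chosen for this example, not something from the CUDA toolkit, and cudaDeviceSynchronize assumes a CUDA 4.0 or newer runtime. Every runtime call returns a cudaError_t, and a kernel launch can be checked afterwards with cudaGetLastError.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper macro (the name CUDA_CHECK is an assumption for this
// sketch): prints file, line, and the runtime's error string, then exits,
// if the wrapped call does not return cudaSuccess.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "CUDA error at %s:%d: %s\n",              \
                    __FILE__, __LINE__, cudaGetErrorString(err_));    \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

__global__ void add(int a, int b, int *c) {
    *c = a + b;
}

int main(void) {
    int c = 0;
    int *dev_c = NULL;
    int dev_no = 0;

    CUDA_CHECK(cudaGetDeviceCount(&dev_no));
    printf("Devices: %d\n", dev_no);

    CUDA_CHECK(cudaMalloc((void**)&dev_c, sizeof(int)));

    add<<<1, 1>>>(2, 7, dev_c);
    CUDA_CHECK(cudaGetLastError());        // reports errors from the launch itself
    CUDA_CHECK(cudaDeviceSynchronize());   // reports errors raised while the kernel ran

    CUDA_CHECK(cudaMemcpy(&c, dev_c, sizeof(int), cudaMemcpyDeviceToHost));
    printf("2 + 7 = %d\n", c);

    CUDA_CHECK(cudaFree(dev_c));
    return 0;
}
```

On a machine where something is wrong with the driver or device setup, a program like this should stop at the first failing call and print a message describing the error, instead of silently printing 0.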

Also run:

nvidia-smi -a

from the command prompt.

The results of the above two steps will move you closer to understanding why your machine is not working.
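As a complement to nvidia-smi, the runtime API can also report which driver and runtime versions the program actually sees. The following is only a sketch of one possible check, not a definitive diagnosis of this machine. It prints the status of each call rather than aborting, so all three results are visible even if one fails.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main(void) {
    int driver_version = 0;
    int runtime_version = 0;
    int device_count = 0;
    cudaError_t err;

    // Latest CUDA version supported by the installed driver
    // (reported as 0 if no driver is installed).
    err = cudaDriverGetVersion(&driver_version);
    printf("driver version:  %d (%s)\n", driver_version, cudaGetErrorString(err));

    // Version of the CUDA runtime library the program is using.
    err = cudaRuntimeGetVersion(&runtime_version);
    printf("runtime version: %d (%s)\n", runtime_version, cudaGetErrorString(err));

    // Number of CUDA-capable devices, with the reason if the query fails.
    err = cudaGetDeviceCount(&device_count);
    printf("device count:    %d (%s)\n", device_count, cudaGetErrorString(err));

    return 0;
}
```

Versions are encoded as 1000 * major + 10 * minor, so 4000 means 4.0. If the driver version is lower than the runtime version, the device query typically fails with an "insufficient driver" error, which would be worth passing on to the administrator.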