CUDA, Ubuntu 10.04 and a strange version mismatch

Hi !

Sorry if someone has already asked about this problem. I tried to find a solution in these forums, but I haven’t found any thread with one.

So, this is my problem.

I am trying to install CUDA 3.1 on Ubuntu 10.04. Everything installs correctly (I suppose), but when I try to run any CUDA program, I see a wonderful “cudaGetDeviceCount FAILED CUDA Driver and Runtime version may be mismatched.”

So, this is my process:

First, I downloaded these packages:

cudatoolkit_3.1_linux_32_ubuntu9.10.run

devdriver_3.1_linux_32_256.40.run

gpucomputingsdk_3.1_linux.run

To be sure, I’ve removed all ubuntu packages with nvidia

$ dpkg -l | grep nvidia

  $ _

After that, I’ve installed CUDA Tool Kit, and everything is ok.

I’ve installed devdriver 3.1: Ok too;

This is my system after reboot:

$ lsmod | grep nvidia

nvidia			  10190272  28 

agpgart				35408  1 nvidia
$ dmesg | tail -n2

[ 2819.693387] nvidia 0000:00:05.0: setting latency timer to 64

[ 2819.693699] NVRM: loading NVIDIA UNIX x86 Kernel Module  256.40  Wed Jul  7 12:54:34 PDT 2010
$ grep -i NVIDIA /var/log/Xorg.0.log

(--) PCI:*(0:0:5:0) 10de:0241:103c:2a54 nVidia Corporation C51 [GeForce 6150 LE] rev 162, Mem @ 0xfc000000/16777216, 0xe0000000/268435456, 0xfb000000/16777216, BIOS @ 0x????????/131072

(II) Module glx: vendor="NVIDIA Corporation"

(II) NVIDIA GLX Module  256.40  Wed Jul  7 13:18:54 PDT 2010

(II) LoadModule: "nvidia"

(II) Loading /usr/lib/xorg/modules/drivers/nvidia_drv.so

(II) Module nvidia: vendor="NVIDIA Corporation"

(II) NVIDIA dlloader X Driver  256.40  Wed Jul  7 12:56:35 PDT 2010

(II) NVIDIA Unified Driver for all Supported NVIDIA GPUs

(**) NVIDIA(0): Depth 24, (--) framebuffer bpp 32

(==) NVIDIA(0): RGB weight 888

(==) NVIDIA(0): Default visual is TrueColor

(==) NVIDIA(0): Using gamma correction (1.0, 1.0, 1.0)

(**) Aug 03 17:48:28 NVIDIA(0): Enabling RENDER acceleration

(II) Aug 03 17:48:28 NVIDIA(0): Support for GLX with the Damage and Composite X extensions is

(II) Aug 03 17:48:28 NVIDIA(0):	 enabled.

(II) Aug 03 17:48:32 NVIDIA(0): NVIDIA GPU GeForce 6150 LE (C51) at PCI:0:5:0 (GPU-0)

(--) Aug 03 17:48:32 NVIDIA(0): Memory: 524288 kBytes

(--) Aug 03 17:48:32 NVIDIA(0): VideoBIOS: 05.51.28.54.21

(--) Aug 03 17:48:32 NVIDIA(0): Interlaced video modes are supported on this GPU

(--) Aug 03 17:48:32 NVIDIA(0): Connected display device(s) on GeForce 6150 LE at PCI:0:5:0:

(--) Aug 03 17:48:32 NVIDIA(0):	 HSD Hanns.G HW191 (CRT-0)

(--) Aug 03 17:48:32 NVIDIA(0): HSD Hanns.G HW191 (CRT-0): 350.0 MHz maximum pixel clock

(II) Aug 03 17:48:32 NVIDIA(0): Assigned Display Device: CRT-0

(==) Aug 03 17:48:32 NVIDIA(0): 

(==) Aug 03 17:48:32 NVIDIA(0): No modes were requested; the default mode "nvidia-auto-select"

(==) Aug 03 17:48:32 NVIDIA(0):	 will be used as the requested mode.

(==) Aug 03 17:48:32 NVIDIA(0): 

(II) Aug 03 17:48:32 NVIDIA(0): Validated modes:

(II) Aug 03 17:48:32 NVIDIA(0):	 "nvidia-auto-select"

(II) Aug 03 17:48:32 NVIDIA(0): Virtual screen size determined to be 1440 x 900

(--) Aug 03 17:48:32 NVIDIA(0): DPI set to (89, 87); computed from "UseEdidDpi" X config

(--) Aug 03 17:48:32 NVIDIA(0):	 option

(==) Aug 03 17:48:32 NVIDIA(0): Enabling 32-bit ARGB GLX visuals.

(II) Aug 03 17:48:32 NVIDIA(0): Initialized GPU GART.

(II) Aug 03 17:48:32 NVIDIA(0): Setting mode "nvidia-auto-select"

(II) Aug 03 17:48:32 NVIDIA(0): Initialized OpenGL Acceleration

(==) NVIDIA(0): Disabling shared memory pixmaps

(II) Aug 03 17:48:32 NVIDIA(0): Initialized X Rendering Acceleration

(==) NVIDIA(0): Backing store disabled

(==) NVIDIA(0): Silken mouse enabled

(**) NVIDIA(0): DPMS enabled
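The kernel module, the GLX module and the X driver all report 256.40 in the logs above. Here is a minimal sketch for pulling that number out of such lines so the versions can be compared at a glance. The helper name `nv_version` is made up for illustration, and the pattern assumes the version-line wording shown in this post.

```shell
#!/bin/sh
# nv_version: hypothetical helper. Reads log lines on stdin and prints the
# driver version found on each NVIDIA "Kernel Module", "GLX Module" or
# "X Driver" line. All other lines are ignored.
nv_version() {
    sed -nE 's/.*(Kernel Module|GLX Module|X Driver) +([0-9]+\.[0-9]+(\.[0-9]+)*).*/\2/p'
}

# Usage (paths as in the post above):
#   dmesg | nv_version
#   nv_version < /var/log/Xorg.0.log
```

If every line printed shows the same number, the kernel module and the X driver at least agree with each other. This says nothing yet about which user-space CUDA library gets loaded.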

My ld.so.conf is OK:

$ cat /etc/ld.so.conf.d/cuda.conf 

/usr/local/cuda/lib/

Libraries and headers are correctly installed:

$ tree -f /usr/local/cuda/ | grep -v "man\|doc\|computeprof"

/usr/local/cuda
├── /usr/local/cuda/bin
│   ├── /usr/local/cuda/bin/bin2c
│   ├── /usr/local/cuda/bin/cudafe
│   ├── /usr/local/cuda/bin/cudafe++
│   ├── /usr/local/cuda/bin/cuda-gdb
│   ├── /usr/local/cuda/bin/cuda-memcheck
│   ├── /usr/local/cuda/bin/fatbin
│   ├── /usr/local/cuda/bin/filehash
│   ├── /usr/local/cuda/bin/nvcc
│   ├── /usr/local/cuda/bin/nvcc.profile
│   └── /usr/local/cuda/bin/ptxas
├── /usr/local/cuda/include
│   ├── /usr/local/cuda/include/builtin_types.h
│   ├── /usr/local/cuda/include/channel_descriptor.h
│   ├── /usr/local/cuda/include/CL
│   │   ├── /usr/local/cuda/include/CL/cl_ext.h
│   │   ├── /usr/local/cuda/include/CL/cl_gl_ext.h
│   │   ├── /usr/local/cuda/include/CL/cl_gl.h
│   │   ├── /usr/local/cuda/include/CL/cl.h
│   │   ├── /usr/local/cuda/include/CL/cl_platform.h
│   │   └── /usr/local/cuda/include/CL/opencl.h
│   ├── /usr/local/cuda/include/common_functions.h
│   ├── /usr/local/cuda/include/crt
│   │   ├── /usr/local/cuda/include/crt/device_runtime.h
│   │   ├── /usr/local/cuda/include/crt/func_macro.h
│   │   ├── /usr/local/cuda/include/crt/host_runtime.h
│   │   └── /usr/local/cuda/include/crt/storage_class.h
│   ├── /usr/local/cuda/include/cublas.h
│   ├── /usr/local/cuda/include/cuComplex.h
│   ├── /usr/local/cuda/include/__cudaFatFormat.h
│   ├── /usr/local/cuda/include/cudaGL.h
│   ├── /usr/local/cuda/include/cuda_gl_interop.h
│   ├── /usr/local/cuda/include/cuda.h
│   ├── /usr/local/cuda/include/cuda_runtime_api.h
│   ├── /usr/local/cuda/include/cuda_runtime.h
│   ├── /usr/local/cuda/include/cuda_surface_types.h
│   ├── /usr/local/cuda/include/cuda_texture_types.h
│   ├── /usr/local/cuda/include/cudaVDPAU.h
│   ├── /usr/local/cuda/include/cuda_vdpau_interop.h
│   ├── /usr/local/cuda/include/cufft.h
│   ├── /usr/local/cuda/include/device_functions.h
│   ├── /usr/local/cuda/include/device_launch_parameters.h
│   ├── /usr/local/cuda/include/device_types.h
│   ├── /usr/local/cuda/include/driver_functions.h
│   ├── /usr/local/cuda/include/driver_types.h
│   ├── /usr/local/cuda/include/host_config.h
│   ├── /usr/local/cuda/include/host_defines.h
│   ├── /usr/local/cuda/include/math_constants.h
│   ├── /usr/local/cuda/include/math_functions_dbl_ptx1.h
│   ├── /usr/local/cuda/include/math_functions_dbl_ptx3.h
│   ├── /usr/local/cuda/include/math_functions.h
│   ├── /usr/local/cuda/include/sm_11_atomic_functions.h
│   ├── /usr/local/cuda/include/sm_12_atomic_functions.h
│   ├── /usr/local/cuda/include/sm_13_double_functions.h
│   ├── /usr/local/cuda/include/sm_20_atomic_functions.h
│   ├── /usr/local/cuda/include/sm_20_intrinsics.h
│   ├── /usr/local/cuda/include/surface_functions.h
│   ├── /usr/local/cuda/include/surface_types.h
│   ├── /usr/local/cuda/include/texture_fetch_functions.h
│   ├── /usr/local/cuda/include/texture_types.h
│   ├── /usr/local/cuda/include/vector_functions.h
│   └── /usr/local/cuda/include/vector_types.h
├── /usr/local/cuda/lib
│   ├── /usr/local/cuda/lib/libcublas.so -> libcublas.so.3
│   ├── /usr/local/cuda/lib/libcublas.so.3 -> libcublas.so.3.1.9
│   ├── /usr/local/cuda/lib/libcublas.so.3.1.9
│   ├── /usr/local/cuda/lib/libcudart.so -> libcudart.so.3
│   ├── /usr/local/cuda/lib/libcudart.so.3 -> libcudart.so.3.1.9
│   ├── /usr/local/cuda/lib/libcudart.so.3.1.9
│   ├── /usr/local/cuda/lib/libcufft.so -> libcufft.so.3
│   ├── /usr/local/cuda/lib/libcufft.so.3 -> libcufft.so.3.1.9
│   └── /usr/local/cuda/lib/libcufft.so.3.1.9
├── /usr/local/cuda/open64
│   ├── /usr/local/cuda/open64/bin
│   │   └── /usr/local/cuda/open64/bin/nvopencc
│   └── /usr/local/cuda/open64/lib
│       ├── /usr/local/cuda/open64/lib/be
│       ├── /usr/local/cuda/open64/lib/bec
│       ├── /usr/local/cuda/open64/lib/gfec
│       └── /usr/local/cuda/open64/lib/inline
└── /usr/local/cuda/src
    ├── /usr/local/cuda/src/fortran.c
    ├── /usr/local/cuda/src/fortran_common.h
    ├── /usr/local/cuda/src/fortran.h
    ├── /usr/local/cuda/src/fortran_thunking.c
    └── /usr/local/cuda/src/fortran_thunking.h

20 directories, 2260 files

So, I go to ~/NVIDIA_GPU_Computing_SDK.

Everything in the “C” directory compiles without any errors.

But if I try to run deviceQuery, I get this:

~/NVIDIA_GPU_Computing_SDK/C/bin/linux/release$ ./deviceQuery

./deviceQuery Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount FAILED CUDA Driver and Runtime version may be mismatched.

FAILED

Press <Enter> to Quit...

-----------------------------------------------------------

I’ve tried different installations, but no solution worked.

I’ve read other threads in this forum, and none of the suggestions work for me.

If you have any idea… :)

Note:

  1. My NVIDIA card is a GeForce 6150 LE (C51), which probably does not support CUDA ( http://www.nvidia.com/object/cuda_gpus.html ), but I read on the net that it’s possible to use emulation on the CPU.

  2. In the source code, the “FAILED CUDA Driver and Runtime version may be mismatched” message comes before the “supported card” check:

int deviceCount = 0;
if (cudaGetDeviceCount(&deviceCount) != cudaSuccess) {
    shrLog("cudaGetDeviceCount FAILED CUDA Driver and Runtime version may be mismatched.\n");
    shrLog("\nFAILED\n");
    shrEXIT(argc, argv);
}

// This function call returns 0 if there are no CUDA capable devices.
if (deviceCount == 0)
    shrLog("There is no device supporting CUDA\n");

So, if my system is correctly installed but my card isn’t supported, I should see “There is no device supporting CUDA” and not “Driver and Runtime version may be mismatched”.

  3. My GCC is “gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5)”

Hi Benjamin,

my first suspicion is that the devdriver 256 may not in fact be installed. The mismatch would be between some ubuntu default driver and Cuda. I had similar problems installing on 64-bit Ubuntu 10.04.

I suggest

  1. follow this guide http://filthypants.blogspot.com/2010/06/nv…buntu-1004.html
    NB i) if you have not had to halt x-windows and blacklist the rival gpu drivers listed, then you are unlikely to have installed devdriver 256.
  2. you must install the driver, then cuda, then the sdk in that order (from your post it seems you did the driver last).
  3. follow each step of the NVIDIA ‘Getting Started Guide Linux’.
    If you do this, deviceQuery and the other CUDA test programs in the SDK should give ‘PASS’ results.

These are the steps that fixed my install problems. Beware that you may break x-windows if you try to skip steps.

I hope this helps.

Nick

Hi,
as I stated before in the other post regarding this problem, I have the same issue on 64-bit Ubuntu 10.04, but only when I compile my CUDA code as 32-bit. In 64-bit everything works fine.
At this point I think it’s definitely not an installation problem, but I am clueless …
It would be good to find a solution, as this issue is very annoying.

For emulation you do not need a device driver; it is a compilation flag. As you have suggested, your card may not be supported by CUDA.

Hi Nick,

Thanks for your help.

So, I followed your guide: I installed the driver (devdriver package), then CUDA (cudatoolkit), and then the SDK (gpucomputingsdk).

I rebooted the computer

Kernel Driver loaded is 256.40

Module Driver (xorg) loaded is 256.40

In NVIDIA_SDK*/C/, I recompiled the utilities (make clean ; make)

And the same message appears :-(

(aka “FAILED CUDA Driver and Runtime version may be mismatched”)

PS: Regarding “The mismatch would be between some ubuntu default driver and Cuda.”: note that in my previous post I wrote, “To be sure, I’ve removed all ubuntu packages with nvidia”

[codebox]deviceQuery Starting…

CUDA Device Query (Runtime API) version (CUDART static linking)

There is no device supporting CUDA

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 134564791, CUDA Runtime Version = 0.0, NumDevs = 0

PASSED

[/codebox]

Alllllllllllllllleliuaaaaaaah!!!

Ok, so, this is the solution: DON’T USE F**** CUDA 3.1 !

Download ONLY CUDA 3.0; you must install all of the following:

$ gpucomputingsdk_3.0_linux.run

$ cudatoolkit_3.0_linux_32_ubuntu9.04.run

$ devdriver_3.0_linux_32_195.36.15.run

PS:

After that, if you see errors like these:

/usr/include/bits/mathcalls.h:350: error: inline function 'int __signbitf(float)' cannot be declared weak

/usr/include/bits/mathcalls.h:350: error: inline function 'int __signbitl(long double)' cannot be declared weak

/usr/include/bits/mathinline.h:38: error: inline function 'int __signbitf(float)' cannot be declared weak

/usr/include/bits/mathinline.h:50: error: inline function 'int __signbit(double)' cannot be declared weak

/usr/include/bits/mathinline.h:62: error: inline function 'int __signbitl(long double)' cannot be declared weak

make[1]: *** [obj/i386/release/fastWalshTransform.cu.o] Error 1

make[1]: Leaving directory `NVIDIA_GPU_Computing_SDK/C/src/fastWalshTransform'

make: *** [src/fastWalshTransform/Makefile.ph_build] Error 2

Compile only “deviceQuery*”:

Go to “NVIDIA_GPU_Computing_SDK/C/src/deviceQuery” and simply type “make”.

PS: Don’t forget to do the previous/next steps (like ld.so.conf and LD_LIBRARY_PATH, BIN_PATH, etc…)
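For the environment part of that PS, a minimal sketch, assuming the toolkit lives in /usr/local/cuda as in the listing earlier in the thread and reading “BIN_PATH” as the toolkit's bin directory on `PATH`. The helper name `prepend_dir` is made up for illustration.

```shell
#!/bin/sh
# prepend_dir: hypothetical helper. Prints LIST with DIR put in front,
# unless DIR is already one of the colon-separated entries.
prepend_dir() {
    dir=$1
    list=$2
    case ":$list:" in
        *":$dir:"*) printf '%s\n' "$list" ;;
        *)          printf '%s\n' "${dir}${list:+:$list}" ;;
    esac
}

# Assumed install prefix: /usr/local/cuda (32-bit toolkit, so "lib").
PATH=$(prepend_dir /usr/local/cuda/bin "$PATH")
LD_LIBRARY_PATH=$(prepend_dir /usr/local/cuda/lib "${LD_LIBRARY_PATH:-}")
export PATH LD_LIBRARY_PATH
```

Putting these lines in a shell startup file such as ~/.bashrc keeps them across logins. The membership check avoids piling up duplicate entries each time the file is sourced.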

That has to be the worst solution I’ve ever seen, particularly since 3.1 was a huge improvement for Fermi cards. I wouldn’t bring up this thread except that I now have the same problem. I’ve been running 3.1 with two GTX 480s since it came out, but I just set up a box with 4 Tesla C2050s and am having this problem. Both systems are running Ubuntu 10.04, but the former is desktop and the latter is server. I installed devdriver_3.1_linux_64_256.40.run, cudatoolkit_3.1_linux_64_ubuntu9.10.run, and gpucomputingsdk_3.1_linux.run in that order. Does anyone know why it won’t work?

Hi,

I have a similar problem. My hardware is a DL160 box connected to a Tesla. I am using NVIDIA-Linux-x86_64-256.44.run for the driver and cudatoolkit_3.1_linux_64_ubuntu9.10.run for the toolkit.

Both installations run smoothly and I have at the end:

[codebox]root@bndligpu03:~# lsmod

Module Size Used by

nvidia 11070680 0

nfs 309988 1 [/codebox]

and

[codebox]root@bndligpu03:~# lspci -v

0b:00.0 3D controller: nVidia Corporation GT200 [Tesla C1060] (rev a1)

    Subsystem: nVidia Corporation Device 0595 

    Flags: bus master, fast devsel, latency 0, IRQ 24 

    Memory at f7000000 (32-bit, non-prefetchable)  

    Memory at d8000000 (64-bit, prefetchable)  

    Memory at f4000000 (64-bit, non-prefetchable)  

    I/O ports at cc00  

    Expansion ROM at f6f80000 [disabled]  

    Capabilities: [60] Power Management version 3 

    Capabilities: [68] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable- 

    Capabilities: [78] Express Endpoint, MSI 00 

    Capabilities: [100] Virtual Channel <?> 

    Capabilities: [128] Power Budgeting <?> 

    Capabilities: [600] Vendor Specific Information <?> 

    Kernel driver in use: nvidia 

    Kernel modules: nvidia, nvidiafb, nouveau

0d:00.0 3D controller: nVidia Corporation GT200 [Tesla C1060] (rev a1)

    Subsystem: nVidia Corporation Device 0595 

    Flags: bus master, fast devsel, latency 0, IRQ 35 

    Memory at fa000000 (32-bit, non-prefetchable)  

    Memory at dc000000 (64-bit, prefetchable)  

    Memory at f8000000 (64-bit, non-prefetchable)  

    I/O ports at dc00  

    Expansion ROM at fbd80000 [disabled]  

    Capabilities: [60] Power Management version 3 

    Capabilities: [68] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable- 

    Capabilities: [78] Express Endpoint, MSI 00 

    Capabilities: [100] Virtual Channel <?> 

    Capabilities: [128] Power Budgeting <?> 

    Capabilities: [600] Vendor Specific Information <?> 

    Kernel driver in use: nvidia 

    Kernel modules: nvidia, nvidiafb, nouveau

[/codebox]

When I try the BlackScholes app, I get:

[codebox]./BlackScholes

[BlackScholes]

./BlackScholes Starting…

Initializing data…

…allocating CPU memory for options.

…allocating GPU memory for options.

BlackScholes.cu(129) : cudaSafeCall() Runtime API error : unspecified driver error. [/codebox]

The other problem I have (lower on the priority list) is that if I want console output (from the graphics card of my DL160), I need the Tesla to be disconnected … once it is connected there is no more graphical output, and I have to connect remotely.

Any help welcome …

Guillaume

Well, I discovered a workaround (though not a solution) to my problem, namely, this.

~/NVIDIA_GPU_Computing_SDK/C/bin/linux/release$ ./deviceQuery

./deviceQuery Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount FAILED CUDA Driver and Runtime version may be mismatched.

FAILED

Press <Enter> to Quit...

-----------------------------------------------------------

If I run deviceQuery once as root, it then seems to work fine as a normal user. However, after restarting the system, I have to do the same thing again. Any thoughts?
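One possible explanation, not confirmed anywhere in this thread: on a machine where X is not running (a server install, for example), nothing loads the NVIDIA kernel module or creates the /dev/nvidia* device nodes at boot, so the first CUDA program run as root ends up doing it. NVIDIA's Linux getting-started guide describes a startup script for that situation. Below is a sketch along those lines that only prints the `mknod` commands instead of running them. The helper name `nv_node_commands` is made up. 195 is the NVIDIA character-device major number, and minor 255 is the control device.

```shell
#!/bin/sh
# nv_node_commands: hypothetical helper. Reads `lspci` output on stdin and
# prints (does not run) one mknod command per NVIDIA controller, plus one
# for the control device /dev/nvidiactl.
nv_node_commands() {
    # Count NVIDIA lines that describe a 3D or VGA controller.
    count=$(grep -i 'nvidia' | grep -c -E '3D controller|VGA compatible controller' || true)
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "mknod -m 666 /dev/nvidia$i c 195 $i"
        i=$((i + 1))
    done
    [ "$count" -gt 0 ] && echo "mknod -m 666 /dev/nvidiactl c 195 255"
    return 0
}

# Usage:
#   lspci | nv_node_commands
```

To apply it at boot, the printed commands would have to run as root after `/sbin/modprobe nvidia`, for instance from a startup script. Treat that part as an assumption to check against the guide rather than a tested recipe.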

Hi, I had the same problem. It seems that your user is not allowed to access the graphics hardware. I solved it by changing the permissions on the /dev/nv* devices:

sudo chmod 666 /dev/nvidia0

sudo chmod 666 /dev/nvidiactl

I’m not sure if this is a good solution, but at least it worked for me. Now deviceQuery (and all the other binaries I tested) seems to work without superuser privileges.
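Before and after the chmod, it can help to look at what the device nodes actually allow. A minimal sketch, assuming GNU `stat` (as shipped with Ubuntu); the helper name `nv_node_modes` is made up for illustration.

```shell
#!/bin/sh
# nv_node_modes: hypothetical helper. Prints "<octal mode> <owner>:<group> <path>"
# for every nvidia* entry in the given directory (default: /dev).
nv_node_modes() {
    dir=${1:-/dev}
    for node in "$dir"/nvidia*; do
        # Skip the unexpanded glob when no node exists.
        [ -e "$node" ] || continue
        stat -c '%a %U:%G %n' "$node"
    done
}

# Usage:
#   nv_node_modes
```

No output at all means the nodes do not exist yet, which is a different problem from wrong permissions. A mode other than 666 means that only the listed owner or group can open the device.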

Thanks to all of you for your previous solutions. Helped me a lot!

(Please excuse my bad english!)

Your problem might be related to leftover files from an old driver.
If you first installed 195.x (the Ubuntu driver), it installed libraries and put their metadata into /etc/ld.so.cache.
Then you installed the 256.x driver, which could overwrite some files (kernel and X driver) but not others (libraries), or it might overwrite the libraries without updating ld.so.cache.
The system then tries to load the CUDA libraries, but either finds old versions (from the 195 driver, not 256) or does not find them (if ldconfig is misconfigured).
This way, even though you have the new driver, CUDA does not work.

My advice: make sure to remove the old driver (use the --purge option of apt-get, or “Remove configuration files” in Synaptic), then reinstall the 256.x drivers, and make sure that ldconfig is configured correctly:
add all directories with NVIDIA libraries to /etc/ld.so.conf and run ldconfig as root.
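To see which CUDA libraries the dynamic linker would actually pick up, the ld.so cache can be listed and filtered. A minimal sketch; the helper name `cuda_libs` is made up for illustration. libcuda is the library installed by the driver package and libcudart is the runtime from the toolkit.

```shell
#!/bin/sh
# cuda_libs: hypothetical helper. Filters `ldconfig -p` output (stdin) down
# to the CUDA driver library (libcuda) and the runtime library (libcudart).
cuda_libs() {
    grep -E 'libcuda(rt)?\.so' || true
}

# Usage, after editing /etc/ld.so.conf or /etc/ld.so.conf.d/*.conf:
#   sudo ldconfig
#   ldconfig -p | cuda_libs
```

If libcuda is missing from the output, or resolves to a path left over from an older driver, that would fit the scenario described above.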

Hope it helps.

Please attach the output of nvidia-bug-report.sh. In addition, please attach the output of “strace -o cuda-strace.log deviceQuery”.
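While waiting for someone to read the full trace, the log can be skimmed locally for the most likely culprits. A minimal sketch that lists failed attempts to open an NVIDIA device node or libcuda; the helper name `failed_nv_opens` is made up, and strace marks a failed call with `= -1`.

```shell
#!/bin/sh
# failed_nv_opens: hypothetical helper. Prints the open()/openat() calls in
# an strace log that touched an NVIDIA device node or libcuda and failed.
failed_nv_opens() {
    grep -E 'open(at)?\(' "$1" | grep -E 'nvidia|libcuda' | grep -F -- '= -1' || true
}

# Usage:
#   strace -o cuda-strace.log ./deviceQuery
#   failed_nv_opens cuda-strace.log
```

A line ending in EACCES would point at device permissions, and ENOENT on every libcuda path would point at the library search path. Both readings are guesses to confirm against the full log.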

Hi,

… a bug ?

http://forums.nvidia.com/index.php?showtopic=181448

Norge