GM107 + CUDA 6.0

I’m running Fedora 20; I installed the 334.21 driver first and CUDA 6.0 next. The samples seemed to compile without any errors, but running the binaries fails:

cudaGetDeviceProperties returned 38
-> no CUDA-capable device is detected

Is there any obvious explanation for this error?

The CUDA Getting Started Guide states “The package manager installations (RPM/DEB packages) and the stand-alone installer installations (.run file) of the NVIDIA driver are incompatible. Before using the distribution-specific packages, uninstall the NVIDIA driver (…)”. Do I need to install the CUDA 6.0 toolkit first before I install the 334.21 driver?
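For anyone landing here with the same message: error 38 is cudaErrorNoDevice in the CUDA 6.0 runtime. One thing worth ruling out early is whether the nvidia kernel module is actually loaded, or whether nouveau grabbed the card instead. Below is a minimal Python sketch of that check. The helper names are made up for illustration, and it only inspects /proc/modules, so a clean result does not prove the CUDA install itself is fine.

```python
import os


def loaded_gpu_modules(proc_modules_text):
    """Return which of nvidia/nouveau appear in /proc/modules-style text."""
    names = {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }
    return sorted(names & {"nvidia", "nouveau"})


def diagnose(proc_modules_text):
    """Turn the module list into a one-line hint (hypothetical wording)."""
    mods = loaded_gpu_modules(proc_modules_text)
    if "nouveau" in mods:
        return "nouveau is loaded; it conflicts with the proprietary driver"
    if "nvidia" not in mods:
        return "nvidia kernel module is not loaded"
    return "nvidia kernel module is loaded; look at the CUDA install next"


if __name__ == "__main__":
    # /proc/modules exists on Linux only; skip quietly elsewhere.
    if os.path.exists("/proc/modules"):
        with open("/proc/modules") as f:
            print(diagnose(f.read()))
```

If this reports that the module is loaded, the driver side is probably not the culprit and the toolkit installation is the next thing to look at.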

What method did you use to install the driver?
What method did you use to install the CUDA 6 RC?

I installed Fedora 20 without Nouveau, then installed the proprietary NVIDIA driver at runlevel 3, which worked without any issues. After that I installed cuda-repo-fedora19-6-0-rc-6.0-26.x86_64.rpm as described in http://developer.download.nvidia.com/compute/cuda/6_0/rc/docs/CUDA_Getting_Started_Linux.pdf.

What method did you use to install the driver: the run file or the package method?
In any event, you should try re-installing the 334.xx driver.

If you want to reload the system, I would use the runfile methods, since they give you explicit control over the pieces that get installed.

Download the CUDA 6 RC runfile installer, as well as the 334.xx driver runfile installer, from the NVIDIA website.

Install the 334.xx driver first.
Verify that it installed properly by running:
nvidia-smi -a

Then run the CUDA 6 RC runfile installer, and select “no” when prompted to install the driver.

I used the 334.21 run file. I guess the nvidia-smi -a output looks OK; most fields are N/A, presumably because it’s a GeForce card:

==============NVSMI LOG==============

Timestamp                           : Tue Mar 11 20:13:33 2014
Driver Version                      : 334.21

Attached GPUs                       : 1
GPU 0000:01:00.0
    Product Name                    : GeForce GTX 750 Ti
    Display Mode                    : N/A
    Display Active                  : N/A
    Persistence Mode                : Disabled
    Accounting Mode                 : N/A
    Accounting Mode Buffer Size     : N/A
    Driver Model
        Current                     : N/A
        Pending                     : N/A
    Serial Number                   : N/A
    GPU UUID                        : GPU-b2181ea5-4be6-406d-80b8-b2be2eefe7ec
    Minor Number                    : 0
    VBIOS Version                   : 82.07.25.00.52
    Inforom Version
        Image Version               : N/A
        OEM Object                  : N/A
        ECC Object                  : N/A
        Power Management Object     : N/A
    GPU Operation Mode
        Current                     : N/A
        Pending                     : N/A
    PCI
        Bus                         : 0x01
        Device                      : 0x00
        Domain                      : 0x0000
        Device Id                   : 0x138010DE
        Bus Id                      : 0000:01:00.0
        Sub System Id               : 0x31021462
        GPU Link Info
            PCIe Generation
                Max                 : N/A
                Current             : N/A
            Link Width
                Max                 : N/A
                Current             : N/A
        Bridge Chip
            Type                    : N/A
            Firmware                : N/A
    Fan Speed                       : 32 %
    Performance State               : N/A
    Clocks Throttle Reasons         : N/A
    FB Memory Usage
        Total                       : 2047 MiB
        Used                        : 206 MiB
        Free                        : 1841 MiB
    BAR1 Memory Usage
        Total                       : N/A
        Used                        : N/A
        Free                        : N/A
    Compute Mode                    : Default
    Utilization
        Gpu                         : N/A
        Memory                      : N/A
    Ecc Mode
        Current                     : N/A
        Pending                     : N/A
    ECC Errors
        Volatile
            Single Bit            
                Device Memory       : N/A
                Register File       : N/A
                L1 Cache            : N/A
                L2 Cache            : N/A
                Texture Memory      : N/A
                Total               : N/A
            Double Bit            
                Device Memory       : N/A
                Register File       : N/A
                L1 Cache            : N/A
                L2 Cache            : N/A
                Texture Memory      : N/A
                Total               : N/A
        Aggregate
            Single Bit            
                Device Memory       : N/A
                Register File       : N/A
                L1 Cache            : N/A
                L2 Cache            : N/A
                Texture Memory      : N/A
                Total               : N/A
            Double Bit            
                Device Memory       : N/A
                Register File       : N/A
                L1 Cache            : N/A
                L2 Cache            : N/A
                Texture Memory      : N/A
                Total               : N/A
    Retired Pages
        Single Bit ECC              : N/A
        Double Bit ECC              : N/A
        Pending                     : N/A
    Temperature
        Gpu                         : 29 C
    Power Readings
        Power Management            : N/A
        Power Draw                  : N/A
        Power Limit                 : N/A
        Default Power Limit         : N/A
        Enforced Power Limit        : N/A
        Min Power Limit             : N/A
        Max Power Limit             : N/A
    Clocks
        Graphics                    : N/A
        SM                          : N/A
        Memory                      : N/A
    Applications Clocks
        Graphics                    : N/A
        Memory                      : N/A
    Default Applications Clocks
        Graphics                    : N/A
        Memory                      : N/A
    Max Clocks
        Graphics                    : N/A
        SM                          : N/A
        Memory                      : N/A
    Compute Processes               : N/A

When I installed the CUDA 6 RC run file no question came up about installing the driver.

The nvidia-smi output looks OK. Still having trouble?

I will try installing CUDA 6.0 RC again. The first time I used the Fedora-specific package, and it didn’t ask me about the driver.

I just tried installing the CUDA 6.0 RC run file but currently I get the following error log:

Error: unsupported compiler: 4.8.2. Use --override to override this check.
Missing recommended library: libGLU.so
Missing recommended library: libX11.so
Missing recommended library: libXi.so
Missing recommended library: libXmu.so

Error: cannot find Toolkit in /usr/local/cuda-6.0

===========
= Summary =
===========

Driver:   Not Selected
Toolkit:  Installation Failed. Using unsupported Compiler.
Samples:  Cannot find Toolkit in /usr/local/cuda-6.0

Though the libraries are all installed…

Tried it again with --override:

Missing recommended library: libGLU.so
Missing recommended library: libX11.so
Missing recommended library: libXi.so
Missing recommended library: libXmu.so

===========
= Summary =
===========

Driver:   Not Selected
Toolkit:  Installed in /usr/local/cuda-6.0
Samples:  Installed in $HOME/NVIDIA_CUDA-6.0_Samples, but missing recommended libraries

***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 331.00 is required for CUDA 6.0 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
    sudo <CudaInstaller>.run -silent -driver
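
The 331.00 minimum from that warning can be checked against the Driver Version line that nvidia-smi -a prints. A rough Python sketch follows. The function names are hypothetical, and the regular expression assumes the "Driver Version : X.Y" layout shown in the log earlier in this thread.

```python
import re

# Minimum taken from the installer warning: "at least 331.00".
MIN_DRIVER = (331, 0)


def driver_version(smi_text):
    """Extract (major, minor) from `nvidia-smi -a` style text, or None."""
    match = re.search(
        r"^Driver Version\s*:\s*(\d+)\.(\d+)", smi_text, re.MULTILINE
    )
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def driver_ok(smi_text, minimum=MIN_DRIVER):
    """True when a driver version was found and meets the minimum."""
    version = driver_version(smi_text)
    return version is not None and version >= minimum


if __name__ == "__main__":
    # Two lines copied from the nvidia-smi log posted above.
    sample = (
        "Timestamp                           : Tue Mar 11 20:13:33 2014\n"
        "Driver Version                      : 334.21\n"
    )
    print(driver_version(sample), driver_ok(sample))
```

On a live system the text would come from running nvidia-smi -a (for example through subprocess.run) instead of the embedded sample.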

Fedora 20 is not officially supported.

https://developer.nvidia.com/cuda-pre-production

It’s working now:

CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce GTX 750 Ti"
  CUDA Driver Version / Runtime Version          6.0 / 6.0
  CUDA Capability Major/Minor version number:    5.0
  Total amount of global memory:                 2047 MBytes (2146762752 bytes)
  ( 5) Multiprocessors, (128) CUDA Cores/MP:     640 CUDA Cores
  GPU Clock rate:                                1137 MHz (1.14 GHz)
  Memory Clock rate:                             2700 Mhz
  Memory Bus Width:                              128-bit
  L2 Cache Size:                                 2097152 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 1 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Bus ID / PCI location ID:           1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 6.0, CUDA Runtime Version = 6.0, NumDevs = 1, Device0 = GeForce GTX 750 Ti
Result = PASS
Device 0: GeForce GTX 750 Ti
 Quick Mode

 Host to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(MB/s)
   33554432			5626.1

 Device to Host Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(MB/s)
   33554432			6512.5

 Device to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(MB/s)
   33554432			69572.5

Result = PASS

I installed freeglut-devel, libXi-devel, libXmu-devel and mesa-libGLU-devel which fixed the lib errors.
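
A likely reason the installer complained even though the runtime libraries were present: on Fedora the unversioned names (libGLU.so and friends) are symlinks shipped by the -devel packages, while the base packages only provide versioned files such as libGLU.so.1. Here is a small Python sketch that looks for the unversioned names. The directory list is an assumption about a typical x86_64 Fedora layout, the helper name is made up, and I am assuming the installer looks for exactly the names it prints.

```python
import os

# Library names taken from the installer's "Missing recommended library" lines.
RECOMMENDED = ("GLU", "X11", "Xi", "Xmu")

# Assumption: usual library directories on an x86_64 Fedora system.
LIB_DIRS = ("/usr/lib64", "/usr/lib")


def missing_dev_symlinks(names=RECOMMENDED, lib_dirs=LIB_DIRS):
    """Return the unversioned lib<name>.so files found in none of lib_dirs."""
    missing = []
    for name in names:
        filename = "lib%s.so" % name
        found = any(
            os.path.exists(os.path.join(directory, filename))
            for directory in lib_dirs
        )
        if not found:
            missing.append(filename)
    return missing


if __name__ == "__main__":
    print(missing_dev_symlinks() or "all unversioned .so names found")
```

An empty result here only says the files exist; it does not say whether they are 32-bit or 64-bit.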

The OpenGL samples still give me the error below:

/usr/bin/ld: skipping incompatible /usr/lib/gcc/x86_64-redhat-linux/4.8.2/../../../libGL.so when searching for -lGL
/usr/bin/ld: skipping incompatible /lib/libGL.so when searching for -lGL
/usr/bin/ld: skipping incompatible /usr/lib/libGL.so when searching for -lGL
/usr/bin/ld: cannot find -lGL
collect2: error: ld returned 1 exit status
make: *** [simpleGL] Error 1
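
For context, ld usually prints "skipping incompatible" when it finds a file with the right name but built for a different architecture, most often a 32-bit library during a 64-bit link. The fifth byte of an ELF file (EI_CLASS) says which one it is, so a quick Python sketch like the one below can confirm that. The function name is made up, and /usr/lib64/libGL.so is only my assumption about where a 64-bit copy would normally live on Fedora.

```python
import os


def elf_class(path):
    """Return 32 or 64 for an ELF file, or None if it is not ELF."""
    with open(path, "rb") as f:
        header = f.read(5)
    if len(header) < 5 or header[:4] != b"\x7fELF":
        return None
    # EI_CLASS: 1 means ELFCLASS32, 2 means ELFCLASS64.
    return {1: 32, 2: 64}.get(header[4])


if __name__ == "__main__":
    # The first two paths come from the linker output above; the third is
    # an assumed location for the 64-bit library.
    for candidate in ("/lib/libGL.so", "/usr/lib/libGL.so", "/usr/lib64/libGL.so"):
        if os.path.exists(candidate):
            print(candidate, elf_class(candidate))
        else:
            print(candidate, "not found")
```

If every libGL.so that exists reports 32 and no 64-bit one is found, that would match the "cannot find -lGL" error on a 64-bit build.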

Posted the solution in a separate thread: cannot find -lGL

Fedora 19 doesn’t work either, since the NVIDIA driver is not ready for the latest 3.13 kernel:
unknown symbol acpi_os_wait_events_complete

I don’t understand why installing CUDA has to be so much of a hassle.

When it’s stated that fedora 19 is supported, the implication is that it’s supported for the version of kernel that is included in the fedora 19 distro. It does not mean it’s supported for any possible kernel you might upgrade to. Likewise, it’s supported for the version of gcc included with fedora 19, not any possible gcc you might upgrade to.

Fedora 20 being unsupported means:

  • it hasn’t been formally tested, so it may not work at all, and there is no assurance that it will work in any fashion
  • there may be explicit checks (such as the one detecting the particular gcc version as indicated previously) that may prevent ordinary usage. In some cases, these checks can be worked around or overridden.

In general it’s just a relative confidence statement. Fedora 19 has been tested to some degree. No software is defect free or 100% compatible, that I know of.

No, no, no…
The CUDA installer does this for you… whether you want it or not…

Well… OK… it does NOT upgrade the kernel itself, but it WILL upgrade the kernel header and devel packages to the latest version, which makes it even more of a mess. (aaaaaarghhh)

When you have finally downgraded these packages to their initial versions… it still doesn’t work… (aaaaaaaarghhhhhhh)

Then you go looking around for answers, which you don’t get…
Well… OK… you do get an answer… the same one over and over again:
“we can’t help you if you don’t upgrade to the latest versions”
aaarrggghhhhhhhhh

Why do you even say stuff like that… it means that if I want to use CUDA I can’t upgrade to the latest kernel, and my PC runs an insecure and buggy kernel.

Where I live, that would mean that officially I can’t use online banking, since if my bank account gets compromised the bank technically doesn’t have to refund any stolen money because my PC was not fully upgraded.

NUTS

The CUDA .run file driver installer doesn’t update any kernel header files, although it will fail if the correct ones are not installed. If you are using some package method to install the driver, then it’s quite possible your package manager is doing any number of things. I don’t know what packages, repos, or method you are using or referring to.
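
To check the "correct ones are installed" part before running the driver installer: on most distributions the headers for the running kernel are reachable through /lib/modules/<release>/build, which is a symlink that dangles when the matching kernel-devel package is missing. A minimal Python sketch, with made-up helper names and the root directory exposed as a parameter so it can be pointed elsewhere:

```python
import os


def kernel_build_dir(release=None, modules_root="/lib/modules"):
    """Path where build files for the given kernel release should be."""
    if release is None:
        release = os.uname().release  # same string as `uname -r`
    return os.path.join(modules_root, release, "build")


def headers_present(release=None, modules_root="/lib/modules"):
    """True when the build directory exists (a dangling symlink gives False)."""
    return os.path.isdir(kernel_build_dir(release, modules_root))


if __name__ == "__main__":
    print(kernel_build_dir(), headers_present())
```

If this prints False after a kernel update, the headers on disk probably belong to a different kernel version than the one currently running.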

My suggestion would be to use the .run file installer method if you’re having trouble with the package manager method.

I was trying to share information. Since it seems that all I’m doing is upsetting you, I’ll stop now.

I run 3.13.6 but I haven’t seen that error yet. When do you run into it?

Looks like my thread got a bit hijacked into the wrong direction… I’m really thankful for your help here!