invalid device function

xav123 · November 11, 2016, 7:33am

Hello,

I set up my new computer with Ubuntu, and I get a strange error, "invalid device function", on code that works on my other computer.

I don’t think the problem comes from the installation, because the CUDA examples work.

Can someone help me solve this problem?

Robert_Crovella · November 11, 2016, 1:09pm

I think this has been pretty much answered in your cross-posting:

http://stackoverflow.com/questions/40536274/cuda-runtime-api-error-8-invalid-device-function

xav12358 · November 14, 2016, 9:13am

Actually, no. Here is my problem. I made a .cu file with a main function inside, and I compile it like this:

/usr//bin/nvcc -ccbin g++ -I../../common/inc  -m64    -gencode arch=compute_52,code=sm_52 -gencode arch=compute_52,code=compute_52 -o Convert.o -c Convert.cu

/usr//bin/nvcc -ccbin g++   -m64      -gencode arch=compute_52,code=sm_52 -gencode arch=compute_52,code=compute_52 -o Convert Convert.o

The program works properly, but I want to call it from a C++ function, so I made a Qt Creator project with the following .pro file:

TEMPLATE = app
CONFIG += console
CONFIG -= qt

SOURCES +=   src/main.cpp


CONFIG += link_pkgconfig
PKGCONFIG += opencv


INCLUDEPATH += /usr/local/include
INCLUDEPATH += /usr/local/include/opencv
LIBS += -L/usr/local/lib
LIBS += -L/usr/lib/x86_64-linux-gnu
LIBS += -L/usr/local/share/OpenCV/3rdparty/lib
LIBS += -lm
LIBS += -lopencv_core
LIBS += -lopencv_imgproc
LIBS += -lopencv_highgui
LIBS += -lopencv_objdetect
LIBS += -lopencv_calib3d
LIBS +=  -lGL -lGLU -lX11 -lglut -lGLEW



# CUDA settings <-- may change depending on your system
CUDA_SOURCES += ./cuda/Convert.cu


CUDA_SDK = /usr/lib/nvidia-cuda-toolkit             #/usr/include/   # Path to cuda SDK install
CUDA_DIR = /usr/lib/nvidia-cuda-toolkit             # Path to cuda toolkit install

# DO NOT EDIT BEYOND THIS UNLESS YOU KNOW WHAT YOU ARE DOING....

SYSTEM_NAME = unix         # Depending on your system either 'Win32', 'x64', or 'Win64'
SYSTEM_TYPE = 64           # '32' or '64', depending on your system
CUDA_ARCH = sm_52          # Type of CUDA architecture, for example 'compute_10', 'compute_11', 'sm_10'
NVCC_OPTIONS = #--use_fast_math


# include paths
INCLUDEPATH += $$CUDA_DIR/include
INCLUDEPATH += $$CUDA_DIR/

# library directories
QMAKE_LIBDIR += /usr/lib/x86_64-linux-gnu#/usr/lib/nvidia-cuda-toolkit/lib #/usr/lib/i386-linux-gnu #$CUDA_DIR/lib/

CUDA_OBJECTS_DIR = ./

# Add the necessary libraries
CUDA_LIBS = -lcuda -lcudart -lnppi -lnpps

# The following makes sure all path names (which often include spaces) are put between quotation marks
CUDA_INC = $$join(INCLUDEPATH,'" -I"','-I"','"')
LIBS += -L /usr/lib/x86_64-linux-gnu -lcuda -lcudart -lnppi -lnpps
NVCC_LIBS =  -lGL -lGLU -lX11 -lglut -lGLEW
    # Release mode
    cuda.input = CUDA_SOURCES
    cuda.output = $$CUDA_OBJECTS_DIR/${QMAKE_FILE_BASE}_cuda.o
    cuda.commands = $$CUDA_DIR/bin/nvcc   -dlink $$NVCC_OPTIONS $$CUDA_INC $$NVCC_LIBS    --machine $$SYSTEM_TYPE -gencode arch=compute_52,code=sm_52 -c -o ${QMAKE_FILE_OUT} ${QMAKE_FILE_NAME}
    cuda.dependency_type = TYPE_C
    QMAKE_EXTRA_COMPILERS += cuda


    cudaLINK.input = CUDA_SOURCES
    cudaLINK.output = $$CUDA_OBJECTS_DIR/${TARGET}_cuda.o
    cudaLINK.commands = $$CUDA_DIR/bin/nvcc -ccbin g++ -dlink  $$NVCC_OPTIONS $$CUDA_INC $$NVCC_LIBS --machine $$SYSTEM_TYPE -gencode arch=compute_52,code=sm_52   Convert_cuda.o -o ${TARGET}_cuda.o
    QMAKE_EXTRA_COMPILERS += cudaLINK


HEADERS += \
    cuda/Global_var.h \
    cuda/Convert.h

The program gives me:

/BGE/cuda/Convert.cu(60) : CUDA Runtime API error 8: invalid device function.

The problem is that the same program works properly on my other computer. I don’t understand why the program and build tools work fine on one computer and not on the other.

njuffa · November 14, 2016, 5:38pm

From what is shown above, you seem to build only for an sm_52/compute_52 platform. If the resulting code runs on one machine, but fails to run on a second one with the error message shown, it suggests that the second machine has a GPU with compute capability < 5.2. What GPUs are in your two systems?
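
One possible way to answer that question from a shell is sketched below. It assumes the installed driver's nvidia-smi supports the compute_cap query field, which recent drivers do but older ones may not; on older setups the deviceQuery program from the CUDA samples reports the same information. The helper name cc_to_sm is hypothetical and only used for illustration.

```shell
#!/bin/sh
# Hypothetical helper: turn a compute capability such as "5.2" into "sm_52".
cc_to_sm() {
    printf 'sm_%s\n' "$(printf '%s' "$1" | tr -d '.')"
}

# Assumption: this driver's nvidia-smi knows the compute_cap query field.
# If nvidia-smi is not installed, the query is simply skipped.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,compute_cap --format=csv,noheader |
    while IFS=, read -r name cap; do
        # Drop the space that follows the comma in the CSV output.
        cap=$(printf '%s' "$cap" | tr -d ' ')
        echo "$name -> $(cc_to_sm "$cap")"
    done
fi
```

Running this on both machines and comparing the reported sm_XX values shows whether the machine code selected with -gencode matches each GPU.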

xav12358 · November 16, 2016, 2:55pm

In fact, I just changed this:

-gencode arch=compute_52,code=sm_52

to this:

-gencode arch=compute_52,code=compute_52

And it works.

Thank you for the answer.

njuffa · November 16, 2016, 6:30pm

It seems to me that this is not the optimal solution. It seems that one of your devices is indeed an sm_52 device, the other is some other architecture. Therefore, the second device cannot execute the sm_52 machine code generated with ‘code=sm_52’. When you switch to ‘code=compute_52’ the generated PTX (which is not machine code) can be JIT compiled on the second device, and the program runs fine. However, JIT compilation creates overhead at application startup, and depending on the details, it could be significant overhead.

The better way to deal with the situation is to find out the architecture(s) of all the GPUs you intend to run on, then have the compiler build what is called a “fat” binary that includes machine code for all the architectures that you want to target. Building fat binaries is a best practice of CUDA programming.
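
As a rough sketch of what such a build line could look like, the helper below prints one -gencode option with machine code for each listed architecture, plus PTX for the highest one so that newer GPUs can still JIT-compile. The name gencode_flags is hypothetical (it is not part of the CUDA toolkit), and the architecture list 52 61 is only an assumption for illustration; substitute the compute capabilities of the GPUs you actually have, and check that your toolkit version supports them.

```shell
#!/bin/sh
# Hypothetical helper: print nvcc -gencode flags for a fat binary.
# Arguments: compute capabilities without the dot, in ascending order
# (for example: 52 61).
gencode_flags() {
    flags=""
    last=""
    for cc in "$@"; do
        # Machine code for each listed architecture.
        flags="$flags -gencode arch=compute_${cc},code=sm_${cc}"
        last="$cc"
    done
    # PTX for the last (highest) architecture, for JIT on newer GPUs.
    if [ -n "$last" ]; then
        flags="$flags -gencode arch=compute_${last},code=compute_${last}"
    fi
    # Strip the leading space before printing.
    printf '%s\n' "${flags# }"
}

# Example use (assumed architectures; Convert.cu is the file from this thread):
# nvcc -ccbin g++ -m64 $(gencode_flags 52 61) -o Convert.o -c Convert.cu
```

The same list of -gencode options would go into the cuda.commands and cudaLINK.commands lines of the .pro file in place of the single sm_52 entry.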
