invalid device function


I set up my new computer on Ubuntu and I get a strange error, “invalid device function”, from code that works on my other computer.

I don’t think the problem comes from the installation, because the CUDA examples work.

Can someone help me to solve this problem?

I think this has been pretty much answered in your cross-posting:

In fact it hasn’t; here is my problem. I made a .cu file with a main function inside, which I compile like this:

/usr//bin/nvcc -ccbin g++ -I../../common/inc  -m64    -gencode arch=compute_52,code=sm_52 -gencode arch=compute_52,code=compute_52 -o Convert.o -c

/usr//bin/nvcc -ccbin g++   -m64      -gencode arch=compute_52,code=sm_52 -gencode arch=compute_52,code=compute_52 -o Convert Convert.o

The program works properly, but I want to call it from C++ code, so I made a project in Qt Creator with the following .pro file:

CONFIG += console
CONFIG -= qt

SOURCES +=   src/main.cpp

CONFIG += link_pkgconfig
PKGCONFIG += opencv

INCLUDEPATH += /usr/local/include
INCLUDEPATH += /usr/local/include/opencv
LIBS += -L/usr/local/lib
LIBS += -L/usr/lib/x86_64-linux-gnu
LIBS += -L/usr/local/share/OpenCV/3rdparty/lib
LIBS += -lm
LIBS += -lopencv_core
LIBS += -lopencv_imgproc
LIBS += -lopencv_highgui
LIBS += -lopencv_objdetect
LIBS += -lopencv_calib3d
LIBS +=  -lGL -lGLU -lX11 -lglut -lGLEW

# CUDA settings <-- may change depending on your system
CUDA_SOURCES += ./cuda/

CUDA_SDK = /usr/lib/nvidia-cuda-toolkit             #/usr/include/   # Path to cuda SDK install
CUDA_DIR = /usr/lib/nvidia-cuda-toolkit             # Path to cuda toolkit install


SYSTEM_NAME = unix         # Depending on your system either 'Win32', 'x64', or 'Win64'
SYSTEM_TYPE = 64           # '32' or '64', depending on your system
CUDA_ARCH = sm_52          # Type of CUDA architecture, for example 'compute_10', 'compute_11', 'sm_10'
NVCC_OPTIONS = #--use_fast_math

# include paths

# library directories
QMAKE_LIBDIR += /usr/lib/x86_64-linux-gnu#/usr/lib/nvidia-cuda-toolkit/lib #/usr/lib/i386-linux-gnu #$CUDA_DIR/lib/


# Add the necessary libraries
CUDA_LIBS = -lcuda -lcudart -lnppi -lnpps

# The following makes sure all path names (which often include spaces) are put between quotation marks
CUDA_INC = $$join(INCLUDEPATH,'" -I"','-I"','"')
LIBS += -L /usr/lib/x86_64-linux-gnu -lcuda -lcudart -lnppi -lnpps
NVCC_LIBS =  -lGL -lGLU -lX11 -lglut -lGLEW
    # Release mode
    cuda.input = CUDA_SOURCES
    cuda.output = $$CUDA_OBJECTS_DIR/${QMAKE_FILE_BASE}_cuda.o
    cuda.commands = $$CUDA_DIR/bin/nvcc -dlink $$NVCC_OPTIONS $$CUDA_INC $$NVCC_LIBS --machine $$SYSTEM_TYPE -gencode arch=compute_52,code=sm_52 -c -o ${QMAKE_FILE_OUT} ${QMAKE_FILE_NAME}
    cuda.dependency_type = TYPE_C

    cudaLINK.input = CUDA_SOURCES
    cudaLINK.output = $$CUDA_OBJECTS_DIR/${TARGET}_cuda.o
    cudaLINK.commands = $$CUDA_DIR/bin/nvcc -ccbin g++ -dlink $$NVCC_OPTIONS $$CUDA_INC $$NVCC_LIBS --machine $$SYSTEM_TYPE -gencode arch=compute_52,code=sm_52 Convert_cuda.o -o ${TARGET}_cuda.o


    cuda/Global_var.h \
    cuda/Convert.h \

The program gives me:

/BGE/cuda/ : CUDA Runtime API error 8: invalid device function.

The problem is that the same program works properly on my other computer. I don’t understand why the same build works fine on one computer and not on the other.
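For reference, errors like this are easiest to localize by checking the status of every runtime call right after the kernel launch. Below is a minimal sketch of such a check; the macro name `CUDA_CHECK` and the dummy kernel are just illustrations, not part of the original project:

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Abort with file/line information if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err = (call);                                     \
        if (err != cudaSuccess) {                                     \
            fprintf(stderr, "%s:%d: CUDA Runtime API error %d: %s\n", \
                    __FILE__, __LINE__, (int)err,                     \
                    cudaGetErrorString(err));                         \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

__global__ void dummyKernel() {}

int main() {
    dummyKernel<<<1, 1>>>();
    // "invalid device function" typically surfaces here when the binary
    // contains no machine code (and no PTX) usable by the installed GPU.
    CUDA_CHECK(cudaGetLastError());
    CUDA_CHECK(cudaDeviceSynchronize());
    return 0;
}
```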

From what is shown above, you seem to build only for an sm_52/compute_52 platform. If the resulting code runs on one machine, but fails to run on a second one with the error message shown, it suggests that the second machine has a GPU with compute capability < 5.2. What GPUs are in your two systems?
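One way to answer that question is to query the runtime directly. A minimal sketch (compiles with plain nvcc, no extra flags) that prints the compute capability of every visible GPU:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Print the compute capability of every visible GPU, so you can see
// which -gencode targets your build actually needs.
int main() {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceCount failed\n");
        return 1;
    }
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        printf("Device %d: %s, compute capability %d.%d\n",
               i, prop.name, prop.major, prop.minor);
    }
    return 0;
}
```

A device reporting, say, 3.0 here cannot execute machine code built with `code=sm_52`.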

In fact I just changed this:

-gencode arch=compute_52,code=sm_52

to that:

-gencode arch=compute_52,code=compute_52

And it works.

Thank you for the answer.

It seems to me that this is not the optimal solution. It seems that one of your devices is indeed an sm_52 device, the other is some other architecture. Therefore, the second device cannot execute the sm_52 machine code generated with ‘code=sm_52’. When you switch to ‘code=compute_52’ the generated PTX (which is not machine code) can be JIT compiled on the second device, and the program runs fine. However, JIT compilation creates overhead at application startup, and depending on the details, it could be significant overhead.

The better way to deal with the situation is to find out the architecture(s) of all the GPUs you intend to run on, then have the compiler build what is called a “fat” binary that includes machine code for all the architectures that you want to target. Building fat binaries is a best practice of CUDA programming.
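As a sketch, assuming the second machine carries, say, an sm_30 (Kepler) GPU — the actual architecture would have to be confirmed first — a fat-binary build of the example above could look like this:

```shell
# Build a fat binary with machine code for both sm_30 and sm_52,
# plus compute_52 PTX as a JIT fallback for newer architectures.
# (sm_30 is only an example; substitute your real second GPU.)
nvcc -ccbin g++ -m64 \
     -gencode arch=compute_30,code=sm_30 \
     -gencode arch=compute_52,code=sm_52 \
     -gencode arch=compute_52,code=compute_52 \
     -o Convert Convert.cu
```

Each `-gencode` clause adds one entry to the fat binary; at run time the driver picks the best match for the installed GPU, falling back to JIT-compiling the embedded PTX only when no matching machine code is present.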