nvlink errors using dynamic parallelism with CUDA 9.1 on Tesla V100/Ubuntu 18.04

gamingacfb3 · December 16, 2019, 3:06pm

I’m diving into CUDA at the moment and am trying to use dynamic parallelism on my remote machine, which runs Ubuntu 18.04 LTS and has 4 Tesla V100 GPUs.

My code looks as follows (“slightly” modified):

#define LSIZE 5
#define RSIZE 5
#define LENGTH 5

// ...

__global__ void hammingDistance(const bool* left, const size_t size_l, const bool* right, const size_t size_r, int* out)
{
	if (size_l != size_r) {
		*out = -1;
		return;
	}

	for (int i = 0; i < size_l; ++i) {
		*out += left[i] ^ right[i];
	}
}

__global__ void executeMatching(bool** leftDescriptorSet, bool** rightDescriptorSet)
{
	for (size_t iLeft = 0; iLeft < LSIZE; ++iLeft) {
		bool* lDesc = leftDescriptorSet[iLeft];

		for (size_t iRight = 0; iRight < RSIZE; ++iRight) {
			bool* rDesc = rightDescriptorSet[iRight];
            
			int* sum = new int(0);
            
			hammingDistance<<<1, 1>>>(lDesc, LSIZE, rDesc, RSIZE, sum);

			// ...
		}
	}
}

// ...

int main() {
	// ...

	// example data
	bool *dev_aSetPtr, *dev_bSetPtr;
	cudaMallocManaged(&dev_aSetPtr, LSIZE * sizeof(bool));
	cudaMallocManaged(&dev_bSetPtr, RSIZE * sizeof(bool));

	// ...

	executeMatching<<<1, 1>>>(&dev_aSetPtr, &dev_bSetPtr);

	// ...
}

When compiling using

/usr/bin/nvcc /home/tibor/cuda_hm/hamming_matcher.cu -o /home/tibor/cuda_hm/hamming_matcher -gencode arch=compute_70,code=sm_70 -rdc=true

I keep getting an error:

nvlink error   : Undefined reference to 'cudaGetParameterBufferV2' in '/tmp/tmpxft_00006ca8_00000000-10_hamming_matcher.o'
nvlink error   : Undefined reference to 'cudaLaunchDeviceV2' in '/tmp/tmpxft_00006ca8_00000000-10_hamming_matcher.o'
The terminal process terminated with exit code: 255

Is there something wrong with my CUDA Toolkit installation?

Robert_Crovella · December 16, 2019, 3:51pm

Add

-lcudadevrt

to the end of your compile command line.

Any dynamic parallelism CUDA sample code/project will also have a makefile that shows what is needed.
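
For reference, here is a minimal sketch of a dynamic parallelism program built with the suggested flag. The file name `dp_min.cu`, the kernel names, and the helper `launchParentAndRead` are hypothetical. The build line in the comment is the command from the question with `-lcudadevrt` appended, and it assumes the toolkit install is consistent.

```cuda
// Assumed build line (the question's flags plus -lcudadevrt at the end):
//   nvcc dp_min.cu -o dp_min -gencode arch=compute_70,code=sm_70 -rdc=true -lcudadevrt
#include <cstdio>
#include <cuda_runtime.h>

// Child kernel: launched from the device, writes a marker value.
__global__ void childKernel(int* out)
{
    *out = 42;
}

// Parent kernel: the device-side launch below is what pulls in the
// device runtime symbols (cudaGetParameterBufferV2, cudaLaunchDeviceV2).
__global__ void parentKernel(int* out)
{
    childKernel<<<1, 1>>>(out);
}

// Hypothetical helper: runs the parent kernel and returns the value the
// child wrote, or -1 if any CUDA call reports an error.
int launchParentAndRead()
{
    int* out = nullptr;
    if (cudaMallocManaged(&out, sizeof(int)) != cudaSuccess) {
        return -1;
    }
    *out = 0;

    parentKernel<<<1, 1>>>(out);
    cudaError_t err = cudaGetLastError();   // launch-time errors
    if (err == cudaSuccess) {
        // The parent grid only completes after its child grid has finished.
        err = cudaDeviceSynchronize();      // execution-time errors
    }

    int result = -1;
    if (err == cudaSuccess) {
        result = *out;
    } else {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
    }
    cudaFree(out);
    return result;
}

int main()
{
    int value = launchParentAndRead();
    printf("child wrote %d\n", value);
    return value == 42 ? 0 : 1;
}
```

If this small program links but the original does not, the problem is more likely in the project setup than in the toolkit. If it fails with the same nvlink errors, the install itself is the more likely suspect.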

gamingacfb3 · December 17, 2019, 10:40am

That didn’t change anything unfortunately.

I tried the same code on another machine with Windows 10, VS 2019, and CUDA 10.2. Linking cudadevrt.lib into the project made it work like a charm. Unfortunately, I can’t get it set up in the Linux dev environment.

Robert_Crovella · December 17, 2019, 4:16pm

Possibly a mismatched or corrupted Linux install.

I shudder every time I see people using

/usr/bin/nvcc

as I prefer to use only

/usr/local/cuda/bin/nvcc
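
One way to narrow down a mismatched install is to print the toolkit version that a given `nvcc` actually builds against. The sketch below is an illustration rather than part of the original advice. The helper names `versionMajor` and `versionMinor` are hypothetical. `CUDART_VERSION`, `cudaRuntimeGetVersion`, and `cudaDriverGetVersion` come from the CUDA runtime API.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// CUDA encodes versions as 1000 * major + 10 * minor (e.g. 9010 for 9.1).
int versionMajor(int v) { return v / 1000; }
int versionMinor(int v) { return (v % 1000) / 10; }

int main()
{
    int runtimeVersion = 0;
    int driverVersion = 0;
    if (cudaRuntimeGetVersion(&runtimeVersion) != cudaSuccess ||
        cudaDriverGetVersion(&driverVersion) != cudaSuccess) {
        fprintf(stderr, "could not query CUDA versions\n");
        return 1;
    }

    // Version of the headers this file was compiled against.
    printf("headers: %d.%d\n",
           versionMajor(CUDART_VERSION), versionMinor(CUDART_VERSION));
    // Version of the runtime library the binary was linked with.
    printf("runtime: %d.%d\n",
           versionMajor(runtimeVersion), versionMinor(runtimeVersion));
    // Latest CUDA version the installed driver supports.
    printf("driver:  %d.%d\n",
           versionMajor(driverVersion), versionMinor(driverVersion));
    return 0;
}
```

Building this once with `/usr/bin/nvcc` and once with `/usr/local/cuda/bin/nvcc` and comparing the output would show whether the two paths point at different toolkits.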

gamingacfb3 · December 18, 2019, 10:17am

Well, regardless of the path used, the solution was to switch to the CUDA 10.0 compiler, which made it work.
