Calling a CUDA kernel with buffers multiple times causes "Invalid argument error ID 1"

sajal · April 6, 2021, 10:58pm

I am implementing an MPI master-worker program. The master process has a stream of tasks and assigns a task to a worker as soon as that worker is available. Each worker process runs on one node with six GPUs, so it launches kernels on these six GPUs with buffers as arguments. The first time a worker is assigned a task, it completes the task correctly. But when it is assigned a task for the second time, I get “Invalid argument error ID 1”. If I change the kernel to work only with scalar values (say, int num_of_values) instead of a buffer (say, int* data), there are no issues. Can someone suggest what the issue might be?

Before launching the kernel, I create the device buffer using cudaMalloc(), and I free it afterwards using cudaFree().

At a high level:

master_process(){
    while (there are more tasks)
        send_task_to(worker w)
}

worker_process(){
    receive_a_task(task)
    compute_on_gpu(task)
}

compute_on_gpu(task){
    int ngpus = 6;
    long data_size = 10000 * 10000 / 2;
    MyData *data = new MyData[ngpus];
    for (int i = 0; i < ngpus; i++) {
        cudaSetDevice(i);
        gpuErrorCheck( cudaMalloc( &comb_d[i], sizeof(MyData) * data_size ) );
        myKernel<<<num_blocks, num_of_threads>>>(comb_d);
    }
}
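For reference, here is a minimal, self-contained sketch of the per-GPU loop above with every runtime call and every launch checked, so that the exact call returning "invalid argument" can be located. Several parts are assumptions and not taken from the post: the body of MyData, the body and signature of myKernel, the launch configuration (1024 blocks of 256 threads), the definition of gpuErrorCheck, and the use of cudaGetDeviceCount() instead of a hard-coded six GPUs. The sketch also assumes comb_d is a host array holding one device pointer per GPU and that each launch receives comb_d[i]; the post does not show how comb_d is declared.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical stand-in for the post's MyData type.
struct MyData { int value; };

// Assumed definition of gpuErrorCheck; the post does not show its own.
#define gpuErrorCheck(call)                                              \
    do {                                                                 \
        cudaError_t err_ = (call);                                       \
        if (err_ != cudaSuccess) {                                       \
            std::fprintf(stderr, "%s:%d: %s (error %d)\n", __FILE__,     \
                         __LINE__, cudaGetErrorString(err_), (int)err_); \
            std::exit(EXIT_FAILURE);                                     \
        }                                                                \
    } while (0)

// Placeholder kernel: writes each element's index into the buffer.
__global__ void myKernel(MyData *data, long n) {
    long idx = blockIdx.x * (long)blockDim.x + threadIdx.x;
    long stride = (long)gridDim.x * blockDim.x;
    for (; idx < n; idx += stride) data[idx].value = (int)idx;
}

void compute_on_gpu() {
    int ngpus = 0;
    gpuErrorCheck(cudaGetDeviceCount(&ngpus));   // the post hard-codes 6
    const long data_size = 10000L * 10000L / 2;
    std::vector<MyData *> comb_d(ngpus, nullptr); // one device pointer per GPU

    for (int i = 0; i < ngpus; i++) {
        gpuErrorCheck(cudaSetDevice(i));
        gpuErrorCheck(cudaMalloc(&comb_d[i], sizeof(MyData) * data_size));
        myKernel<<<1024, 256>>>(comb_d[i], data_size); // pass this GPU's buffer
        gpuErrorCheck(cudaGetLastError());             // catches launch errors
    }
    for (int i = 0; i < ngpus; i++) {
        gpuErrorCheck(cudaSetDevice(i));          // free on the owning device
        gpuErrorCheck(cudaDeviceSynchronize());   // catches execution errors
        gpuErrorCheck(cudaFree(comb_d[i]));
    }
}

int main() {
    compute_on_gpu();   // first task
    compute_on_gpu();   // second task, where the post reports the failure
    std::puts("done");
    return 0;
}
```

Calling compute_on_gpu() twice imitates a worker receiving a second task. Checking cudaGetLastError() right after each launch and cudaDeviceSynchronize() before each cudaFree() separates launch-time errors from errors raised by an earlier call, which may help narrow down where the second round fails.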
