Device malloc causes error

alexey.mitrov · January 1, 2025, 6:56pm

I am using BigInteger math in my program. When I increase the number of elements to process, I run into the following issue:
========= Invalid global write of size 4 bytes
========= at CudaBI_FindProbablePrimes(unsigned char *, int, int, unsigned char *)+0x138 in /home/om/projects/cuda-rsa-generator/cuda/cuda_operations.cu:743
========= by thread (319,0,0) in block (24,0,0)
========= Address 0x200000000 is out of bounds
========= and is 113 246 208 bytes before the nearest allocation at 0x206c00000 of size 2 000 027 904 bytes

The code that causes it (the line result.value[0] = input;):
__device__ CudaBI CudaBI_FromInt(int sign, uint32_t input) {
    CudaBI result;
    result.sign = sign;
    result.length = 1;
    result.value = (uint32_t*) malloc(sizeof(uint32_t) * 1);
    result.value[0] = input;
    return result;
}

CudaBI_FromInt is called in a loop. The error only appears when I allocate 10000 or more elements.

Here is how my structure is defined:
struct CudaBI
{
    int sign;
    int length;
    uint32_t* value;
};

I do call free(number.value);
I think that malloc returns an invalid address for some of the iterations.
What else could it be?

Curefab · January 1, 2025, 7:02pm

Can you confirm by testing the result of malloc against 0x200000000?

You can also try cudaMalloc on the device, which returns an error code.
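
A minimal sketch of that pointer check, assuming the CudaBI struct from the question. The names CudaBI_FromIntChecked, count_failed_allocs and run_alloc_check are hypothetical helpers made up for illustration, not part of the original program. The idea is to test the pointer returned by device malloc before writing through it, and to count how many allocations fail.

```cuda
#include <cstdint>
#include <cstdlib>
#include <cuda_runtime.h>

struct CudaBI {
    int sign;
    int length;
    uint32_t* value;
};

// Hypothetical variant of CudaBI_FromInt: returns false instead of writing
// through the pointer when device malloc fails (it returns NULL on failure).
__device__ bool CudaBI_FromIntChecked(int sign, uint32_t input, CudaBI* out) {
    out->sign = sign;
    out->length = 0;
    out->value = static_cast<uint32_t*>(malloc(sizeof(uint32_t)));
    if (out->value == nullptr) {
        return false;  // device heap exhausted or too fragmented
    }
    out->value[0] = input;
    out->length = 1;
    return true;
}

// Hypothetical diagnostic kernel: every thread allocates and frees
// `iterations` numbers and counts the allocations that failed.
__global__ void count_failed_allocs(int iterations, unsigned int* failures) {
    for (int i = 0; i < iterations; ++i) {
        CudaBI n;
        if (CudaBI_FromIntChecked(1, static_cast<uint32_t>(i), &n)) {
            free(n.value);
        } else {
            atomicAdd(failures, 1u);
        }
    }
}

// Host helper: returns the number of failed allocations, or -1 on a CUDA error.
long long run_alloc_check(int blocks, int threads, int iterations) {
    unsigned int* d_failures = nullptr;
    if (cudaMalloc(&d_failures, sizeof(unsigned int)) != cudaSuccess) {
        return -1;
    }
    cudaMemset(d_failures, 0, sizeof(unsigned int));
    count_failed_allocs<<<blocks, threads>>>(iterations, d_failures);
    unsigned int h_failures = 0;
    // cudaMemcpy waits for the kernel and also reports any launch failure.
    cudaError_t err = cudaMemcpy(&h_failures, d_failures, sizeof(unsigned int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_failures);
    return err == cudaSuccess ? static_cast<long long>(h_failures) : -1;
}
```

Calling run_alloc_check from host code with the same grid size and loop count as the failing kernel should show whether allocations start failing at that scale. A nonzero count points at the device heap rather than at the BigInteger math.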

alexey.mitrov · January 1, 2025, 7:23pm

I tried cudaMalloc + cudaFree, so my code now looks like this:
__device__ CudaBI CudaBI_FromInt(int sign, uint32_t input) {
    CudaBI result;
    result.sign = sign;
    result.length = 1;
    cudaMalloc((void**)&result.value, sizeof(uint32_t) * 1);
    result.value[0] = input;
    return result;
}

cudaFree(number.value);

And I got the following error:
========= Program hit cudaErrorLaunchFailure (error 719) due to “unspecified launch failure” on CUDA API call to cudaMemcpy.
========= Saved host backtrace up to driver entry point at error
========= Host Frame: [0x4466f5]
========= in /lib/x86_64-linux-gnu/libcuda.so.1
========= Host Frame:cudaMemcpy [0x701d9]
========= in /home/om/projects/cuda-rsa-generator/cuda/libcudarsagen
========= Host Frame:main [0xbc42]
========= in /home/om/projects/cuda-rsa-generator/cuda/libcudarsagen
========= Host Frame:__libc_start_call_main in ../sysdeps/nptl/libc_start_call_main.h:58 [0x29d8f]
========= in /lib/x86_64-linux-gnu/libc.so.6
========= Host Frame:__libc_start_main in ../csu/libc-start.c:392 [0x29e3f]
========= in /lib/x86_64-linux-gnu/libc.so.6
========= Host Frame:_start [0xab24]
========= in /home/om/projects/cuda-rsa-generator/cuda/libcudarsagen

Robert_Crovella · January 2, 2025, 3:49pm

That is certainly possible. Returning an invalid pointer is the only way that device malloc has to signal an error, such as out of memory.

There are numerous web questions around the use of device malloc (example).

As a diagnostic, check that the returned pointer value is not zero (NULL pointer). That is how device malloc signals an error.
The device malloc allocates from a limited space (the “device heap”), which by default is not the same size as your device memory. By default it is 8MB.
You can adjust the available space with a runtime API call in host code: cudaDeviceSetLimit(cudaLimitMallocHeapSize, size);
Repeated malloc/free on the device heap can result in fragmentation.
There is an allocation granularity. Repeated small allocations may use substantially more memory than you would predict.
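
The heap-size adjustment can be sketched as follows. The helper name set_device_heap_bytes and the 64MB figure in the usage note are assumptions for illustration. The right size depends on how many allocations are live at once and on the allocation granularity, so it is worth sizing generously and measuring.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical helper: requests a device malloc heap of `requested` bytes and
// returns the size the runtime reports afterwards, or 0 on a CUDA error.
// Call it in host code before launching any kernel that uses device
// malloc/free; the limit is meant to be set before such kernels run.
size_t set_device_heap_bytes(size_t requested) {
    if (cudaDeviceSetLimit(cudaLimitMallocHeapSize, requested) != cudaSuccess) {
        return 0;
    }
    size_t actual = 0;
    if (cudaDeviceGetLimit(&actual, cudaLimitMallocHeapSize) != cudaSuccess) {
        return 0;
    }
    return actual;  // query the value back rather than assuming it
}
```

For example, calling set_device_heap_bytes(64 * 1024 * 1024) at the start of main, before the first kernel launch, asks for a 64MB heap instead of the 8MB default.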
