Hi folks,
I'm working through the samples in the CUDA programming guide and wrote a simple kernel to verify that the function in the title, __nv_aligned_device_malloc, returns aligned addresses. However, I can't get the result I expected.
__global__ void kernel_aligned_malloc(uint8_t *ptr, size_t size, size_t align, int loop) {
    uint8_t *dptr;
    // True when p is a multiple of align (the mask test assumes align is a power of two).
    auto is_aligned = [&](uint8_t *p) { return ((uint64_t)(p) & (align - 1)) == 0; };
    for (int l = 1; l <= loop; l++) {
        dptr = (uint8_t *)__nv_aligned_device_malloc(size, align);
        memset(dptr, (l & 0xff), size);
        if (!is_aligned(dptr)) {
            // Report the low byte of the misaligned address and stop.
            ptr[0] = uint8_t((uint64_t)(dptr) & 0xff);
            free(dptr);
            return;
        }
        memcpy(ptr, dptr, size);
        free(dptr);
    }
}
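For reference, the mask test used by the lambda in the kernel can be sketched on the host as below. This is only an illustration, not part of the original test program: the helper names is_power_of_two and is_aligned_address are hypothetical, and the code is plain C++ (it would need __host__ __device__ qualifiers to be called from a kernel). The point it shows is that the (addr & (align - 1)) == 0 check is only meaningful when align is a power of two, which is also what __nv_aligned_device_malloc requires of its align argument.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true when x is a nonzero power of two.
inline bool is_power_of_two(std::size_t x) {
    return x != 0 && (x & (x - 1)) == 0;
}

// Hypothetical helper: the same mask test as the kernel's lambda, applied to a raw address.
// The mask trick is only valid for power-of-two alignments, so that is asserted first.
inline bool is_aligned_address(std::uint64_t addr, std::size_t align) {
    assert(is_power_of_two(align));
    return (addr & (static_cast<std::uint64_t>(align) - 1)) == 0;
}
```

As a usage example, is_aligned_address(0x7ff8b2dff920, 32) is true while is_aligned_address(0x7ff8b2dff920, 64) is false, because the low byte 0x20 is a multiple of 32 but not of 64.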
Then I ran the kernel under cuda-memcheck, which reported the errors below. Did I do something wrong?
========= Malloc/Free error encountered : Double free
========= at 0x00000d40 in __cuda_syscall_mc_dyn_globallock_free
========= by thread (0,0,0) in block (0,0,0)
========= Address 0x7ff8b2dff920
========= Program hit cudaErrorLaunchFailure (error 719) due to "unspecified launch failure" on CUDA API call to cudaMemcpy.
========= Saved host backtrace up to driver entry point at error
========= Host Frame:/usr/lib/x86_64-linux-gnu/libcuda.so.1 [0x34fb13]
========= Host Frame:./build/test_heap_memory_aligned_malloc [0x9a845]
========= Host Frame:./build/test_heap_memory_aligned_malloc [0xeb45]
========= Host Frame:./build/test_heap_memory_aligned_malloc [0xdf07]
========= Host Frame:./build/test_heap_memory_aligned_malloc [0xe0a0]
========= Host Frame:./build/test_heap_memory_aligned_malloc [0x3506c]
========= Host Frame:./build/test_heap_memory_aligned_malloc [0x2fe21]
========= Host Frame:./build/test_heap_memory_aligned_malloc [0x14fd6]