I am trying to copy a struct to global memory but I’m not sure if I’m doing it correctly. When I used only a pointer to send the struct to the GPU, everything worked fine. Any comments would be welcome. I am posting the code and the size of the struct calculated at runtime.
My goal is to achieve faster traversal time for the octree.
The issue arises when I include this line in the code:
Adding it makes the build fail at link time with the following error:
CMakeFiles/prismatic.dir/prismatic.cu.o: in function "__nv_cudaEntityRegisterCallback(void**)":
tmpxft_0029cce1_00000000-7_prismatic.cudafe1.cpp:(.text+0x649): relocation truncated to fit: R_X86_64_PC32 against ".bss"
What you posted looks like a snippet of your code, not the actual code. Without a complete, self-contained reproducer that others can compile, I can only go by the error message, which suggests that your problem is in host code: there is a huge statically allocated data object (e.g. double my_huge_array [100000000];), so large that 32-bit offsets into the BSS segment are not sufficient to address all of it.
If this diagnosis matches what is happening in your code: Don’t do that. Allocate large data objects on the heap, via malloc().
Allocating your large data object dynamically with malloc() should not have a negative impact on the traversal time of the octree (unless you are doing something special that you have not told us about yet).