I encountered an issue using the create_1d_tensor_map function in CUDA while increasing my tensor dimensions. When I changed the size from 1024*1024*1024 to 2048*1024*1024, the following error occurred:
bin: ./headers/host/tma_tensor_map.cuh:51: CUtensorMap create_1d_tensor_map(uint64_t, uint32_t, void*): Assertion `res == CUDA_SUCCESS && "tensormap creation failed."' failed.
Aborted (core dumped)
I’m wondering if create_1d_tensor_map has any undocumented limitations on input size. This seems unusual, since the function worked fine with smaller tensor sizes, and I could not find any documentation of its input constraints.
Could you please clarify whether there are size restrictions for create_1d_tensor_map, or whether another issue might be at play here? Any guidance or documentation links would be greatly appreciated!
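For reference, here is a minimal sketch of the size arithmetic involved, in case it helps narrow things down. It rests on assumptions I have not confirmed: that the uint64_t parameter is the element count, that the wrapper forwards it as a globalDim entry to cuTensorMapEncodeTiled (whose driver API documentation says each globalDim entry must be non-zero and at most 2^32), and that a size expression written with plain int literals might wrap before reaching the function. The helper names are hypothetical and are not part of tma_tensor_map.cuh.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helpers for checking sizes; not part of tma_tensor_map.cuh.

// Limit stated in the cuTensorMapEncodeTiled documentation:
// each globalDim entry must be non-zero and at most 2^32.
constexpr std::uint64_t kMaxGlobalDim = std::uint64_t{1} << 32;

// Element count computed in 64-bit arithmetic, so the product does not wrap.
constexpr std::uint64_t element_count(std::uint64_t a, std::uint64_t b,
                                      std::uint64_t c) {
    return a * b * c;
}

// True if n can be represented in a signed 32-bit int.
constexpr bool fits_in_int32(std::uint64_t n) {
    return n <= static_cast<std::uint64_t>(
                    std::numeric_limits<std::int32_t>::max());
}

// True if n satisfies the documented globalDim range.
constexpr bool within_global_dim_limit(std::uint64_t n) {
    return n != 0 && n <= kMaxGlobalDim;
}

constexpr std::uint64_t kOldSize = element_count(1024, 1024, 1024);  // 2^30
constexpr std::uint64_t kNewSize = element_count(2048, 1024, 1024);  // 2^31

// Illustration only: the value a wrapped 32-bit product would take after
// conversion to uint64_t on a typical two's-complement target. Signed
// overflow is undefined behavior in C++, so real code may behave differently.
constexpr std::uint64_t kWrappedSize = static_cast<std::uint64_t>(
    static_cast<std::int64_t>(std::numeric_limits<std::int32_t>::min()));

static_assert(fits_in_int32(kOldSize), "2^30 fits in int32");
static_assert(!fits_in_int32(kNewSize), "2^31 does not fit in int32");
static_assert(within_global_dim_limit(kNewSize), "2^31 is within 2^32");
static_assert(!within_global_dim_limit(kWrappedSize),
              "a wrapped value would exceed 2^32");
```

If these assumptions hold, the new size of 2^31 elements is itself within the documented 2^32 limit, but it no longer fits in a 32-bit int, so a size expression evaluated in int arithmetic could hand the function a very different value. I have not verified that this is what happens in my build.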