Does create_1d_tensor_map in CUDA have input size limitations?

I encountered an issue using the create_1d_tensor_map function in CUDA while increasing my tensor dimensions. When I modified the size from 1024*1024*1024 to 2048*1024*1024, the following error occurred:

bin: ./headers/host/tma_tensor_map.cuh:51: CUtensorMap create_1d_tensor_map(uint64_t, uint32_t, void*): Assertion `res == CUDA_SUCCESS && "tensormap creation failed."' failed.
Aborted (core dumped)

I’m wondering if create_1d_tensor_map has any undocumented limitations regarding input size. It seems unusual, as the function worked fine with smaller tensor sizes. However, I could not locate specific documentation on this function’s input constraints.

Could you please clarify if there are size restrictions for create_1d_tensor_map, or if there might be another issue at play here? Any guidance or documentation links would be greatly appreciated!

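2048*1024*1024 will overflow. All three literals are int, so the product is computed in 32-bit int arithmetic, and the result, 2^31 = 2147483648, is one more than INT_MAX (2147483647).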
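To make the arithmetic concrete, here is a minimal host-side sketch. It is not the code from the question: the helper names (element_count, fits_in_int, wrapped_then_widened) are hypothetical, and it assumes a platform where int is 32 bits. It shows that 2^31 no longer fits in an int, and it models the value a uint64_t parameter would typically receive if the product had wrapped in int first (signed overflow is formally undefined behavior, so the wrapped value is not guaranteed).

```cpp
#include <climits>
#include <cstdint>

// Product formed in 64-bit arithmetic: the ULL suffix on the first literal
// promotes every later multiplication to unsigned long long.
constexpr std::uint64_t element_count() {
    return 2048ULL * 1024 * 1024;  // 2^31
}

// True if the count still fits in a signed int (assumed to be 32 bits here).
constexpr bool fits_in_int(std::uint64_t n) {
    return n <= static_cast<std::uint64_t>(INT_MAX);
}

// Hypothetical model of the common failure mode: keep only the low 32 bits,
// reinterpret them as a signed 32-bit value, then widen to uint64_t the way
// an int argument is converted when passed to a uint64_t parameter.
inline std::uint64_t wrapped_then_widened(std::uint64_t n) {
    const auto low = static_cast<std::int32_t>(static_cast<std::uint32_t>(n));
    return static_cast<std::uint64_t>(low);  // sign-extends a negative value
}

// Compile-time checks of the two sizes mentioned in the question.
static_assert(fits_in_int(1024ULL * 1024 * 1024), "2^30 fits in int");
static_assert(!fits_in_int(element_count()), "2^31 exceeds INT_MAX");
```

With the size written as in element_count(), the function receives 2147483648. With the wrapped value it would typically receive a number close to 2^64, which could plausibly explain why the tensor map creation is rejected.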


@striker159
You mean the number overflows, not the call, so it should be written as 2048LL*1024LL*1024LL to get long long int literals?

Also, is create_1d_tensor_map a CUDA API call? Or would it internally call a function from the CUDA Driver API (CUDA Driver API :: CUDA Toolkit Documentation)? And does it use the uint64_t parameter for the dimension?
