Hi, it seems that cuBLAS uses signed int32 for indices, so it's impossible to create tensors with more than 2^31-1 elements.
With float16 (2 bytes per element), the biggest possible tensor occupies just under 4 GiB of GPU memory.
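
To make the arithmetic concrete, here is a small sketch that computes the largest tensor size under that limit. It assumes the 2^31-1 element cap described above applies; the helper name `max_tensor_bytes` is made up for illustration and is not part of any library.

```python
# Largest value a signed 32-bit index can hold.
INT32_MAX = 2**31 - 1


def max_tensor_bytes(bytes_per_element, max_elements=INT32_MAX):
    """Size in bytes of a tensor with `max_elements` elements (hypothetical helper)."""
    return bytes_per_element * max_elements


# Element sizes for a few common floating-point dtypes.
for name, size in (("float16", 2), ("float32", 4), ("float64", 8)):
    total = max_tensor_bytes(size)
    print(f"{name}: {total} bytes ({total / 2**30:.2f} GiB)")
```

For float16 this gives 4294967294 bytes, which is 2 bytes short of 4 GiB, so a card with more memory than that cannot be filled by a single float16 tensor if the limit holds.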