Hello, I'm trying to run some code that uses GPU VRAM on the device, and I get the error:
Cuda error: Kernel execution failed in file ‘CUDAprog.cu’ in line 54 : out of memory.
The cubin lists the following requirements for the function:
code {
name = muFync
lmem = 163840
smem = 28
reg = 28
bar = 0
bincode {…}
}
Why would I run out of memory? I’m using fewer than 256 threads, so it’s not the register usage either.