I have some data like
Since each thread in a block will access[y] and then [y] and so on, I decided to use a texture to take advantage of the texture cache.
So I wrote something like:
->kernel function that fills d_COSMAG
cudaMallocArray(&da_COSMAG, &t_COSMAG.channelDesc, NM, NM);
cudaMemcpy2DToArray(da_COSMAG, 0, 0, d_COSMAG, NM*NM*NVEC*sizeof(int2), NM*NVEC*sizeof(int2), NM*sizeof(int2)
status = cudaGetLastError();
and this returns:
invalid pitch argument
I saw in another post that the pitch is limited to 2^16 * 4 bytes, which would mean I can only create a texture of 262144 bytes. Is this right?
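To make the arithmetic behind that figure explicit, here is a minimal sketch that takes the quoted limit of 2^16 * 4 bytes at face value. The limit is an assumption carried over from the other post, not a value queried from any device, and the names Int2, kQuotedMaxPitchBytes, maxElementsPerRow, and rowFits are hypothetical helpers written only for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for CUDA's int2 so the sketch compiles without the CUDA headers.
struct Int2 { int x; int y; };

// Limit quoted from the other post (assumption, not a queried device property).
constexpr std::size_t kQuotedMaxPitchBytes = (std::size_t{1} << 16) * 4;

// Number of whole elements of size elemSize that fit in one row
// of at most maxPitchBytes bytes.
constexpr std::size_t maxElementsPerRow(std::size_t maxPitchBytes,
                                        std::size_t elemSize) {
    return maxPitchBytes / elemSize;
}

// True when a row of elementsPerRow elements stays within the byte limit.
inline bool rowFits(std::size_t elementsPerRow, std::size_t elemSize,
                    std::size_t maxPitchBytes) {
    assert(elemSize > 0);
    return elementsPerRow * elemSize <= maxPitchBytes;
}
```

With 8-byte elements, this limit would correspond to 32768 elements per row. Note that a pitch is a per-row quantity, so under this assumption the figure bounds the row length rather than the whole texture. On a real device the limits would more reliably come from cudaGetDeviceProperties (for instance the maxTexture2D and memPitch fields of cudaDeviceProp) than from a hard-coded constant.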
If I split my texture into several sub-textures, would that solve the problem? If so, how can I split it dynamically? In general, I would like to be able to change the size of this table depending on input parameters.
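One way the splitting could be made dynamic is to compute the tile layout at run time from the table size and a per-tile limit. The sketch below is host-side C++ only and assumes the table is split along its width. Tile, splitIntoTiles, locate, and maxTileWidth are hypothetical names, and whether this approach actually avoids the pitch error has not been verified.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One sub-texture: columns [firstCol, firstCol + width) of the full table.
struct Tile {
    std::size_t firstCol;
    std::size_t width;
};

// Split totalWidth columns into tiles no wider than maxTileWidth.
// The last tile may be narrower than the others.
std::vector<Tile> splitIntoTiles(std::size_t totalWidth,
                                 std::size_t maxTileWidth) {
    assert(maxTileWidth > 0);
    std::vector<Tile> tiles;
    for (std::size_t col = 0; col < totalWidth; col += maxTileWidth) {
        std::size_t remaining = totalWidth - col;
        tiles.push_back({col, remaining < maxTileWidth ? remaining : maxTileWidth});
    }
    return tiles;
}

// Map a global column index to (tile index, column inside that tile).
inline void locate(std::size_t col, std::size_t maxTileWidth,
                   std::size_t& tileIndex, std::size_t& localCol) {
    assert(maxTileWidth > 0);
    tileIndex = col / maxTileWidth;
    localCol = col % maxTileWidth;
}
```

Each Tile could then back its own array allocation and copy, with maxTileWidth chosen from the input parameters or from the device limits at start-up.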