2D texture problem

I have some data like

double data[64][64][64];

Since each thread in a block will access [0][y], then [1][y], and so on, I decided to use a texture to take advantage of the texture cache.

So I wrote something like:

[codebox]cudaArray *da_COSMAG;
texture<int2,2> t_COSMAG;

NM=64; NVEC=64;

// -> kernel function that fills d_COSMAG

cudaMallocArray(&da_COSMAG, &t_COSMAG.channelDesc, NM, NM);
cudaBindTextureToArray(t_COSMAG, da_COSMAG);
cudaMemcpy2DToArray(da_COSMAG, 0, 0, d_COSMAG,
                    NM*NM*NVEC*sizeof(int2), NM*NVEC*sizeof(int2), NM*sizeof(int2),
                    cudaMemcpyDeviceToDevice);
status = cudaGetLastError();
[/codebox]

and this returns:

invalid pitch argument

I saw in another post that the pitch is limited to 2^16 * 4 bytes, which means I can only create a texture of 262144 bytes. Is this right?
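To put numbers on this, here is a small host-side sketch (plain C++, no CUDA headers needed) that compares the pitch I pass to cudaMemcpy2DToArray, NM*NM*NVEC*sizeof(int2), with the limit quoted in that post. The 2^16 * 4 figure is taken from the other post and I have not verified it; Int2 and the function names are made up just for this check.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for CUDA's int2 (two 32-bit ints, 8 bytes),
// so the check compiles without the CUDA toolkit.
struct Int2 { int x, y; };

// Pitch limit as quoted in the other post (unverified assumption): 2^16 * 4 bytes.
constexpr std::size_t quotedPitchLimitBytes() {
    return (static_cast<std::size_t>(1) << 16) * 4;
}

// The value passed as the source pitch in my cudaMemcpy2DToArray call.
constexpr std::size_t pitchPassedBytes(std::size_t nm, std::size_t nvec) {
    return nm * nm * nvec * sizeof(Int2);
}

// Size in bytes of a single row of nm int2 elements, for comparison.
constexpr std::size_t rowBytes(std::size_t nm) {
    return nm * sizeof(Int2);
}

// True if a pitch in bytes is larger than the quoted limit.
constexpr bool exceedsQuotedLimit(std::size_t pitchBytes) {
    return pitchBytes > quotedPitchLimitBytes();
}
```

For NM = NVEC = 64 the pitch argument comes out at 2097152 bytes, eight times the quoted 262144, while a single row of 64 int2 elements is only 512 bytes.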

If I split my texture into several subtextures, would that solve the problem? If so, how can I do the split dynamically? In general, I would like to be able to change the size of this table depending on input parameters.

Thanks