Problems creating long4 arrays

Hi all. I am trying to allocate a cudaArray of long4, but it keeps telling me that my channel description is invalid. I have searched the documentation for hints on whether a higher compute capability is required, but nothing comes up. I am using a GTX 580 on 64-bit Linux with driver version 304.51 under CUDA 4.2.9. A minimal code snippet that fails for me is as follows:

#include <cuda_runtime.h>
#include <cstdio>

int main (int, char**) {
    const cudaChannelFormatDesc sumsFormat = cudaCreateChannelDesc<long4>();
    cudaArray *array;
    const cudaError err = cudaMalloc3DArray (&array, &sumsFormat, make_cudaExtent(352, 288, 1));
    fprintf(stderr, "Result : %s.\n", cudaGetErrorString(err) );
    return err;
}

Any help will be appreciated. Thanks!

To create an array of long4 data, simply use

long4 foo[NUMBER_OF_ELEMENTS];

In my experience cudaChannelFormatDesc is used in conjunction with textures. Textures are limited to the data types supported by texturing hardware, and I am fairly certain no types larger than 32 bits are supported for an individual channel, with up to four channels. So you could create an int2 texture, but not a long2 texture. I would suggest looking up relevant sections of the Programming Guide and the CUDA API documentation. Most CUDA documents are conveniently accessible online, for example cudaChannelFormatDesc is described here:

http://docs.nvidia.com/cuda/cuda-runtime-api/index.html#structcudaChannelFormatDesc
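
If the 32-bits-per-channel limit does hold (I am going from memory here), one common way around it is to store each 64-bit value as two 32-bit channels, for example in an int2 element, and put the halves back together after the fetch. Below is a minimal host-side sketch of just the packing step. The names Int2, split64 and join64 are made up for illustration; Int2 is a stand-in for CUDA's int2 so the snippet compiles without the toolkit.

```cpp
#include <cstdint>

// Hypothetical stand-in for CUDA's int2 (two 32-bit channels).
struct Int2 {
    std::int32_t x;  // low 32 bits
    std::int32_t y;  // high 32 bits
};

// Split one 64-bit value into two 32-bit channels.
inline Int2 split64(std::int64_t v) {
    const std::uint64_t u = static_cast<std::uint64_t>(v);
    Int2 r;
    r.x = static_cast<std::int32_t>(static_cast<std::uint32_t>(u & 0xFFFFFFFFu));
    r.y = static_cast<std::int32_t>(static_cast<std::uint32_t>(u >> 32));
    return r;
}

// Reassemble the 64-bit value from its two 32-bit channels.
inline std::int64_t join64(Int2 p) {
    // Go through uint32_t first so the low word is zero-extended, not sign-extended.
    const std::uint64_t lo = static_cast<std::uint32_t>(p.x);
    const std::uint64_t hi = static_cast<std::uint32_t>(p.y);
    return static_cast<std::int64_t>((hi << 32) | lo);
}
```

With this layout, what would have been one long4 element becomes four int2 elements (or two int4 elements), at the cost of extra fetches and the split/join work on either side.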

Yes, actually the cudaArray I have declared is intended to be accessed as a surface, but since the failure is reproducible even if I do not include the “cudaArraySurfaceLoadStore” flag, I have left it out to provide a minimal example. Maybe my original text was not very clear since I said “array” instead of “cudaArray”.

Where is it documented that textures with more than 32 bits per channel are not supported? I have searched the documentation high and low to no avail. Also, since cudaChannelFormatDesc is indeed only useful for textures/surfaces, why is the cudaCreateChannelDesc() template also implemented for T=long4? I have taken a look at the .h file and it is explicitly specialized for that type.

Good point on the templates provided in the header file. Unfortunately, this exceeds the extent of my knowledge. I have used textures, but not surfaces. Various bits of information led me to believe that textures are limited to up to four components of at most 32 bits each. I can’t quote chapter and verse for this so my recollection may simply be wrong.

Have you tried getting your small example to work with another type such as int4? If that also doesn’t work, there is probably an API usage issue somewhere. I have had trouble in the past with feeding some of the texture functions correctly, getting a lot of “invalid argument” errors along the way, until I finally understood how the API expects to be used. Looking at some of the relevant sample code provided with CUDA may help clarify the usage.

Yes, int4 works beautifully; I have successfully used surfaces based on int4 types.

Actually, the texture/surface functions for long4 are also explicitly defined in texture_fetch_functions.h and surface_functions.h. In both cases, they are inside a preprocessor block #if !defined(__LP64__). However, I get no errors or warnings when compiling the snippet above despite the fact that I am building a 64-bit application. Are these datatypes only meant to be used with 32-bit code? Weird.
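
For reference, here is a small host-side sketch for checking what that preprocessor guard sees on a given build. It assumes the macro the headers test is __LP64__, which GCC and Clang predefine on 64-bit Linux, where long is 64 bits wide. The function names are hypothetical, and the 32-bit channel limit is only the assumption discussed in this thread, not something I have found documented.

```cpp
#include <climits>
#include <cstdio>

// True when the compiler predefines __LP64__ (long and pointers are 64-bit).
constexpr bool builtForLP64() {
#if defined(__LP64__)
    return true;
#else
    return false;
#endif
}

// Width of the host 'long' type in bits.
constexpr int longBits() {
    return static_cast<int>(sizeof(long) * CHAR_BIT);
}

// Assumption from this thread, not a documented limit:
// a single texture/surface channel holds at most 32 bits.
constexpr bool fitsIn32BitChannel(int bits) {
    return bits <= 32;
}

// Hypothetical helper: print a one-line summary for the current build.
inline void reportLongWidth() {
    std::printf("__LP64__ defined: %s, long is %d bits, fits a 32-bit channel: %s\n",
                builtForLP64() ? "yes" : "no",
                longBits(),
                fitsIn32BitChannel(longBits()) ? "yes" : "no");
}
```

If the guard is there for this reason, a 32-bit build would see long4 as four 32-bit channels (the same layout as int4), while an LP64 build would need four 64-bit channels, which might explain why the long variants are compiled out.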

These #ifdefs are peculiar. I do not have an explanation for these observations, and wasn’t able to track down what is / should be supported in regard to channel descriptors. A bug of some sort seems likely.

It would be helpful if you could file a bug report via the registered developer website, attaching self-contained repro code (which it seems you already have in hand). Thank you for your help.

Thanks, will do.