I am trying to create a compressed texture in CUDA, but am struggling to figure out the correct way of allocating the GPU array. Below is a complete example which results in a CUDA error 1: invalid argument on the cudaMallocArray call.
This seems to be caused by cudaCreateChannelDesc<cudaChannelFormatKindUnsignedBlockCompressed3>(). If I replace it with cudaCreateChannelDesc<uint4>(), for example, the call succeeds, but is that the right way to allocate the array for a compressed texture?
When I compile and run your code on a machine equipped with CUDA 11.5, it runs without error for me.
(Aside: your error checking macro looks a little odd to me. But I suppose it may be harmless. The use of x in the output line has the side effect of re-running the command.)
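To illustrate the aside about the macro: the original macro is not reproduced here, so the following is only a sketch of the general pattern, assuming a macro whose argument x appears both in the test and in the output line. The names fake_api_call, CHECK_TWICE and CHECK_ONCE are hypothetical stand-ins, and fake_api_call replaces a real CUDA runtime call so the effect can be observed on the host alone.

```cpp
#include <cstdio>

// Hypothetical stand-in for a CUDA runtime call: counts how often it runs
// and always returns a non-zero "error" so the reporting branch is taken.
static int g_calls = 0;
inline int fake_api_call() { ++g_calls; return 1; }

// Pattern with the side effect: x is expanded twice, so on failure the
// call is executed a second time just to print its result.
#define CHECK_TWICE(x)                                       \
    do {                                                     \
        if ((x) != 0)                                        \
            std::fprintf(stderr, "error %d\n", (x));         \
    } while (0)

// Safer pattern: evaluate x exactly once and reuse the stored result.
#define CHECK_ONCE(x)                                        \
    do {                                                     \
        const int err_ = (x);                                \
        if (err_ != 0)                                       \
            std::fprintf(stderr, "error %d at %s:%d\n",      \
                         err_, __FILE__, __LINE__);          \
    } while (0)

// Helpers that report how many times the call ran under each macro.
inline int calls_with_check_twice() {
    g_calls = 0;
    CHECK_TWICE(fake_api_call());
    return g_calls;
}

inline int calls_with_check_once() {
    g_calls = 0;
    CHECK_ONCE(fake_api_call());
    return g_calls;
}
```

With a real allocation call in place of the stand-in, the double expansion would attempt the allocation twice on failure, which is why storing the result in a local first is the more common form.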
These items here and here may be of interest to other readers.
I happen to be using the 495.25.05 driver. The 470.57.02 driver is not typically compatible with CUDA 11.5, so that may be an issue, although I would have expected a different error message.
Unless you are using a datacenter GPU along with a known valid install of the compatibility libraries, the first thing I would do is upgrade your GPU driver install to one that is compatible with CUDA 11.5 (or 11.6, if you are using CUDA 11.6).
And the confusion over the error message may be due to a mixed/corrupted install of CUDA. If you have been installing various CUDA versions without proper care, or the history of your machine is uncertain, that may also be something to scrub.
Later: I believe (my) confusion over the error message is due to the CUDA “minor version compatibility” that was introduced with CUDA 11, as described here.
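One way to check for a driver/runtime mismatch is to compare the integers returned by cudaDriverGetVersion and cudaRuntimeGetVersion, which encode a CUDA version as 1000 * major + 10 * minor. The sketch below only covers the decoding and comparison on the host; the helper names (CudaVersion, decode_cuda_version, driver_covers_runtime, format_cuda_version) are hypothetical, and the sample values used with them are illustrative rather than taken from this thread.

```cpp
#include <string>

// CUDA encodes versions as 1000 * major + 10 * minor (e.g. 11050 for 11.5).
struct CudaVersion {
    int major;
    int minor;
};

// Split an encoded version, as returned by cudaDriverGetVersion or
// cudaRuntimeGetVersion, into its major and minor parts.
inline CudaVersion decode_cuda_version(int encoded) {
    return {encoded / 1000, (encoded % 1000) / 10};
}

// True when the driver advertises at least the CUDA version of the runtime.
inline bool driver_covers_runtime(int driver_version, int runtime_version) {
    return driver_version >= runtime_version;
}

// Human-readable form such as "11.5".
inline std::string format_cuda_version(CudaVersion v) {
    return std::to_string(v.major) + "." + std::to_string(v.minor);
}
```

For example, driver_covers_runtime(11040, 11050) is false. Under minor version compatibility the application may still start in that situation, but features added in the newer minor release may not be available, so a false result is a hint to update the driver rather than proof of a broken install.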
As a follow-up, I’m now getting the error "CUDA error 27: read as normalized float not supported for 32-bit non float type" in texture.cpp at 43 when trying to create the actual texture object. I’m explicitly setting the readMode to cudaReadModeElementType (seen below), so I don’t understand what is going on here.
After review by the dev team, it seems that error 27 actually indicates an incorrect read mode setting. (The text for error 27 is admittedly confusing in light of this, and that is being looked at.)
Error 27 can be eliminated by setting the read mode to normalized float, i.e. readMode = cudaReadModeNormalizedFloat in the texture descriptor.
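The code from the original post is not reproduced here, so the following is only a minimal sketch of the setup under discussion, assuming CUDA 11.5 or later with a matching driver. The 64x64 extent, the clamp/point sampling settings and the CUDA_CHECK macro are placeholders chosen for illustration, and no texel data is uploaded; the sketch only allocates the array and creates the texture object.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Evaluate the call once, then report any failure and bail out of main.
#define CUDA_CHECK(call)                                                    \
    do {                                                                    \
        cudaError_t err_ = (call);                                          \
        if (err_ != cudaSuccess) {                                          \
            std::fprintf(stderr, "CUDA error %d: %s at %s:%d\n",            \
                         static_cast<int>(err_), cudaGetErrorString(err_),  \
                         __FILE__, __LINE__);                               \
            return 1;                                                       \
        }                                                                   \
    } while (0)

int main() {
    // Placeholder extent (assumption), chosen as a multiple of the 4x4
    // block size used by BC formats.
    const size_t width = 64, height = 64;

    // Channel descriptor for BC3, as in the question.
    cudaChannelFormatDesc desc =
        cudaCreateChannelDesc<cudaChannelFormatKindUnsignedBlockCompressed3>();

    cudaArray_t array = nullptr;
    CUDA_CHECK(cudaMallocArray(&array, &desc, width, height));

    // Resource descriptor pointing at the CUDA array.
    cudaResourceDesc resDesc = {};
    resDesc.resType = cudaResourceTypeArray;
    resDesc.res.array.array = array;

    // Texture descriptor: normalized float read mode rather than
    // cudaReadModeElementType, per the reply above.
    cudaTextureDesc texDesc = {};
    texDesc.addressMode[0] = cudaAddressModeClamp;
    texDesc.addressMode[1] = cudaAddressModeClamp;
    texDesc.filterMode = cudaFilterModePoint;
    texDesc.readMode = cudaReadModeNormalizedFloat;

    // No resource view descriptor is passed (nullptr).
    cudaTextureObject_t tex = 0;
    CUDA_CHECK(cudaCreateTextureObject(&tex, &resDesc, &texDesc, nullptr));

    CUDA_CHECK(cudaDestroyTextureObject(tex));
    CUDA_CHECK(cudaFreeArray(array));
    std::printf("texture object created\n");
    return 0;
}
```

In a kernel, a texture created with this read mode would be sampled with a float return type, for example tex2D<float4>(tex, u, v).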
The view descriptor is incorrect (it is larger than the texture) and is not needed in order to access the texture, although this issue is separate from the error 27 discussion.
So the view descriptor is not necessary at all? When would you actually need one of those?
For the size: I was trying to follow along here, but I think I misinterpreted how to create the view descriptor. In my case above I am not reinterpreting uint4; I am just trying to create a properly typed texture to begin with.
pResViewDesc is an optional argument that specifies an alternate format for the data described by pResDesc, and also describes the subresource region to restrict access to when texturing. pResViewDesc can only be specified if the type of resource is a CUDA array or a CUDA mipmapped array.
Later in that doc section there is additional information about the view descriptor starting with: