Creating a compressed texture object

Hi,

I am trying to create a compressed texture in CUDA, but am struggling to figure out the correct way of allocating the GPU array. Below is a complete example that fails with CUDA error 1: invalid argument on the cudaMallocArray call.

The failure seems to be caused by cudaCreateChannelDesc<cudaChannelFormatKindUnsignedBlockCompressed3>(): if I replace it with cudaCreateChannelDesc<uint4>(), for example, the allocation succeeds, but I'm not sure that is the right way to allocate the array for a compressed texture.

#include <stdio.h>
#include <unistd.h>
#include <iostream>

#include <cuda_runtime_api.h>
#include <cuda.h>
#include <channel_descriptor.h>

#define CudaCheck(x) \
    { \
        cudaError_t err = x; \
        if (err != cudaSuccess) { \
            std::cout << "CUDA error " << x << ": " << cudaGetErrorString(err) << " in " << __FILE__ << " at " << __LINE__ << std::endl; \
            exit(2); \
        } \
    }

int main() {
    cudaChannelFormatDesc channelDesc = cudaCreateChannelDesc<cudaChannelFormatKindUnsignedBlockCompressed3>();
    cudaArray_t deviceTexArray = nullptr;
    CudaCheck(cudaMallocArray(&deviceTexArray, &channelDesc, 64, 64));
    
    return 0;
}

When I compile and run your code on a machine equipped with CUDA 11.5, it runs without error for me.

(Aside: your error-checking macro looks a little odd to me, although it may be harmless here. Streaming x in the output line has the side effect of evaluating the command a second time; printing err instead avoids that.)

These items here and here may be of interest to other readers.

Well, that’s weird… I am also using CUDA 11.5.

I am compiling with nvcc <file> -o output and then running from the command line with just ./output. This is on Ubuntu 18.

The output of nvcc --version is:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Thu_Nov_18_09:45:30_PST_2021
Cuda compilation tools, release 11.5, V11.5.119
Build cuda_11.5.r11.5/compiler.30672275_0

and I’m using drivers 470.57.02.

Is there anything I am missing?

edit: I just tried with 11.6 as well, with the same result.

Puzzling.

$ cat t183.cu
#include <stdio.h>
#include <unistd.h>
#include <iostream>

#include <cuda_runtime_api.h>
#include <cuda.h>
#include <channel_descriptor.h>

#define CudaCheck(x) \
    { \
        cudaError_t err = x; \
        if (err != cudaSuccess) { \
            std::cout << "CUDA error " << (int)err << ": " << cudaGetErrorString(err) << " in " << __FILE__ << " at " << __LINE__ << std::endl; \
            exit(2); \
        } \
    }

int main() {
    cudaChannelFormatDesc channelDesc = cudaCreateChannelDesc<cudaChannelFormatKindUnsignedBlockCompressed3>();
    cudaArray_t deviceTexArray = nullptr;
    CudaCheck(cudaMallocArray(&deviceTexArray, &channelDesc, 64, 64));

    return 0;
}
$ nvcc -o t183 t183.cu
$ compute-sanitizer ./t183
========= COMPUTE-SANITIZER
========= ERROR SUMMARY: 0 errors
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Thu_Nov_18_09:45:30_PST_2021
Cuda compilation tools, release 11.5, V11.5.119
Build cuda_11.5.r11.5/compiler.30672275_0
$

I happen to be using 495.25.05 driver. The 470.57.02 driver is not typically compatible with CUDA 11.5, so that may be an issue, although I would have expected a different error message.

Unless you are using a datacenter GPU along with a known valid install of the compatibility libraries, the first thing I would do is upgrade your GPU driver install to one that is compatible with CUDA 11.5 (or 11.6, if you are using CUDA 11.6).

And the confusion over the error message may be due to a mixed or corrupted CUDA install. If you have been installing various CUDA versions without proper care, or the history of your machine is uncertain, that may also be worth cleaning up.

Later: I believe (my) confusion over the error message is due to the CUDA “minor version compatibility” that was introduced with CUDA 11, as described here.
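One way to check for this kind of driver/runtime mismatch is to query both versions through the runtime API (a minimal sketch; it needs a CUDA toolkit to compile and an installed driver to run):

```cpp
#include <cstdio>
#include <cuda_runtime_api.h>

int main() {
    int driverVersion = 0, runtimeVersion = 0;
    // Highest CUDA version the installed driver supports, e.g. 11050 for 11.5.
    cudaDriverGetVersion(&driverVersion);
    // Version of the CUDA runtime the application was built against.
    cudaRuntimeGetVersion(&runtimeVersion);
    std::printf("driver supports CUDA %d.%d, runtime is CUDA %d.%d\n",
                driverVersion / 1000, (driverVersion % 1000) / 10,
                runtimeVersion / 1000, (runtimeVersion % 1000) / 10);
    return 0;
}
```

If the driver-supported version is lower than the runtime version, you are relying on minor version compatibility (or have an unsupported combination).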

Upgrading my drivers to 495 looks like it fixes the issue. Thanks for the help!

As a follow-up, I’m now getting the error: CUDA error 27: read as normalized float not supported for 32-bit non float type in texture.cpp at 43 when trying to create the actual texture object. I’m explicitly setting the readMode to cudaReadModeElementType (see below), so I don’t understand what is going on here.

#include <stdio.h>
#include <unistd.h>
#include <iostream>
#include <cstring>

#include <cuda_runtime_api.h>
#include <cuda.h>
#include <channel_descriptor.h>

#define CudaCheck(x) \
    { \
        cudaError_t err = x; \
        if (err != cudaSuccess) { \
            std::cout << "CUDA error " << (int)err << ": " << cudaGetErrorString(err) << " in " << __FILE__ << " at " << __LINE__ << std::endl; \
            exit(2); \
        } \
    }

int main() {
    cudaChannelFormatDesc channelDesc = cudaCreateChannelDesc<cudaChannelFormatKindUnsignedBlockCompressed3>();
    cudaArray_t deviceTexArray = nullptr;
    CudaCheck(cudaMallocArray(&deviceTexArray, &channelDesc, 64, 64));

    cudaResourceDesc resDesc{};
    std::memset(&resDesc, 0, sizeof(resDesc));
    resDesc.resType = cudaResourceTypeArray;
    resDesc.res.array.array = deviceTexArray;

    cudaTextureDesc texDesc{};
    std::memset(&texDesc, 0, sizeof(texDesc));
    texDesc.addressMode[0] = cudaAddressModeBorder;
    texDesc.addressMode[1] = cudaAddressModeBorder;
    texDesc.addressMode[2] = cudaAddressModeBorder;
    texDesc.filterMode = cudaFilterModePoint;
    texDesc.readMode = cudaReadModeElementType;
    texDesc.normalizedCoords = 0;

    cudaTextureObject_t texObj = 0;
    cudaResourceViewDesc resViewDesc{};
    resViewDesc.format = cudaResViewFormatUnsignedBlockCompressed3;
    resViewDesc.height = 256;
    resViewDesc.width = 256;
    CudaCheck(cudaCreateTextureObject(&texObj, &resDesc, &texDesc, &resViewDesc));
    
    return 0;
}

I suggest filing a bug.

After review by the dev team, it seems that error 27 here actually indicates an incorrect read-mode setting. (The text for error 27 is admittedly confusing in light of this, and that is being looked at.)

The error 27 can be eliminated by setting read mode to normalized float:

texDesc.readMode = cudaReadModeNormalizedFloat;

and removing the view descriptor:

CudaCheck(cudaCreateTextureObject(&texObj, &resDesc, &texDesc, NULL));

The view descriptor is also incorrect (256x256 is larger than the 64x64 texture) and is not needed in order to access the texture, although that issue is separate from the error 27 discussion.
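Putting both fixes together, the texture-object creation portion of the code above would look something like this (a sketch, reusing the deviceTexArray and CudaCheck macro from the posted program):

```cpp
// Create the texture object against the BC3 array directly: block-compressed
// formats are sampled as normalized floats, and no resource view is needed
// when the format is not being reinterpreted.
cudaResourceDesc resDesc{};
resDesc.resType = cudaResourceTypeArray;
resDesc.res.array.array = deviceTexArray;

cudaTextureDesc texDesc{};
texDesc.addressMode[0] = cudaAddressModeBorder;
texDesc.addressMode[1] = cudaAddressModeBorder;
texDesc.filterMode = cudaFilterModePoint;
texDesc.readMode = cudaReadModeNormalizedFloat; // required for BC formats
texDesc.normalizedCoords = 0;

cudaTextureObject_t texObj = 0;
CudaCheck(cudaCreateTextureObject(&texObj, &resDesc, &texDesc, NULL));
```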

Ah, OK, yeah, that error message was quite confusing.

So the view descriptor is not necessary at all? When would you actually need one of those?

For the size: I was trying to follow along here, but I think I misinterpreted how to create the view descriptor. In my case above I am not reinterpreting uint4 data; I am just trying to create a properly typed texture to begin with.

From here:

pResViewDesc is an optional argument that specifies an alternate format for the data described by pResDesc, and also describes the subresource region to restrict access to when texturing. pResViewDesc can only be specified if the type of resource is a CUDA array or a CUDA mipmapped array.

Later in that doc section there is additional information about the view descriptor starting with:

The cudaResourceViewDesc struct is defined as…
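As a concrete example of when a view descriptor is needed: if the array was allocated with an ordinary format such as uint4 (one 16-byte element per 4x4 BC3 block), a resource view lets you reinterpret that data as BC3 when sampling. A hedged sketch of that pattern, assuming the same 64x64-texel image and the CudaCheck macro from above:

```cpp
// The array holds the raw BC3 blocks: a 64x64-texel BC3 image is
// 16x16 blocks, and each 16-byte block fits in one uint4 element.
cudaChannelFormatDesc blockDesc = cudaCreateChannelDesc<uint4>();
cudaArray_t blockArray = nullptr;
CudaCheck(cudaMallocArray(&blockArray, &blockDesc, 64 / 4, 64 / 4));

cudaResourceDesc resDesc{};
resDesc.resType = cudaResourceTypeArray;
resDesc.res.array.array = blockArray;

cudaTextureDesc texDesc{};
texDesc.filterMode = cudaFilterModePoint;
texDesc.readMode = cudaReadModeNormalizedFloat;

// The view reinterprets the uint4 data as BC3; its dimensions are
// given in texels, i.e. 4x the block-array dimensions in each axis.
cudaResourceViewDesc viewDesc{};
viewDesc.format = cudaResViewFormatUnsignedBlockCompressed3;
viewDesc.width = 64;
viewDesc.height = 64;

cudaTextureObject_t texObj = 0;
CudaCheck(cudaCreateTextureObject(&texObj, &resDesc, &texDesc, &viewDesc));
```

With the dedicated block-compressed channel formats, as in your code, this reinterpretation step is unnecessary, which is why passing NULL for the view works.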