Tensor packing and cryptic errors

So, anybody know what this means?

Warning: CUDNN_STATUS_NOT_SUPPORTED; Reason: !rfo->getXDesc()->isNSpatialC_fullyPacked()

I’m guessing it has something to do with packing of tensors. The docs sort of explain tensor packing (fully/partially/spatially packed), but it’s not really clear. Here’s my tensor creation code (once again):

void create_tensor_descriptor(cudnnBackendDescriptor_t& desc, int64_t n, int64_t c, int64_t h, int64_t w, int64_t uid)
{
	std::cout << "Creating Tensor Descriptor..." << std::endl;
	cudnnBackendCreateDescriptor(CUDNN_BACKEND_TENSOR_DESCRIPTOR, &desc);

	cudnnDataType_t dtype = CUDNN_DATA_FLOAT;
	int64_t alignment = 4;
	cudnnBackendSetAttribute(desc, CUDNN_ATTR_TENSOR_DATA_TYPE, CUDNN_TYPE_DATA_TYPE, 1, &dtype);
	cudnnBackendSetAttribute(desc, CUDNN_ATTR_TENSOR_BYTE_ALIGNMENT, CUDNN_TYPE_INT64, 1, &alignment);

	// Dims in NCHW order, with fully packed NCHW strides.
	int64_t xDim[] = { n, c, h, w };
	int64_t xStr[] = { c * h * w, h * w, w, 1 };
	cudnnBackendSetAttribute(desc, CUDNN_ATTR_TENSOR_DIMENSIONS, CUDNN_TYPE_INT64, 4, xDim);
	cudnnBackendSetAttribute(desc, CUDNN_ATTR_TENSOR_STRIDES, CUDNN_TYPE_INT64, 4, xStr);
	cudnnBackendSetAttribute(desc, CUDNN_ATTR_TENSOR_UNIQUE_ID, CUDNN_TYPE_INT64, 1, &uid);
	cudnnBackendFinalize(desc);
}

Any thoughts, or is this forum dead?
-Chris

Alright, it seems the Resample operator doesn’t do color images. You can’t have multiple channels; you have to tell the tensor descriptor during creation that your channels are batches instead, like so:

create_tensor_descriptor(image_desc, NUM_CHANNELS, 1, IMAGE_SIZE, IMAGE_SIZE, 0);

This seems wrong, as C (in NCHW) is described in the docs as “the number of feature maps”. It also seems like it would cause issues with convolutions, since the output of a convolution produces feature maps in the C dimension. How would you do max pooling on those features if the resample operator flips out when a tensor has features in the C dim?

Is this stuff just broken? Is there anyone here who knows why it’s set up this way?

-Chris

PS There is an error in the docs. The CUDNN_BACKEND_CONVOLUTION_DESCRIPTOR lists an attribute CUDNN_ATTR_CONVOLUTION_MODE, but it should be CUDNN_ATTR_CONVOLUTION_CONV_MODE.

PPS Neither the CUDNN_BACKEND_CONVOLUTION_DESCRIPTOR nor the CUDNN_BACKEND_OPERATION_CONVOLUTION_FORWARD_DESCRIPTOR page lists CUDNN_ATTR_CONVOLUTION_SPATIAL_DIMS as a required (or even valid) attribute, yet both talk about it in their Finalization sections. Is it needed or not?
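
For concreteness, here is roughly the shape of the setup I’m asking about (a trimmed sketch with placeholder pads/strides/dilations, not my exact code):

cudnnBackendDescriptor_t conv_desc;
cudnnBackendCreateDescriptor(CUDNN_BACKEND_CONVOLUTION_DESCRIPTOR, &conv_desc);

int64_t spatial_dims = 2;                         // the attribute the docs only mention under "Finalization"
cudnnDataType_t comp_type = CUDNN_DATA_FLOAT;
cudnnConvolutionMode_t mode = CUDNN_CONVOLUTION;  // or CUDNN_CROSS_CORRELATION
int64_t pad[] = { 0, 0 }, stride[] = { 1, 1 }, dilation[] = { 1, 1 };  // placeholder values

cudnnBackendSetAttribute(conv_desc, CUDNN_ATTR_CONVOLUTION_SPATIAL_DIMS, CUDNN_TYPE_INT64, 1, &spatial_dims);
cudnnBackendSetAttribute(conv_desc, CUDNN_ATTR_CONVOLUTION_COMP_TYPE, CUDNN_TYPE_DATA_TYPE, 1, &comp_type);
// attribute name is CUDNN_ATTR_CONVOLUTION_CONV_MODE, not CUDNN_ATTR_CONVOLUTION_MODE as the docs say
cudnnBackendSetAttribute(conv_desc, CUDNN_ATTR_CONVOLUTION_CONV_MODE, CUDNN_TYPE_CONVOLUTION_MODE, 1, &mode);
cudnnBackendSetAttribute(conv_desc, CUDNN_ATTR_CONVOLUTION_PRE_PADDINGS, CUDNN_TYPE_INT64, spatial_dims, pad);
cudnnBackendSetAttribute(conv_desc, CUDNN_ATTR_CONVOLUTION_POST_PADDINGS, CUDNN_TYPE_INT64, spatial_dims, pad);
cudnnBackendSetAttribute(conv_desc, CUDNN_ATTR_CONVOLUTION_FILTER_STRIDES, CUDNN_TYPE_INT64, spatial_dims, stride);
cudnnBackendSetAttribute(conv_desc, CUDNN_ATTR_CONVOLUTION_DILATIONS, CUDNN_TYPE_INT64, spatial_dims, dilation);
cudnnBackendFinalize(conv_desc);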

Warning: CUDNN_STATUS_NOT_SUPPORTED; Reason: !is_correlation

Ah, I should have known…

-Chris

Compute capability: 8.6
cuDNN version: 8401

Creating handle...
Creating Tensor 'Conv Input'...
Creating Tensor 'Conv Filter'...
Creating Tensor 'Conv Output(VIRTUAL)'...
Creating Convolution Descriptor...
Creating Convolution Operation Descriptor...
Creating Tensor 'Activation Output'...
Creating Tanh Activation Descriptor...
Creating Tanh Activation Operation Descriptor...
Creating Graph...

W! CuDNN (v8401) function cudnnBackendFinalize() called:
w!         Error: CUDNN_STATUS_NOT_INITIALIZED; Reason: LinearPatternMatcher::matchPattern(userGraph, doOpBinding)
w!         Warning: CUDNN_STATUS_NOT_SUPPORTED; Reason: (userGraph->getAllNodes().size() != 4) && (userGraph->getAllNodes().size() != 8)
w! Time: 2022-06-07T11:45:40.903241 (0d+0h+0m+1s since start)
w! Process=89360; Thread=56044; GPU=NULL; Handle=NULL; StreamId=NULL.

Creating Engine...
Creating Plan...

W! CuDNN (v8401) function cudnnBackendFinalize() called:
w!         Warning: CUDNN_STATUS_NOT_SUPPORTED; Reason: !is_correlation
w!         Warning: CUDNN_STATUS_NOT_SUPPORTED; Reason: check_conv_support_fort(node)
w!         Warning: CUDNN_STATUS_NOT_SUPPORTED; Reason: check_node_support_fort(node_ptr)
w!         Warning: CUDNN_STATUS_NOT_SUPPORTED; Reason: check_for_support()
w!         Warning: CUDNN_STATUS_NOT_SUPPORTED; Reason: ptr.isSupported()
w!         Warning: CUDNN_STATUS_NOT_SUPPORTED; Reason: engine_post_checks(handle, *ebuf.get(), engine.getPerfKnobs(), req_size)
w!         Warning: CUDNN_STATUS_NOT_SUPPORTED; Reason: finalize_internal()
w! Time: 2022-06-07T11:45:40.905241 (0d+0h+0m+1s since start)
w! Process=89360; Thread=56044; GPU=NULL; Handle=NULL; StreamId=NULL.


Cleanup...
Destroying Tensor...
Destroying Tensor...
Destroying Tensor...
Destroying Tensor...

Lil help here please. I can’t connect any operations: conv → activation or conv → maxpool, they all fail. The docs are no help at explaining this stuff. I can run just the convolution, or just the activation, or just the maxpool, but that’s kinda useless in the end. I need to connect the operations!
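
For reference, the graph construction itself boils down to something like this (a trimmed sketch, not my exact code; conv_op and act_op stand for my already-finalized operation descriptors, which share the virtual tensor UID as conv output / activation input, and handle is a valid cudnnHandle_t):

cudnnBackendDescriptor_t op_graph;
cudnnBackendCreateDescriptor(CUDNN_BACKEND_OPERATIONGRAPH_DESCRIPTOR, &op_graph);

// The two operations are "connected" only through the shared virtual tensor UID.
cudnnBackendDescriptor_t ops[] = { conv_op, act_op };
cudnnBackendSetAttribute(op_graph, CUDNN_ATTR_OPERATIONGRAPH_OPS, CUDNN_TYPE_BACKEND_DESCRIPTOR, 2, ops);
cudnnBackendSetAttribute(op_graph, CUDNN_ATTR_OPERATIONGRAPH_HANDLE, CUDNN_TYPE_HANDLE, 1, &handle);
cudnnBackendFinalize(op_graph);  // finalize the graph before building the engine/plan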

Hi,

We are looking into this issue; please allow us some time to get back to you on this.

Thank you.

Hi ezbDoubleZero, thanks for bringing this to our attention! Let me try to help you with your use cases:

Please refer to the fusion examples in our C++ frontend.

High level suggestions for your use cases:

  1. If you need to use the runtime fusion engine, tensors need to be in fully packed NHWC layout, since that is the native tensor core layout. You can use code like the following to compute the strides (see the fuller sketch after this list):
    int64_t xDim[] = { n, c, h, w };
    int64_t xStr[] = { h * w * c, 1, w * c, c };
    Regarding your comment that channels need to be 1: that’s not a requirement. For the float tensor type you are using, input and output channels must be a multiple of 4 on Volta/Turing GPUs; on Ampere GPUs with the latest cuDNN 8.4.0 they can be any number.

  2. For convolutions, use CUDNN_CROSS_CORRELATION mode if you can; the other mode is not supported in the runtime fusion engine right now.

With 1 and 2, conv → activation should work.
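
For example, your create_tensor_descriptor could be adapted along these lines for fully packed NHWC (an illustrative sketch, not an official snippet; only the strides change):

// Dims stay in NCHW order, but the strides describe a fully packed NHWC layout in memory.
void create_nhwc_tensor_descriptor(cudnnBackendDescriptor_t& desc, int64_t n, int64_t c, int64_t h, int64_t w, int64_t uid)
{
	cudnnBackendCreateDescriptor(CUDNN_BACKEND_TENSOR_DESCRIPTOR, &desc);

	cudnnDataType_t dtype = CUDNN_DATA_FLOAT;
	int64_t alignment = 4;
	cudnnBackendSetAttribute(desc, CUDNN_ATTR_TENSOR_DATA_TYPE, CUDNN_TYPE_DATA_TYPE, 1, &dtype);
	cudnnBackendSetAttribute(desc, CUDNN_ATTR_TENSOR_BYTE_ALIGNMENT, CUDNN_TYPE_INT64, 1, &alignment);

	int64_t xDim[] = { n, c, h, w };              // dims are always given in NCHW order
	int64_t xStr[] = { h * w * c, 1, w * c, c };  // strides say the data is laid out NHWC
	cudnnBackendSetAttribute(desc, CUDNN_ATTR_TENSOR_DIMENSIONS, CUDNN_TYPE_INT64, 4, xDim);
	cudnnBackendSetAttribute(desc, CUDNN_ATTR_TENSOR_STRIDES, CUDNN_TYPE_INT64, 4, xStr);
	cudnnBackendSetAttribute(desc, CUDNN_ATTR_TENSOR_UNIQUE_ID, CUDNN_TYPE_INT64, 1, &uid);
	cudnnBackendFinalize(desc);
}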

  1. We have been working on improving the documentation:
    Developer Guide :: NVIDIA Deep Learning cuDNN Documentation
    See the limitation that “The input tensor to a Resample operation should not be produced by another operation within this graph, but should come from global memory.” This means it is currently not possible to fuse a resample directly at the output of a convolution, because the spatially neighboring pixels are not always available with the implicit-GEMM convolution algorithm being used.

  2. We are working on adding pooling examples to the frontend.

  3. Thanks for catching the documentation issues; our engineers will fix them ASAP.

Let us know if you have any other issues

Thanks for replyin’.

Ah, the docs, yeah, that seems to be the biggest issue. I used the example in API Reference :: NVIDIA Deep Learning cuDNN Documentation (Use Case), where it clearly isn’t NHWC:

int64_t xDim[] = {n, g, c, d, h, w};
int64_t xStr[] = {g * c * d * h * w, c * d * h * w, d * h * w, h * w, w, 1};

I understand the reason to give the tensor both the dim sizes AND the strides, but I don’t get why the dims can be in NCHW order while the data is NHWC. I say keep it one way only: the way the data is actually laid out. Less confusion is good. :)
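
To illustrate what I mean, a little helper like this (my own sketch, not from the docs) is how I’d expect the strides to fall out: dims always in NCHW order, strides computed from whichever layout the data actually uses.

#include <cstdint>

// Given dims in NCHW order {n, c, h, w}, compute fully packed strides
// (also reported in NCHW order) for either memory layout.
void packed_strides_nchw(const int64_t dim[4], int64_t str[4])
{
	// memory order N, C, H, W
	str[3] = 1;                  // w
	str[2] = dim[3];             // h stride = w
	str[1] = dim[2] * dim[3];    // c stride = h * w
	str[0] = dim[1] * str[1];    // n stride = c * h * w
}

void packed_strides_nhwc(const int64_t dim[4], int64_t str[4])
{
	// memory order N, H, W, C
	str[1] = 1;                  // c
	str[3] = dim[1];             // w stride = c
	str[2] = dim[3] * dim[1];    // h stride = w * c
	str[0] = dim[2] * str[2];    // n stride = h * w * c
}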

I read the limitations on the resample operation; I wasn’t sure what they meant. I think I’m confused about how the backend API is meant to be used. But then again, nothing I tried worked other than individual operations.

I will try some other things, post more code, yada yada. I will most likely just go back to using the individual library functions for a bit as well.

Thanks again, hope ya get nearest neighbor soon! :)
-Chris