Simple reduction: graph API

Hello,

I’m trying to do a simple sum reduction, (1, 1, 16) → (1, 1, 1) across the last dimension, in order to implement a softmax graph with the backend API, but the engine fails to initialize for every configuration I try:

W! CuDNN (v90300 75) function cudnnBackendGetAttribute() called:
w!     Info: Traceback contains 24 message(s)
w!         Warning: CUDNN_STATUS_NOT_SUPPORTED_LAYOUT; Reason: !is_NHWC
w!         Warning: CUDNN_STATUS_NOT_SUPPORTED_LAYOUT; Reason: check_reduction_support_fort(node)
w!         Warning: CUDNN_STATUS_NOT_SUPPORTED_LAYOUT; Reason: check_node_support_fort(node_ptr)
w!         Warning: CUDNN_STATUS_NOT_SUPPORTED_LAYOUT; Reason: check_for_support()
w!         Warning: CUDNN_STATUS_NOT_SUPPORTED; Reason: ptr->isSupported()
w!         Warning: CUDNN_STATUS_NOT_SUPPORTED; Reason: finalize_internal()
w!         Warning: CUDNN_STATUS_NOT_SUPPORTED_LAYOUT; Reason: !is_NHWC
w!         Warning: CUDNN_STATUS_NOT_SUPPORTED_LAYOUT; Reason: check_reduction_support_fort(node)
w!         Warning: CUDNN_STATUS_NOT_SUPPORTED_LAYOUT; Reason: check_node_support_fort(node_ptr)
w!         Warning: CUDNN_STATUS_NOT_SUPPORTED_LAYOUT; Reason: check_for_support()
w!         Warning: CUDNN_STATUS_NOT_SUPPORTED; Reason: ptr->isSupported()
w!         Warning: CUDNN_STATUS_NOT_SUPPORTED; Reason: finalize_internal()
w!         Warning: CUDNN_STATUS_NOT_SUPPORTED_LAYOUT; Reason: !is_NHWC
w!         Warning: CUDNN_STATUS_NOT_SUPPORTED_LAYOUT; Reason: check_reduction_support_fort(node)
w!         Warning: CUDNN_STATUS_NOT_SUPPORTED_LAYOUT; Reason: check_node_support_fort(node_ptr)
w!         Warning: CUDNN_STATUS_NOT_SUPPORTED_LAYOUT; Reason: check_for_support()
w!         Warning: CUDNN_STATUS_NOT_SUPPORTED; Reason: ptr->isSupported()
w!         Warning: CUDNN_STATUS_NOT_SUPPORTED; Reason: finalize_internal()
w!         Warning: CUDNN_STATUS_NOT_SUPPORTED_LAYOUT; Reason: !is_NHWC
w!         Warning: CUDNN_STATUS_NOT_SUPPORTED_LAYOUT; Reason: check_reduction_support_fort(node)
w!         Warning: CUDNN_STATUS_NOT_SUPPORTED_LAYOUT; Reason: check_node_support_fort(node_ptr)
w!         Warning: CUDNN_STATUS_NOT_SUPPORTED_LAYOUT; Reason: check_for_support()
w!         Warning: CUDNN_STATUS_NOT_SUPPORTED; Reason: ptr->isSupported()
w!         Warning: CUDNN_STATUS_NOT_SUPPORTED; Reason: finalize_internal()
w! Time: 2024-09-02T15:23:13.208488 (0d+0h+0m+0s since start)
w! Process=9356; Thread=13996; GPU=NULL; Handle=NULL; StreamId=NULL.


I! CuDNN (v90300 75) function cudnnBackendCreateDescriptor() called:
i!     descriptorType: type=cudnnBackendDescriptorType_t; val=CUDNN_BACKEND_ENGINE_DESCRIPTOR (2);
i! Time: 2024-09-02T15:23:13.210520 (0d+0h+0m+0s since start)
i! Process=9356; Thread=13996; GPU=NULL; Handle=NULL; StreamId=NULL.


I! CuDNN (v90300 75) function cudnnBackendCreateDescriptor() called:
i!     status: type=cudnnStatus_t; val=CUDNN_STATUS_SUCCESS (0);
i! Time: 2024-09-02T15:23:13.210520 (0d+0h+0m+0s since start)
i! Process=9356; Thread=13996; GPU=NULL; Handle=NULL; StreamId=NULL.


I! CuDNN (v90300 75) function cudnnBackendGetAttribute() called:
i!     descriptor: type=CUDNN_BACKEND_ENGINECFG_DESCRIPTOR:
i!         engine: type=CUDNN_BACKEND_ENGINE_DESCRIPTOR:
i!             opGraph: type=CUDNN_BACKEND_OPERATIONGRAPH_DESCRIPTOR:
i!                 reductionOp: type=CUDNN_BACKEND_OPERATION_REDUCTION_DESCRIPTOR:
i!                     xDesc: type=CUDNN_BACKEND_TENSOR_DESCRIPTOR:
i!                         type: type=cudnnDataType_t; val=CUDNN_DATA_FLOAT (0);
i!                         nbDims: type=int; val=3;
i!                         dimA: type=int; val=[1,1,16];
i!                         strideA: type=int; val=[16,16,1];
i!                         uid: type=int64_t; val=1;
i!                         alignmentInBytes: type=int64_t; val=16;
i!                         isVirtual: type=bool; val=false;
i!                         isByVal: type=bool; val=false;
i!                     yDesc: type=CUDNN_BACKEND_TENSOR_DESCRIPTOR:
i!                         type: type=cudnnDataType_t; val=CUDNN_DATA_FLOAT (0);
i!                         nbDims: type=int; val=3;
i!                         dimA: type=int; val=[1,1,1];
i!                         strideA: type=int; val=[1,1,1];
i!                         uid: type=int64_t; val=2;
i!                         alignmentInBytes: type=int64_t; val=16;
i!                         isVirtual: type=bool; val=false;
i!                         isByVal: type=bool; val=false;
i!                     reduceDesc: type=CUDNN_BACKEND_REDUCTION_DESCRIPTOR:
i!                         mathPrec: type=cudnnDataType_t; val=CUDNN_DATA_FLOAT (0);
i!                         reduceOp: type=cudnnReduceTensorOp_t; val=CUDNN_REDUCE_TENSOR_ADD (0);
i!             engine_id: type=int; val=1;
i!             knobDesc: type=cudnnBackendDescriptor_t:
i!                 CUDNN_KNOB_TYPE_SPLIT_K_SLC: type=int; val=-1;
i!                 CUDNN_KNOB_TYPE_KERNEL_CFG: type=int; val=10;
i!             numericalNotes: type=cudnnBackendDescriptor_t:
i!                 CUDNN_NUMERICAL_NOTE_TENSOR_CORE: type=bool; val=true;
i!             behaviorNotes: type=cudnnBackendDescriptor_t:
i!                 CUDNN_BEHAVIOR_NOTE_RUNTIME_COMPILATION: type=bool; val=true;
i!     attributeName: type=cudnnBackendAttributeName_t; val=CUDNN_ATTR_ENGINECFG_ENGINE (300);
i!     attributeType: type=cudnnBackendAttributeType_t; val=CUDNN_TYPE_BACKEND_DESCRIPTOR (15);
i!     requestedElementCount: type=int64_t; val=1;
i!     elementCount: location=host; addr=000000C954F7E998;
i!     arrayOfElements: location=host; addr=000000C954F7E990;
i! Time: 2024-09-02T15:23:13.211530 (0d+0h+0m+0s since start)
i! Process=9356; Thread=13996; GPU=NULL; Handle=NULL; StreamId=NULL.


I! CuDNN (v90300 75) function cudnnBackendGetAttribute() called:
i!     status: type=cudnnStatus_t; val=CUDNN_STATUS_NOT_INITIALIZED (1001);
i! Time: 2024-09-02T15:23:13.215528 (0d+0h+0m+0s since start)
i! Process=9356; Thread=13996; GPU=NULL; Handle=NULL; StreamId=NULL.

Full log in this GitHub gist: 40ade5ff9563ab0cec870519b352a898
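
For reference, here is a minimal sketch of how I build the op graph, reconstructed to match the attributes in the log above (error checking omitted, handle creation assumed):

#include <cudnn.h>

void build_reduction_graph(cudnnHandle_t handle) {
    /* X: dims (1, 1, 16), packed strides (16, 16, 1), FP32 */
    cudnnBackendDescriptor_t xDesc;
    cudnnBackendCreateDescriptor(CUDNN_BACKEND_TENSOR_DESCRIPTOR, &xDesc);
    cudnnDataType_t dtype = CUDNN_DATA_FLOAT;
    int64_t xDim[3] = {1, 1, 16}, xStr[3] = {16, 16, 1};
    int64_t xUid = 1, alignment = 16;
    cudnnBackendSetAttribute(xDesc, CUDNN_ATTR_TENSOR_DATA_TYPE, CUDNN_TYPE_DATA_TYPE, 1, &dtype);
    cudnnBackendSetAttribute(xDesc, CUDNN_ATTR_TENSOR_DIMENSIONS, CUDNN_TYPE_INT64, 3, xDim);
    cudnnBackendSetAttribute(xDesc, CUDNN_ATTR_TENSOR_STRIDES, CUDNN_TYPE_INT64, 3, xStr);
    cudnnBackendSetAttribute(xDesc, CUDNN_ATTR_TENSOR_UNIQUE_ID, CUDNN_TYPE_INT64, 1, &xUid);
    cudnnBackendSetAttribute(xDesc, CUDNN_ATTR_TENSOR_BYTE_ALIGNMENT, CUDNN_TYPE_INT64, 1, &alignment);
    cudnnBackendFinalize(xDesc);

    /* Y: dims (1, 1, 1) */
    cudnnBackendDescriptor_t yDesc;
    cudnnBackendCreateDescriptor(CUDNN_BACKEND_TENSOR_DESCRIPTOR, &yDesc);
    int64_t yDim[3] = {1, 1, 1}, yStr[3] = {1, 1, 1}, yUid = 2;
    cudnnBackendSetAttribute(yDesc, CUDNN_ATTR_TENSOR_DATA_TYPE, CUDNN_TYPE_DATA_TYPE, 1, &dtype);
    cudnnBackendSetAttribute(yDesc, CUDNN_ATTR_TENSOR_DIMENSIONS, CUDNN_TYPE_INT64, 3, yDim);
    cudnnBackendSetAttribute(yDesc, CUDNN_ATTR_TENSOR_STRIDES, CUDNN_TYPE_INT64, 3, yStr);
    cudnnBackendSetAttribute(yDesc, CUDNN_ATTR_TENSOR_UNIQUE_ID, CUDNN_TYPE_INT64, 1, &yUid);
    cudnnBackendSetAttribute(yDesc, CUDNN_ATTR_TENSOR_BYTE_ALIGNMENT, CUDNN_TYPE_INT64, 1, &alignment);
    cudnnBackendFinalize(yDesc);

    /* Reduction descriptor: ADD, FP32 compute precision */
    cudnnBackendDescriptor_t redDesc;
    cudnnBackendCreateDescriptor(CUDNN_BACKEND_REDUCTION_DESCRIPTOR, &redDesc);
    cudnnReduceTensorOp_t op = CUDNN_REDUCE_TENSOR_ADD;
    cudnnBackendSetAttribute(redDesc, CUDNN_ATTR_REDUCTION_OPERATOR, CUDNN_TYPE_REDUCTION_OPERATOR_TYPE, 1, &op);
    cudnnBackendSetAttribute(redDesc, CUDNN_ATTR_REDUCTION_COMP_TYPE, CUDNN_TYPE_DATA_TYPE, 1, &dtype);
    cudnnBackendFinalize(redDesc);

    /* Reduction operation node wiring X -> Y */
    cudnnBackendDescriptor_t redOp;
    cudnnBackendCreateDescriptor(CUDNN_BACKEND_OPERATION_REDUCTION_DESCRIPTOR, &redOp);
    cudnnBackendSetAttribute(redOp, CUDNN_ATTR_OPERATION_REDUCTION_XDESC, CUDNN_TYPE_BACKEND_DESCRIPTOR, 1, &xDesc);
    cudnnBackendSetAttribute(redOp, CUDNN_ATTR_OPERATION_REDUCTION_YDESC, CUDNN_TYPE_BACKEND_DESCRIPTOR, 1, &yDesc);
    cudnnBackendSetAttribute(redOp, CUDNN_ATTR_OPERATION_REDUCTION_DESC, CUDNN_TYPE_BACKEND_DESCRIPTOR, 1, &redDesc);
    cudnnBackendFinalize(redOp);

    /* Single-op graph */
    cudnnBackendDescriptor_t graph;
    cudnnBackendCreateDescriptor(CUDNN_BACKEND_OPERATIONGRAPH_DESCRIPTOR, &graph);
    cudnnBackendSetAttribute(graph, CUDNN_ATTR_OPERATIONGRAPH_OPS, CUDNN_TYPE_BACKEND_DESCRIPTOR, 1, &redOp);
    cudnnBackendSetAttribute(graph, CUDNN_ATTR_OPERATIONGRAPH_HANDLE, CUDNN_TYPE_HANDLE, 1, &handle);
    cudnnBackendFinalize(graph);
}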

Hi @andrii.horbokon, did you ever figure this out? I’m having the same issue.

@andrii.horbokon , @hsnyder ,

I think the stand-alone reduction operation only supports 4D input tensors with NHWC layout (as shown in the log you provided).
Detailed reduction constraints can be found here: Developer Guide :: NVIDIA cuDNN Documentation
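
If that is the constraint, one way to express the same 16-element sum would be to map the reduced axis onto C of a 4D NHWC tensor. A minimal sketch, assuming this mapping (I haven't verified which engines accept this exact shape):

#include <cudnn.h>

/* Dims are always given as [N, C, H, W]; the NHWC layout is conveyed
 * through the strides (C has stride 1). Shapes here are illustrative. */
cudnnStatus_t make_nhwc_input(cudnnBackendDescriptor_t *xDesc) {
    cudnnBackendCreateDescriptor(CUDNN_BACKEND_TENSOR_DESCRIPTOR, xDesc);
    cudnnDataType_t dtype = CUDNN_DATA_FLOAT;
    int64_t dim[4] = {1, 16, 1, 1};    /* N=1, C=16, H=1, W=1 */
    int64_t str[4] = {16, 1, 16, 16};  /* NHWC-packed: C innermost */
    int64_t uid = 1, align = 16;
    cudnnBackendSetAttribute(*xDesc, CUDNN_ATTR_TENSOR_DATA_TYPE, CUDNN_TYPE_DATA_TYPE, 1, &dtype);
    cudnnBackendSetAttribute(*xDesc, CUDNN_ATTR_TENSOR_DIMENSIONS, CUDNN_TYPE_INT64, 4, dim);
    cudnnBackendSetAttribute(*xDesc, CUDNN_ATTR_TENSOR_STRIDES, CUDNN_TYPE_INT64, 4, str);
    cudnnBackendSetAttribute(*xDesc, CUDNN_ATTR_TENSOR_UNIQUE_ID, CUDNN_TYPE_INT64, 1, &uid);
    cudnnBackendSetAttribute(*xDesc, CUDNN_ATTR_TENSOR_BYTE_ALIGNMENT, CUDNN_TYPE_INT64, 1, &align);
    return cudnnBackendFinalize(*xDesc);
}

The output tensor would then be [1, 1, 1, 1], so the reduction collapses the C dimension entirely.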


@andrii.horbokon @nikonsugar

Are you able to successfully execute a reduction operation using the graph API? I am using NHWC layout, but when reducing over the last dimension I get: Reason: !(reduction_op->isColumnReduction(currBackboneOpType) || reduction_op->isRowReduction(currBackboneOpType) || reduction_op->isAllReduction())
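
For reference, this is roughly the shape/stride configuration I'm attempting (the dims below are illustrative placeholders, not my real sizes):

#include <stdint.h>

/* Input [N, C, H, W] with NHWC-packed strides, reducing over W (the last dim). */
int64_t xDim[4] = {1, 4, 4, 16};
int64_t xStr[4] = {4 * 16 * 4, 1, 16 * 4, 4};  /* NHWC: [H*W*C, 1, W*C, C] */
int64_t yDim[4] = {1, 4, 4, 1};                /* collapse W only */
int64_t yStr[4] = {4 * 1 * 4, 1, 1 * 4, 4};    /* NHWC-packed output */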