CUDA_ERROR_INVALID_VALUE when creating tensor maps with swizzling

I’m trying to use tensor maps with swizzling, but any swizzle size that is smaller than the innermost tile size (e.g. a 32B swizzle with a 64B tile) returns CUDA_ERROR_INVALID_VALUE. I can’t find any requirement for this in either the PTX ISA or the cuTensorMapEncodeTiled documentation.

  1. Is this intentional?
  2. If so, is this invariant documented anywhere?

As an example of a failing configuration:

tensorDataType: CU_TENSOR_MAP_DATA_TYPE_FLOAT16
tensorRank: 3
globalDim: [6144, 6144, 1]
globalStrides: [6144, 37748736]
boxDim: [128, 32, 1]
elementStrides: [1, 1, 1]
interleave: CU_TENSOR_MAP_INTERLEAVE_NONE
swizzle: CU_TENSOR_MAP_SWIZZLE_128B
l2Promotion: CU_TENSOR_MAP_L2_PROMOTION_NONE
oobFill: CU_TENSOR_MAP_FLOAT_OOB_FILL_NONE 

Are you sure that this statement doesn’t apply?

When interleave is CU_TENSOR_MAP_INTERLEAVE_NONE and swizzle is not CU_TENSOR_MAP_SWIZZLE_NONE, the bounding box inner dimension (computed as boxDim[0] multiplied by element size derived from tensorDataType) must be less than or equal to the swizzle size.

Your tensorDataType (CU_TENSOR_MAP_DATA_TYPE_FLOAT16) is 2 bytes per element and your boxDim[0] is 128, so the inner dimension of the bounding box is 128 × 2 = 256 bytes, which is larger than your 128-byte swizzle size.
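The arithmetic above can be checked on the host before calling cuTensorMapEncodeTiled. The following is only a rough sketch of that check, assuming interleave is CU_TENSOR_MAP_INTERLEAVE_NONE. The enums `Swizzle` and `DataType` and the helper functions are hypothetical stand-ins written for this post (they are not part of the CUDA driver API), so that the snippet compiles without cuda.h.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for CUtensorMapSwizzle / CUtensorMapDataType,
// defined here only so the sketch builds without the CUDA toolkit.
enum class Swizzle { None, B32, B64, B128 };
enum class DataType { Float16, Float32 };

// Swizzle span in bytes (0 means "no swizzle").
inline std::size_t swizzle_bytes(Swizzle s) {
    switch (s) {
        case Swizzle::B32:  return 32;
        case Swizzle::B64:  return 64;
        case Swizzle::B128: return 128;
        default:            return 0;
    }
}

// Element size in bytes for the two data types covered by this sketch.
inline std::size_t element_bytes(DataType t) {
    return t == DataType::Float16 ? 2 : 4;
}

// Inner dimension of the bounding box in bytes: boxDim[0] * element size.
inline std::size_t inner_box_bytes(std::uint32_t box_dim0, DataType t) {
    return static_cast<std::size_t>(box_dim0) * element_bytes(t);
}

// Quoted rule (interleave NONE assumed): with swizzling enabled, the inner
// box dimension in bytes must be <= the swizzle size.
inline bool inner_box_fits_swizzle(std::uint32_t box_dim0, DataType t, Swizzle s) {
    if (s == Swizzle::None) return true;  // rule only applies when swizzling
    return inner_box_bytes(box_dim0, t) <= swizzle_bytes(s);
}

// The configuration from the question: boxDim[0] = 128, FLOAT16, SWIZZLE_128B.
inline void check_question_config() {
    assert(inner_box_bytes(128, DataType::Float16) == 256);
    assert(!inner_box_fits_swizzle(128, DataType::Float16, Swizzle::B128));
    // Shrinking boxDim[0] to 64 gives 128 bytes, which satisfies the rule.
    assert(inner_box_fits_swizzle(64, DataType::Float16, Swizzle::B128));
}
```

Under these assumptions, a boxDim[0] of at most 64 would satisfy the quoted rule for FLOAT16 with CU_TENSOR_MAP_SWIZZLE_128B. This only covers the one constraint discussed here, not the other requirements of cuTensorMapEncodeTiled.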

Ah, I was looking at the wrong section then. This requirement seems more relevant to swizzle, but it is documented in the interleave section. Appreciate the quick response.
