cuMemAllocPitch: are the documentation/comments in the header still up to date?

Hello,

I am wondering if the comments in cuda.h are still up to date for cuMemAllocPitch:

It mentions 4, 8 or 16 bytes for ElementSizeBytes.

The programming guide mentions alignments of 32, 64 and 128 bytes. Maybe that is a documentation mistake in the guide (5.3.2.1) and it should be bits?

Or perhaps alignment is something other than these element sizes…

Also, are larger element sizes supported? Just wondering…

Also, what would happen if ElementSizeBytes is set to something unusual like 3 or 40?
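
To show what I mean by alignment possibly being something other than element size, here is a minimal sketch in plain C. It assumes a pitch is simply the row width rounded up to a multiple of some alignment. `round_up_to_alignment` is a hypothetical helper of mine, not a CUDA function, and the padding rule the driver actually uses may well differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of CUDA): round a row width in bytes up to
   the next multiple of `alignment`. This is only an assumption about how a
   pitch could be derived; the real driver may pad differently. */
size_t round_up_to_alignment(size_t width_bytes, size_t alignment)
{
    assert(alignment != 0);
    return (width_bytes + alignment - 1) / alignment * alignment;
}
```

Under that assumption, a row of 100 floats (400 bytes) with a 128-byte alignment would get a 512-byte pitch regardless of whether the element size is 4, 8 or 16 bytes, which is why I suspect the two sets of numbers describe different things.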

//
// \brief Allocates pitched device memory
//
// Allocates at least WidthInBytes * Height bytes of linear memory on
// the device and returns in dptr a pointer to the allocated memory. The
// function may pad the allocation to ensure that corresponding pointers in
// any given row will continue to meet the alignment requirements for
// coalescing as the address is updated from row to row. ElementSizeBytes
// specifies the size of the largest reads and writes that will be performed
// on the memory range. ElementSizeBytes may be 4, 8 or 16 (since coalesced
// memory transactions are not possible on other data sizes). If
// ElementSizeBytes is smaller than the actual read/write size of a kernel,
// the kernel will run correctly, but possibly at reduced speed. The pitch
// returned in pPitch by cuMemAllocPitch() is the width in bytes of the
// allocation. The intended usage of pitch is as a separate parameter of the
// allocation, used to compute addresses within the 2D array. Given the row
// and column of an array element of type T, the address is computed as:
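
As far as I understand it, the computation that last sentence leads into is the usual pitched-address formula: step `Row * Pitch` bytes from the base, then index by column in units of the element type. Below is a minimal sketch of that in plain C (host side only, no CUDA calls). The function names are hypothetical and mine, not from cuda.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: byte offset of element (row, col) in a pitched 2D
   allocation. pitch_bytes is the value returned in pPitch. */
size_t pitched_offset(size_t pitch_bytes, size_t elem_size,
                      size_t row, size_t col)
{
    /* The element must lie inside the row, padding included. */
    assert((col + 1) * elem_size <= pitch_bytes);
    return row * pitch_bytes + col * elem_size;
}

/* The same computation in pointer form for T = float: advance by whole
   rows in bytes, then index by column in units of float. */
float *pitched_elem_f32(void *base, size_t pitch_bytes, size_t row, size_t col)
{
    return (float *)((char *)base + row * pitch_bytes) + col;
}
```

With a pitch of 512 bytes and 4-byte elements, element (row 2, column 3) sits at byte offset 2 * 512 + 3 * 4 = 1036 from the base pointer.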

Bye,
Skybuck.
