cuDeviceGetAttribute shows I can use the fabric handle, but actually I cannot

Here is a minimal reproducible example:

#include <cuda.h>
#include <cuda_runtime.h>
#include <stdio.h>
#include <stdlib.h>   /* for exit() */

#ifndef CU_CHECK
#define CU_CHECK(cmd) \
do { \
    CUresult e = (cmd); \
    if (e != CUDA_SUCCESS) { \
        const char *error_str = NULL; \
        cuGetErrorString(e, &error_str); \
        printf("CUDA error: %s\n", error_str); \
        exit(1); \
    } \
} while (0)
#endif

int main() {
    cudaSetDevice(0);
    int support = 0;
    CU_CHECK(cuDeviceGetAttribute(&support, CU_DEVICE_ATTRIBUTE_HANDLE_TYPE_FABRIC_SUPPORTED, 0));
    printf("support: %d\n", support);

    CUmemAllocationProp prop = {};
    prop.type = CU_MEM_ALLOCATION_TYPE_PINNED;
    prop.location.type = CU_MEM_LOCATION_TYPE_DEVICE;
    prop.requestedHandleTypes = CU_MEM_HANDLE_TYPE_FABRIC;
    // prop.requestedHandleTypes = CU_MEM_HANDLE_TYPE_POSIX_FILE_DESCRIPTOR;
    prop.location.id = 0;

    size_t granularity = 0;
    CU_CHECK(cuMemGetAllocationGranularity(&granularity, &prop, CU_MEM_ALLOC_GRANULARITY_MINIMUM));

    size_t size = granularity;

    printf("granularity: %zu\n", granularity);

    CUmemGenericAllocationHandle handle;
    CU_CHECK(cuMemCreate(&handle, size, &prop, 0));

    return 0; 
}

Running it, I get:

support: 1
granularity: 2097152
CUDA error: operation not permitted

So cuDeviceGetAttribute tells me the fabric handle type is supported, but when I actually use it, the allocation fails with "operation not permitted". Switching the handle type to CU_MEM_HANDLE_TYPE_POSIX_FILE_DESCRIPTOR works.

Is this a bug in cuDeviceGetAttribute, or is it expected behavior, i.e. the attribute only means the handle type can be requested but does not guarantee the call will succeed?

My environment is an H100 DGX node with Driver Version 570.133.20 and CUDA version 12.5.

In addition, I find that CU_DEVICE_ATTRIBUTE_HANDLE_TYPE_FABRIC_SUPPORTED is not listed in one CUDA Driver API :: CUDA Toolkit Documentation page, but only in another CUDA Driver API :: CUDA Toolkit Documentation page. Is this expected behavior?

I also checked it on a DGX B200 node, Driver Version 575.57.08, CUDA Version 12.9:

support: 1
granularity: 2097152
CUDA error: operation not permitted

The CUDA driver tells me it is supported, but the actual allocation fails.

@Robert_Crovella could you please take a look?

It’s not clear what you are trying to do. The fabric handle expects IMEX. IMEX is typically used in a setting like GB200 NVL72 (i.e. where the NVLink Fabric provides an internode connection), not with a DGX or HGX system (i.e. where NVLink is only an intranode connection).

Detailed questions in this vein are well beyond what I can respond to at this time. I would interpret the observation as follows: The capability is supported, but other system level configurations are also required, which have not been satisfied. Therefore attempts to use it are not permitted.

You can always file a bug for documentation clarification.

What I want is to reliably tell whether I can use the fabric handle or not. My first attempt was to query cuDeviceGetAttribute; however, the result shows that cuDeviceGetAttribute cannot give me the answer (it tells me the handle type is supported, but the operation later fails).

I understand the fabric handle is designed for multi-node NVLink systems, and I'm trying to find a way to detect whether I'm on a multi-node NVLink system. I expected cuDeviceGetAttribute to tell me the fabric handle is not supported on a single HGX machine, but it answers yes, which confuses me a lot.

The fabric based memory sharing can work on both single-node and multi-node NVLink systems.
The fabric handle based memory sharing is facilitated by the concept of an IMEX domain. An IMEX domain is either a single OS instance, or a set of compute nodes, each with a separate OS instance, connected by an NVLink network on which the NVIDIA IMEX service (daemon) has been installed and configured to communicate with each other. Within an IMEX domain, IMEX channels allow for secure memory sharing in multi-user environments. The NVIDIA driver implements the IMEX channel facility by registering a new character device, nvidia-caps-imex-channels.

Applications that intend to use fabric handle based sharing must ensure:

  1. The nvidia-caps-imex-channels character device has been created by the driver and appears under /proc/devices:
# cat /proc/devices | grep nvidia
195 nvidia
195 nvidiactl
234 nvidia-caps-imex-channels
509 nvidia-nvswitch
  2. Have at least one IMEX channel file accessible by the user launching the application. An IMEX channel file is a node in /dev/nvidia-caps-imex-channels of the form channelN, where N is the minor number. The NVIDIA driver does not create a node that represents an IMEX channel by default unless the module parameter NVreg_CreateImexChannel0 was specified, which automatically creates /dev/nvidia-caps-imex-channels/channel0. Otherwise the creation of these IMEX channel files must be handled by system administrators (using mknod, for example). When exporter and importer CUDA processes have been granted access to the same IMEX channel file, they can securely share memory. If the importer/exporter processes do not have access to the same IMEX channel file, the CUDA APIs attempting to create or import a fabric handle will return CUDA_ERROR_NOT_PERMITTED.

The IMEX channel security model works on a per-user basis: all processes under a user can share memory if that user has access to a valid IMEX channel. When multi-user isolation is desired, a separate IMEX channel is required for each user, so the default channel0 creation via NVreg_CreateImexChannel0 is only recommended for single-user cases. To create channelN with the major number from /proc/devices, users can execute the following command:
# mknod /dev/nvidia-caps-imex-channels/channelN c <major_number> 0
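
As a rough user-space sketch of checking these two prerequisites (an assumption on my part, not an official API; channel0 below is only an example, use whichever channel file your administrator created):

/* Sketch: check the IMEX prerequisites described above.
 * Assumes channel0; adjust the path for the channel your admin created. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static int imex_prereqs_present(void) {
    /* 1. nvidia-caps-imex-channels registered in /proc/devices */
    int have_chardev = 0;
    FILE *f = fopen("/proc/devices", "r");
    if (f) {
        char line[256];
        while (fgets(line, sizeof(line), f)) {
            if (strstr(line, "nvidia-caps-imex-channels")) { have_chardev = 1; break; }
        }
        fclose(f);
    }
    /* 2. at least one IMEX channel file readable by this user */
    int have_channel = (access("/dev/nvidia-caps-imex-channels/channel0", R_OK) == 0);
    return have_chardev && have_channel;
}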

Please refer to this link as well: How to veify NVIDIA IMEX daemon? · Issue #19 · NVIDIA/multi-gpu-programming-models · GitHub
Note: This implies you don’t need to install/start/run the nvidia-imex daemon when your use case is just a single node.
You should not be trying to detect whether you are on a multi-node system using this device attribute.

@vramesh1 Then what does it mean when cuDeviceGetAttribute(&support, CU_DEVICE_ATTRIBUTE_HANDLE_TYPE_FABRIC_SUPPORTED, 0) returns supported? And how can I reliably tell whether I'm in an IMEX domain?

For more context: I'm working on a PyTorch feature, Improve IPC for Expandable Segments to use fabric handle when possible by youkaichao · Pull Request #156074 · pytorch/pytorch · GitHub:

I try to use the fabric handle if possible, and fall back to a POSIX file descriptor if not.
That's why I need to tell whether the fabric handle is supported.
Right now I have to defer the check to the first call of a real VMM API; if the device query worked, I could just query the support status when initializing the program.

What does it mean when cuDeviceGetAttribute(&support, CU_DEVICE_ATTRIBUTE_HANDLE_TYPE_FABRIC_SUPPORTED, 0) returns supported?

When the device attribute query for CU_DEVICE_ATTRIBUTE_HANDLE_TYPE_FABRIC_SUPPORTED reports true (i.e. != 0), this essentially means the device allows CUDA VMM allocations to be created with or exported to a CUmemFabricHandle. It only reports the device's ability to create fabric handles, not whether the system has been fully set up for fabric handle based IPC.

How can I reliably tell whether I'm in an IMEX domain?

I suppose we could consider extending the attribute query to check whether you are part of an IMEX domain. But in the absence of that, you could perform a dummy allocation to confirm you are in an IMEX domain.
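
For example, a minimal sketch of such a probe (my assumptions: device 0, the minimum granularity, and a driver context already initialized as in your repro):

/* Probe sketch: returns 1 if a fabric-handle allocation succeeds, 0 otherwise.
 * CUDA_ERROR_NOT_PERMITTED here typically means no accessible IMEX channel. */
static int fabric_handle_usable(void) {
    CUmemAllocationProp prop = {};
    prop.type = CU_MEM_ALLOCATION_TYPE_PINNED;
    prop.location.type = CU_MEM_LOCATION_TYPE_DEVICE;
    prop.location.id = 0;
    prop.requestedHandleTypes = CU_MEM_HANDLE_TYPE_FABRIC;

    size_t granularity = 0;
    if (cuMemGetAllocationGranularity(&granularity, &prop,
                                      CU_MEM_ALLOC_GRANULARITY_MINIMUM) != CUDA_SUCCESS)
        return 0;

    CUmemGenericAllocationHandle handle;
    CUresult err = cuMemCreate(&handle, granularity, &prop, 0);
    if (err == CUDA_SUCCESS) {
        cuMemRelease(handle);  /* release the probe allocation */
        return 1;
    }
    return 0;
}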

Thanks for the response! This resolves my confusion.

To be honest, it is quite confusing, and I think this results in a bug in NCCL: Potential bug for requestedHandleTypes? · Issue #1763 · NVIDIA/nccl · GitHub. NCCL uses the device query to determine whether it can use the fabric handle.

I suppose we could consider extending the attribute query to check whether you are part of an IMEX domain

If you do, please add another device attribute field instead of changing the current behavior of CU_DEVICE_ATTRIBUTE_HANDLE_TYPE_FABRIC_SUPPORTED. Since this field does not mean full support of the fabric handle right now, I will not use it anymore.

But the code you point to does have a fallback to not use the fabric handle if the allocation API fails with CUDA_ERROR_NOT_PERMITTED. It is not relying on CU_DEVICE_ATTRIBUTE_HANDLE_TYPE_FABRIC_SUPPORTED to mean a fabric allocation will succeed.
I am not saying the device attribute (or a potential new one) doesn't need to be revisited; I am questioning why the current version of NCCL would have a bug.

if (requestedHandleTypes & CU_MEM_HANDLE_TYPE_FABRIC) {
  /* First try cuMemCreate() with FABRIC handle support and then remove if it fails */
  CUresult err = CUPFN(cuMemCreate(&handle, handleSize, &memprop, 0));
  if (err == CUDA_ERROR_NOT_PERMITTED || err == CUDA_ERROR_NOT_SUPPORTED) {
    requestedHandleTypes &= ~CU_MEM_HANDLE_TYPE_FABRIC;
    memprop.requestedHandleTypes = (CUmemAllocationHandleType) requestedHandleTypes;
    /* Allocate the physical memory on the device */
    CUCHECK(cuMemCreate(&handle, handleSize, &memprop, 0));
  } else if (err != CUDA_SUCCESS) {
    // Catch and report any error from above
    CUCHECK(cuMemCreate(&handle, handleSize, &memprop, 0));
  }
}

They are enums, so I assumed they should be used exclusively.
Is it a hidden behavior of the CUDA driver API that it only checks the highest bit?

Additionally, to your question on the GitHub link: the allocation properties allow you to specify zero or more handle types to which you would like to export this allocation in the future. If you specify prop.requestedHandleTypes = CU_MEM_HANDLE_TYPE_FABRIC | CU_MEM_HANDLE_TYPE_POSIX_FILE_DESCRIPTOR; this means you can obtain a CUmemFabricHandle and an int (file descriptor) through two separate calls to the cuMemExportToShareableHandle() API.
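
For illustration, a sketch of the two export calls (assuming an allocation handle created with both handle types requested, and reusing the CU_CHECK macro from your repro):

CUmemFabricHandle fabricHandle;
int fd = -1;
/* Export the same allocation twice, once per requested handle type */
CU_CHECK(cuMemExportToShareableHandle(&fabricHandle, handle, CU_MEM_HANDLE_TYPE_FABRIC, 0));
CU_CHECK(cuMemExportToShareableHandle(&fd, handle, CU_MEM_HANDLE_TYPE_POSIX_FILE_DESCRIPTOR, 0));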

the allocation properties allow you to specify zero or more handle types to which you would like to export this allocation in the future.

Thanks, this makes sense. Then I think the NCCL code is correct.
