Using CC in a multi-GPU system (with NvSwitch)

I used gpu_cc_tool.py on one of the 8 GPUs to set its CC mode to devtools:

> sudo python3 gpu_cc_tool.py --gpu=0 --set-cc-mode devtools --reset-after-cc-mode-switch

2023-09-21,00:21:48.000 INFO     Selected GPU 0000:18:00.0 H100-PCIE 0x2324 BAR0 0xee042000000
2023-09-21,00:21:48.193 INFO     GPU 0000:18:00.0 H100-PCIE 0x2324 BAR0 0xee042000000 CC mode set to devtools. It will be active after GPU reset.
2023-09-21,00:21:49.864 INFO     GPU 0000:18:00.0 H100-PCIE 0x2324 BAR0 0xee042000000 was reset to apply the new CC mode.

I then attached this GPU to a CVM. However, running any CUDA application makes the kernel driver crash. The crash location in the driver code is:

// kernel-open/nvidia-uvm/uvm_channel.c

static NV_STATUS channel_manager_create_conf_computing_pools(uvm_channel_manager_t *manager, unsigned *preferred_ce)
{
    // Skip some code

    status = channel_pool_add(manager, UVM_CHANNEL_POOL_TYPE_SEC2, 0, &sec2_pool); // This function returned with status NV_OK

    // Skip some code

    status = channel_pool_add(manager, UVM_CHANNEL_POOL_TYPE_WLC, wlc_lcic_ce_index, &wlc_pool); // Crashed here
}

Driver output:

[30838.220082] nvidia-uvm: uvm_channel.c:1532 uvm_channel_check_errors[pid:2503] Channel error likely caused by push 'GPFIFO submit to 'ID 1:3 (0x1:0x3) WLC 2' via 'UVM_CHANNEL_TYPE_SEC2'' started at uvm_channel.c:1247 in submit_ctrl_gpfifo_indirect()
[30838.222513] nvidia-uvm: uvm_channel.c:1547 uvm_channel_check_errors[pid:2503] Assert failed, condition 0 not true: Fatal error: Generic RC error [NV_ERR_RC_ERROR]

I added a print statement to uvm_channel_get_status (called by uvm_channel_check_errors). The value of errorNotifier->status is -1 (0xffff).
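For reference, -1 and 0xffff describe the same bit pattern only if the status field is read as a 16-bit value, which is what the 0xffff reading suggests. Below is a minimal sketch of that interpretation. It assumes a 16-bit field, and status_as_signed16 and check_status_decoding are hypothetical helpers for illustration, not functions from the driver:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of the NVIDIA driver): interpret a raw
 * 16-bit status word as a two's-complement signed value.
 * Assumption: the notifier status field is 16 bits wide. */
static int status_as_signed16(uint16_t raw)
{
    /* Values with the top bit set map to the negative range. */
    return raw >= 0x8000u ? (int)raw - 0x10000 : (int)raw;
}

/* Small self-check of the decoding for a few representative values. */
static void check_status_decoding(void)
{
    assert(status_as_signed16(0xffffu) == -1);    /* the value I observed */
    assert(status_as_signed16(0x0000u) == 0);
    assert(status_as_signed16(0x7fffu) == 32767);
}
```

Under that assumption, a printed -1 and a raw 0xffff are the same reading, so the two numbers above are not in conflict.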

I did not configure the NvSwitch device, and I did not pass the NvSwitch through to the CVM.

What’s the problem here?

By the way, if I set the CC mode to on, the error happens in a different place. The output for that case is shown in the thread "Is "CC mode = on" available?" (post #3 by Yifan-Tan).