Can Jetson Orin support NCCL?

Can Jetson Orin support NCCL for communication among processes on the same chip, or among different chips connected over PCIe?
Also, what is the best way to do inter-chip data communication on the Jetson Orin platform: NvSci or NCCL? What is the difference between these two approaches?

I can’t answer all of that, but I see this URL for NCCL:
https://developer.nvidia.com/nccl

That page says NCCL communicates over PCIe or NVLink. Jetson GPUs do not use PCIe; they are integrated directly with the memory controller. Thus, there is no possibility of this working on a Jetson (there is also no NVLink).

I don’t know about the NVSci software, but if it requires PCIe or NVLink, then this would also not work. If it uses something else, then maybe??

As far as I know, multiple Orin chips can be connected over PCIe. If so, does that mean NCCL could be used for inter-chip communication on Jetson?

However, NvSci seems to be the officially recommended way to exchange data among multiple Orin chips on the Jetson platform. What is the difference between using NvSci and NCCL (if NCCL can be used at all) for inter-Orin communication?

Another thing that confuses me is whether NvSci can be used for inter-process communication in a data center setup with an A100 GPU and Intel CPUs. I tried the NvSci example in an A100 GPU + Intel CPU environment, but it fails when importing the buffer allocated by NvSci into CUDA (the same code works fine on the Orin platform).

From what I understand so far:

  1. NvSci should be used for inter-chip communication on the Orin platform;
  2. NCCL cannot be used for inter-chip communication on the Orin platform;
  3. NvSci cannot be used in the data center case with A100 + Intel CPUs.

Are these right? (Probably wrong…)

I couldn’t tell you about the differences between those programs. I just know that a Jetson’s GPU is not attached over PCIe, and that much of the GPU software written for non-Jetson systems requires PCIe, since that is how the GPU is detected. A100 GPUs are PCIe devices. I know one can use SLI between two closely spaced PCIe-based video cards, but the interconnect used in data centers is different hardware, and neither type of connector is available on a Jetson.

Someone else may know whether NvSci offers a workaround. For software that works with a Jetson’s iGPU (integrated GPU, as opposed to a discrete dGPU on PCIe), some of the memory functions may be specialized: a dGPU has its own memory, but an iGPU must share system memory.
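
One way to see the PCIe point above on a Linux system is to look for NVIDIA display-class devices under sysfs. The sketch below is only an illustration, not an NVIDIA tool. The function name find_pci_nvidia_gpus is made up here, and it assumes the standard Linux sysfs layout (a vendor file and a class file per device under /sys/bus/pci/devices), NVIDIA's PCI vendor ID 0x10de, and PCI base class 0x03 for display controllers. A machine with a discrete GPU should list at least one address; on a Jetson with only the integrated GPU I would expect an empty list, though I have not verified that on Orin.

```python
from pathlib import Path

NVIDIA_VENDOR_ID = "0x10de"    # PCI vendor ID assigned to NVIDIA
DISPLAY_CLASS_PREFIX = "0x03"  # PCI base class 0x03 = display controller


def find_pci_nvidia_gpus(sysfs_root="/sys/bus/pci/devices"):
    """Return the PCI addresses of NVIDIA display-class devices.

    Hypothetical helper for illustration. Other NVIDIA PCI functions
    (for example bridges, base class 0x06) are deliberately skipped.
    """
    root = Path(sysfs_root)
    if not root.is_dir():
        return []  # no PCI sysfs tree at this path

    found = []
    for dev in sorted(root.iterdir()):
        try:
            vendor = (dev / "vendor").read_text().strip().lower()
            pci_class = (dev / "class").read_text().strip().lower()
        except OSError:
            continue  # entry without readable vendor/class files
        if vendor == NVIDIA_VENDOR_ID and pci_class.startswith(DISPLAY_CLASS_PREFIX):
            found.append(dev.name)
    return found


if __name__ == "__main__":
    gpus = find_pci_nvidia_gpus()
    if gpus:
        print("NVIDIA GPU(s) on the PCI bus:", ", ".join(gpus))
    else:
        print("No NVIDIA GPU found on the PCI bus.")
```

An empty result only says that no NVIDIA GPU is visible as a PCI device; it does not by itself tell you which communication libraries will or will not work.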

NCCL was implemented and optimized for multi-GPU systems, such as cloud servers and HPC clusters, to achieve high bandwidth over high-speed interconnects like PCIe and NVLink. It is not intended for the Jetson platform.

Hi,
NCCL is not supported on Jetson platforms. For NvSciStream, please take a look at the Jetson Download Center (NVIDIA Developer).

Thank you for the reply.

By the way, does NvSciStream support the HPC/cloud server case with A100s + Intel CPUs? I tried the NvSciStream example on an A100, but exactly the same code (which succeeds on Orin) fails on the A100 with “InvalidValueError” when importing external memory using “cudaImportExternalMemory”.
