NVSHMEM Host Execution Pattern

Daniel_Wong · May 10, 2021, 6:05pm

Hi all,

I am writing an NVSHMEM program and launching it with the following command, using two GPUs on the same host.

nvshmem_src_2.0.3-0/scripts/bin/nvshmrun -n 2 -ppn 2 ./test_nvshmem

I found that the execution flow of NVSHMEM looks like this:

---->host-thread-0--->GPU_device_kernel-0 on GPU-0---->
---->host-thread-1--->GPU_device_kernel-1 on GPU-1---->

I am wondering whether there is any way to use NVSHMEM with the following execution pattern on host and device:

---->host-thread--->GPU_device_kernel-0 on GPU-0---->host-thread
               \--->GPU_device_kernel-1 on GPU-1---->/

The reason is that I want a single host thread to do some common tasks that would otherwise be identical for both host threads and GPU kernels. The host side should also use one shared region of host memory instead of keeping two duplicate copies.

Thanks!

alanger · April 18, 2022, 6:59pm

NVSHMEM does not support a threads-as-PEs model. PEs have to be processes, and each PE is assigned only one GPU.
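
For illustration, here is a minimal sketch of the process-per-PE pattern described above, where each process picks its GPU from its node-local PE index. It follows the general shape of introductory NVSHMEM examples. The helper `pick_device` and the kernel name `put_to_neighbor` are hypothetical names introduced for this sketch, and error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include <nvshmem.h>
#include <nvshmemx.h>

// Hypothetical helper (not part of NVSHMEM): map a node-local PE index
// to a CUDA device ordinal so that each process uses exactly one GPU.
static int pick_device(int node_local_pe, int device_count) {
    return node_local_pe % device_count;
}

// Each PE writes its own PE id into the symmetric buffer of the next PE.
__global__ void put_to_neighbor(int *dst) {
    int mype = nvshmem_my_pe();
    int npes = nvshmem_n_pes();
    nvshmem_int_p(dst, mype, (mype + 1) % npes);
}

int main() {
    // Called once per process: every process launched by the job
    // launcher becomes one PE.
    nvshmem_init();

    int device_count = 0;
    cudaGetDeviceCount(&device_count);
    if (device_count <= 0) {
        fprintf(stderr, "no CUDA device found\n");
        return 1;
    }

    // Index of this PE among the PEs on the same node.
    int local_pe = nvshmem_team_my_pe(NVSHMEMX_TEAM_NODE);
    cudaSetDevice(pick_device(local_pe, device_count));

    cudaStream_t stream;
    cudaStreamCreate(&stream);

    // Symmetric allocation: every PE allocates the same size.
    int *dst = (int *)nvshmem_malloc(sizeof(int));

    put_to_neighbor<<<1, 1, 0, stream>>>(dst);
    nvshmemx_barrier_all_on_stream(stream);

    int received = -1;
    cudaMemcpyAsync(&received, dst, sizeof(int),
                    cudaMemcpyDeviceToHost, stream);
    cudaStreamSynchronize(stream);
    printf("PE %d received %d\n", nvshmem_my_pe(), received);

    nvshmem_free(dst);
    cudaStreamDestroy(stream);
    nvshmem_finalize();
    return 0;
}
```

Launched with the command from the question (`nvshmrun -n 2 -ppn 2`), this would start two processes, one per GPU. Each process has its own host address space, so host data is not shared between PEs automatically.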

