NVSHMEM Setup

guilhermehartmannk8bwe · July 29, 2022, 4:56pm

Is it possible to prototype NVSHMEM solutions using an RTX 3090? I have been testing installations, but I get an error when calling nvshmem_malloc. Is it recommended to use MPI + OpenSHMEM instead? Is there a specific tutorial that would help me understand the problem?

Error as seen:

WARN: GDRCopy open call failed, falling back to not using GDRCopy
src/mem/mem.cpp:416: non-zero status: 101 cuMemCreate failed
src/team/team_internal.cu:287: NULL value nvshmemi_psync_pool allocation failed
src/init/init.cu:762: non-zero status: 2 team setup failed
src/init/init.cu:796: non-zero status: 7 nvshmemi_common_init failed …src/init/init_device.cu:nvshmemi_check_state_and_init:44: nvshmem initialization failed, exiting
src/util/cs.cpp:21: non-zero status: 16: No such file or directory, exiting… mutex destroy failed

Robert_Crovella · July 30, 2022, 6:47pm

From here:

NVSHMEM requires the following hardware:
NVIDIA Data Center GPU of the NVIDIA Volta™ GPU architecture or later.

The RTX 3090 is not a data center GPU.

There are additional documentation and tutorial resources linked from here.
Scroll down to the bottom for the “Resources” section.

guilhermehartmannk8bwe · July 30, 2022, 7:47pm

Hi Robert,

Thank you for the reply. It wasn’t clear that the errors were due to the card not being a datacentre card. I take it that the 3090 is not suitable for developing a library that uses NVSHMEM. What are the recommended alternatives?

Regards,

Guilherme

Robert_Crovella · July 30, 2022, 8:08pm

https://www.nvidia.com/content/dam/en-zz/solutions/data-center/data-center-gpu-portfolio-line-card.pdf

guilhermehartmannk8bwe · August 1, 2022, 8:36am

Thank you Robert. I was trying to find a way to use the same NVSHMEM codebase for both single-GPU and multi-GPU runs. I take it that the solution is to use ifdefs to skip the NVSHMEM calls.
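For illustration, the ifdef approach mentioned above could look roughly like the sketch below. This is only a minimal sketch under assumptions, not a recommended or official pattern: the USE_NVSHMEM macro, the shm namespace, and the wrapper names are hypothetical, and the fallback branch uses host malloc as a placeholder so the file builds without the CUDA toolkit (a real single-GPU build would presumably call cudaMalloc and cudaFree there). The NVSHMEM calls inside the guarded branches are the standard host API (nvshmem_init, nvshmem_my_pe, nvshmem_n_pes, nvshmem_malloc, nvshmem_free, nvshmem_finalize).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// USE_NVSHMEM is a hypothetical build flag (e.g. -DUSE_NVSHMEM), not something
// NVSHMEM defines. Without it, no NVSHMEM header or symbol is referenced.
#ifdef USE_NVSHMEM
#include <nvshmem.h>
#endif

namespace shm {  // hypothetical wrapper namespace

// Start the runtime; a no-op in the single-GPU build.
inline void init() {
#ifdef USE_NVSHMEM
    nvshmem_init();
#endif
}

// Index of this processing element (PE); always 0 without NVSHMEM.
inline int my_pe() {
#ifdef USE_NVSHMEM
    return nvshmem_my_pe();
#else
    return 0;
#endif
}

// Number of PEs in the job; always 1 without NVSHMEM.
inline int n_pes() {
#ifdef USE_NVSHMEM
    return nvshmem_n_pes();
#else
    return 1;
#endif
}

// Allocate a buffer: symmetric heap with NVSHMEM, placeholder otherwise.
inline void* alloc(std::size_t bytes) {
    assert(bytes > 0);
#ifdef USE_NVSHMEM
    return nvshmem_malloc(bytes);
#else
    // Placeholder (assumption): host memory, so the sketch compiles anywhere.
    // Substitute cudaMalloc for a real single-GPU device buffer.
    return std::malloc(bytes);
#endif
}

// Release a buffer obtained from alloc().
inline void release(void* ptr) {
#ifdef USE_NVSHMEM
    nvshmem_free(ptr);
#else
    std::free(ptr);
#endif
}

// Shut the runtime down; a no-op in the single-GPU build.
inline void finalize() {
#ifdef USE_NVSHMEM
    nvshmem_finalize();
#endif
}

}  // namespace shm
```

Application code would then call shm::init(), shm::alloc(), and so on, and never touch NVSHMEM directly, so the same sources build on a machine like the 3090 box (flag off) and on a supported multi-GPU system (flag on). Kernels that use device-side NVSHMEM calls would need the same kind of guard.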
