NVSHMEM fails to compile using nvcc

yao.877 · July 9, 2024, 8:00pm

Hi, I am trying to build NVSHMEM and use it in my program.

I follow the instruction on NVSHMEM Installation to build.

My software are:
OpenMPI: 4.1.4
UCX: 1.13.1
CUDA: 11.8
GCC: 8.5.0
CMAKE: 3.26.3
NVSHMEM: 2.5.0-19
NCCL: 2.23

I use the following command to compile nvshmem:

export CUDA_HOME=/opt/cuda/11.8
export NVSHMEM_USE_GDRCOPY=1
export NVSHMEM_MPI_SUPPORT=1
export MPI_HOME=/ompi/install
export NVSHMEM_GPUINITIATED_SUPPORT=1
export NVSHMEM_PREFIX=/nvshmem_src_2.5.0-19/install
export NVSHMEM_USE_NCCL=1
export NCCL_HOME=/nccl/build
export GDRCOPY_HOME=/opt/gdrcopy/2.3

cd nvshmem_src_2.5.0-19
make -j install

After compilation, I got

which seems succeeded.

Then I use nvcc to compile the offical example:

#include <stdio.h>
#include <cuda.h>
#include <nvshmem.h>
#include <nvshmemx.h>

__global__ void simple_shift(int *destination) {
    int mype = nvshmem_my_pe();
    int npes = nvshmem_n_pes();
    int peer = (mype + 1) % npes;

    nvshmem_int_p(destination, mype, peer);
}

int main(void) {
    int mype_node, msg;
    cudaStream_t stream;

    nvshmem_init();
    mype_node = nvshmem_team_my_pe(NVSHMEMX_TEAM_NODE);
    cudaSetDevice(mype_node);
    cudaStreamCreate(&stream);

    int *destination = (int *) nvshmem_malloc(sizeof(int));

    simple_shift<<<1, 1, 0, stream>>>(destination);
    nvshmemx_barrier_all_on_stream(stream);
    cudaMemcpyAsync(&msg, destination, sizeof(int), cudaMemcpyDeviceToHost, stream);

    cudaStreamSynchronize(stream);
    printf("%d: received message %d\n", nvshmem_my_pe(), msg);

    nvshmem_free(destination);
    nvshmem_finalize();
    return 0;
}

nvcc nvShmemTest.cu -o nvShmemTest -arch=compute_80 -code=sm_80 -L/ompi/install/lib -I/ompi/install/include -L/nccl/build/lib -I/nccl/build/include -I/nvshmem_src_2.5.0-19/install/include -L/nvshmem_src_2.5.0-19/install/lib -lmpi -lnccl -lnvshmem

But it gives the following error:

ptxas fatal   : Unresolved extern function 'nvshmem_my_pe'

It seems like link error but I have configured the correct include and lib path to nvShmem.

Any help is appreciated!!

MatColgrove · July 9, 2024, 9:48pm

Since this forum is for the NVHPC Compilers, I’m moving your post over to the GPU Accelerated Library forum.

-Mat

Robert_Crovella · July 9, 2024, 10:49pm

In this case it is necessary to compile with relocatable device code linking. Try adding -rdc=true to your compile command line.

yao.877 · July 10, 2024, 3:56am

Thanks @Robert_Crovella !!! It works!

system · July 24, 2024, 3:56am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
NVSHMEM program fails to initialize Other Tools	0	353	November 16, 2020
problem making shared library works with gcc but not nvcc CUDA Programming and Performance	1	2019	August 27, 2008
NVSHMEM Compilling GPU-Accelerated Libraries nvshmem	5	785	January 2, 2024
nvcc compile error nvopencc error CUDA Programming and Performance	4	4231	August 15, 2008
NVSHMEM Installation undefined reference to `__sync_synchronize' GPU-Accelerated Libraries nvshmem	2	324	June 13, 2024
Using NVSHMEM in Building Pytorch Operator GPU-Accelerated Libraries	7	1264	April 27, 2021
Failure in installation of nvshmem GPU-Accelerated Libraries cuda , nvshmem	5	490	March 13, 2024
how to use nvcc CUDA Programming and Performance	5	11678	April 18, 2008
nvcc not working CUDA Programming and Performance	0	6606	May 31, 2011
Makefiles and nvcc CUDA Programming and Performance	7	31152	May 29, 2011

NVSHMEM fails to compile using nvcc

Related topics