cuFFTMp compilation problem

Why does the following happen when I compile the cuFFTMp sample code from the CUDA samples and run it with two processes (mpirun -n 2)?

It runs correctly with a single process.

My makefile:
include …/

exe = cufftmp_c2c

all: $(exe)

.PHONY: clean

clean:
	rm -rf $(exe)

$(exe): $(exe).cu
	${CUDA_HOME}/bin/nvcc $< -o $@ -lcuda -std=c++17 \
		-L${CUFFT_LIB} -I${CUFFT_INC} -lcufftMp \
		-I${MPI_HOME}/include -L${MPI_HOME}/lib -l${MPI}

build: $(exe)

run: $(exe)
	LD_LIBRARY_PATH="${NVSHMEM_LIB}:${CUFFT_LIB}:${LD_LIBRARY_PATH}" mpirun -oversubscribe -n 2 $(exe)
MPI_HOME ?= /opt/nvidia/hpc_sdk/Linux_x86_64/22.5/comm_libs/hpcx/hpcx-2.11/ompi
NVSHMEM_LIB ?= /opt/nvidia/hpc_sdk/Linux_x86_64/22.5/comm_libs/11.7/nvshmem/lib/
CUDA_HOME ?= $(shell dirname $$(command -v nvcc))/…
CUFFT_LIB ?= /opt/nvidia/hpc_sdk/Linux_x86_64/22.5/math_libs/11.7/targets/x86_64-linux/lib
CUFFT_INC ?= /opt/nvidia/hpc_sdk/Linux_x86_64/22.5/math_libs/11.7/targets/x86_64-linux/include/cufftmp
ARCH ?= $(shell uname -m)
ifeq ($(ARCH), ppc64le)
MPI ?= mpi_ibm
else
MPI ?= mpi
endif

This is not a compilation problem: the binary builds and runs with one process, so the failure occurs at run time.

You may wish to search for reports of the “mpi failed to bind memory” error when running code inside a container.
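
Before searching, it may help to confirm whether the job really is running inside a container. The script below is a minimal heuristic sketch, not a definitive test. The function name `in_container` is made up for illustration, and the marker files and cgroup patterns it checks are common conventions that not every runtime follows (on cgroup v2 systems the `/proc/1/cgroup` check in particular may find nothing).

```shell
#!/bin/sh
# Heuristic check: are we likely running inside a container?
# These markers are common conventions, not guarantees.
in_container() {
    # Docker usually creates /.dockerenv; Podman usually creates /run/.containerenv.
    if [ -f /.dockerenv ] || [ -f /run/.containerenv ]; then
        return 0
    fi
    # On cgroup v1 systems, PID 1's cgroup path often names the runtime.
    if [ -r /proc/1/cgroup ] && grep -qE 'docker|kubepods|containerd|lxc' /proc/1/cgroup; then
        return 0
    fi
    return 1
}

if in_container; then
    echo "container"
else
    echo "host"
fi
```

If this prints "container", the container's security settings are a plausible place to look, since some runtimes restrict memory-policy system calls by default. Whether that is the cause here is an assumption to verify against the actual error output.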