Running NVSHMEM with a custom-built bootstrap

I built the bootstrap against a working MPICH installation that supports device-to-device (D2D) transfers. But when I run the code with NVSHMEM, I get the following error:

“src/topo/topo.cpp:68: [GPU 3] Peer GPU 0 is not accessible, exiting …
src/init/init.cu:714: non-zero status: 3 building transport map failed
src/topo/topo.cpp:68: [GPU 2] Peer GPU 0 is not accessible, exiting …
src/init/init.cu:714: non-zero status: 3 building transport map failed
src/topo/topo.cpp:68: [GPU 1] Peer GPU 0 is not accessible, exiting …
src/init/init.cu:714: non-zero status: 3 building transport map failed
MPICH ERROR [Rank 0] [job id 856f0066-5f09-464e-abb4-f43c0a029cdb] [Tue Nov 28 10:54:12 2023] [x3005c0s19b1n0] - Abort(139008270) (rank 0 in comm 0): Fatal error in PMPI_Alltoall: Message truncated, error stack:
PMPI_Alltoall(427)…: MPI_Alltoall(sbuf=0xa5a2180, scount=16, MPI_BYTE, rbuf=0xa5a2130, rcount=16, datatype=MPI_BYTE, comm=comm=0x84000002) failed
MPIR_Alltoall_impl(259)…:
MPIR_Alltoall_intra_auto(170)…: Failure during collective
MPIR_Alltoall_intra_auto(166)…:
MPIR_Alltoall_intra_pairwise(95):
progress_recv(174)…: Message from rank 3 and tag 9 truncated; 16 bytes received but buffer size is 40
MPIR_Alltoall_intra_pairwise(95):
MPIDIG_handle_unexp_mrecv(79)…: Message from rank 2 and tag 9 truncated; 16 bytes received but buffer size is 40
MPIR_Alltoall_intra_pairwise(95):
MPIDIG_handle_unexp_mrecv(79)…: Message from rank 3 and tag 9 truncated; 16 bytes received but buffer size is 40”

However, when I run the same code with the same MPI but without NVSHMEM, it performs direct GPU-to-GPU data transfers over NVLink.
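
To make the log above easier to read, here is a minimal Python sketch that summarizes it. It only parses the text I pasted; it does not query the GPUs or MPI. The names `LOG`, `PEER_RE`, `TRUNC_RE`, and `summarize` are made up for illustration, and `LOG` holds only the relevant lines copied from the error output.

```python
import re

# Relevant lines copied from the error output above (duplicates omitted).
LOG = """\
src/topo/topo.cpp:68: [GPU 3] Peer GPU 0 is not accessible, exiting …
src/topo/topo.cpp:68: [GPU 2] Peer GPU 0 is not accessible, exiting …
src/topo/topo.cpp:68: [GPU 1] Peer GPU 0 is not accessible, exiting …
progress_recv(174)…: Message from rank 3 and tag 9 truncated; 16 bytes received but buffer size is 40
MPIDIG_handle_unexp_mrecv(79)…: Message from rank 2 and tag 9 truncated; 16 bytes received but buffer size is 40
"""

# Matches the NVSHMEM topology failures: (reporting GPU, inaccessible peer GPU).
PEER_RE = re.compile(r"\[GPU (\d+)\] Peer GPU (\d+) is not accessible")

# Matches the MPICH truncation messages: (rank, tag, bytes received, buffer size).
TRUNC_RE = re.compile(
    r"Message from rank (\d+) and tag (\d+) truncated; "
    r"(\d+) bytes received but buffer size is (\d+)"
)


def summarize(log):
    """Return the unique peer-access failures and truncation reports, sorted."""
    peers = sorted({(int(g), int(p)) for g, p in PEER_RE.findall(log)})
    truncs = sorted({tuple(int(x) for x in m) for m in TRUNC_RE.findall(log)})
    return peers, truncs


peers, truncs = summarize(LOG)
for gpu, peer in peers:
    print(f"GPU {gpu} reports peer GPU {peer} as not accessible")
for rank, tag, received, buffer_size in truncs:
    print(f"rank {rank}, tag {tag}: {received} bytes received, buffer size {buffer_size}")
```

In short, GPUs 1, 2, and 3 all report that GPU 0 is not accessible, and rank 0 then aborts in `PMPI_Alltoall` with truncation messages involving ranks 2 and 3.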

Can anyone please suggest a solution?