Hi Nvidia Team,
I am evaluating LAMMPS on a DGX-1 by following the documentation at
LAMMPS | NVIDIA NGC, and I am getting the following error:
root@723f39985703:/# mpirun -n 4 --allow-run-as-root lmp -k on g 4 -sf kk -pk kokkos gpu/direct on neigh full comm device binsize 2.8 -var x 8 -var y 8 -var z 8 -in /host_pwd/in.lj.txt
LAMMPS (15 Jun 2020)
KOKKOS mode is enabled (src/KOKKOS/kokkos.cpp:85)
will use up to 4 GPU(s) per node
using 1 OpenMP thread(s) per MPI task
ERROR: Illegal package kokkos command (src/KOKKOS/kokkos.cpp:397)
Last command: package kokkos gpu/direct on neigh full comm device binsize 2.8
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:
Process name: [[61988,1],0]
Exit code: 1
--------------------------------------------------------------------------
I am not sure what is happening at kokkos.cpp:397. Can you suggest how I should debug this issue, and whether there is a known fix?
Can you also review the following LAMMPS documentation? It seems incorrect to me.
LMP_CMD="lmp -k on g -sf kk -pk kokkos gpu/direct on neigh full comm device binsize 2.8 -var x 8 -var y 8 -var z 8 -in /host_pwd/in.lj.txt"
Where:
``: Set to the number of GPU's available per compute system. For a local workstation this is the total number of GPUs.
I could not find `` in the above command. The same goes for the following command in the Docker section.
${DOC_RUN} mpirun -n ${LMP_CMD}
Where: ``: The number of GPUs available on the local workstation. This should match the value in LMP_CMD
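For reference, here is a minimal sketch of how I assumed the missing values should be filled in, based on the command I ran above: the GPU count goes right after `g` and the same number is passed to `mpirun -n`. The variable name `GPU_COUNT` is my own and does not come from the documentation. The script only builds and prints the command, it does not launch LAMMPS.

```shell
#!/bin/sh
# GPU_COUNT is a hypothetical name; 4 is the value I used on my system.
GPU_COUNT=4

# Same LAMMPS command as in the documentation, with the GPU count inserted after "g".
LMP_CMD="lmp -k on g ${GPU_COUNT} -sf kk -pk kokkos gpu/direct on neigh full comm device binsize 2.8 -var x 8 -var y 8 -var z 8 -in /host_pwd/in.lj.txt"

# One MPI rank per GPU, as I understood the "should match the value in LMP_CMD" note.
FULL_CMD="mpirun -n ${GPU_COUNT} --allow-run-as-root ${LMP_CMD}"

# Print the command for inspection instead of running it.
echo "${FULL_CMD}"
```

This expands to exactly the command shown at the top of this post, so it reproduces the same "Illegal package kokkos command" error for me.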