Would you please comment on our use case?
When using DeepOps with a management server and DGX Stations, how should we share the training data stored on a DGX Station?
We would like to avoid having each developer access the DGX Station directly.
I think we should run an NFS server on the DGX Station and use its RAID array as the NFS storage.
If you have any questions, please let me know.
Best regards.
Kaka
Are you using Kubernetes, or Slurm, or just having users use each DGX Station as a standalone system (not controlled with a job scheduler)?
Also, remember that as a DGX customer, you can always contact NVIDIA Enterprise Support ( Enterprise Customer Support | NVIDIA ) to get more real-time assistance with your DGX product - including questions and issues with DeepOps software. We’re happy to use this forum to communicate and help too, but want to make sure you know there’s a formal support path as well!
You’ll want to add the server and client DGX Station systems to the [nfs-server] and [nfs-clients] sections of the deepops/config/inventory file, and then modify deepops/config/group_vars/all.yml to set up the nfs_exports for what you’re exporting on the “server” DGX Station, and similarly where you want it mounted on the “client” DGX Stations. E.g., if you were going to export “/data” on the server and want it visible as “/mnt/data” on the clients, the config would look like:
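Here is a sketch of what that might look like. The hostnames (dgx-station-01, etc.) are placeholders for your own systems, and the export/mount options shown are just common defaults following the DeepOps config templates, so double-check the exact variable names and defaults in the config/group_vars/all.yml shipped with your DeepOps release:

```
# config/inventory (placeholder hostnames)
[nfs-server]
dgx-station-01

[nfs-clients]
dgx-station-02
dgx-station-03
```

```yaml
# config/group_vars/all.yml
# Directory exported from the "server" DGX Station
nfs_exports:
  - path: /data
    options: "*(rw,sync,no_root_squash)"

# Where the "client" DGX Stations mount that export
nfs_mounts:
  - mountpoint: /mnt/data
    server: '{{ groups["nfs-server"][0] }}'
    path: /data
    options: async,vers=3
```

After updating the config, run the NFS server/client playbooks that ship with your DeepOps release against the inventory to apply the changes.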