Best Practice for Using Two DGX Spark Nodes as GPU Workers With Existing WSL2 Service

Hello,

We currently run a production service on Windows + WSL2 (Podman), where our company homepage is deployed.
We have recently added two DGX Spark systems and would like to use them for GPU-accelerated applications (Gradio-based web apps).

Before we redesign everything, we would like to confirm the recommended direction:

Environment

- Windows Server (NGINX reverse proxy + WSL2 Podman deployment)
- Two DGX Spark units (to run GPU workloads)

Questions

  1. Is it recommended to join the two DGX Spark machines into a small Kubernetes cluster and use them as GPU worker nodes, while temporarily keeping the existing homepage in the current WSL2 environment?

  2. For GPU computing on DGX Spark, should we use:

    • Kubernetes + NVIDIA GPU Operator, or

    • Run containers directly with Docker / Podman (or schedule jobs with Slurm) for simplicity at small scale?

  3. If we later migrate everything to Kubernetes, should we replace the existing reverse proxy with a Kubernetes Ingress controller, or is it better to keep the Windows NGINX instance and forward traffic into the cluster?
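For context on question 2, the two approaches we are weighing look roughly like this. This is only a sketch; the image tag and pod name are placeholders, and Option A assumes the NVIDIA Container Toolkit has already generated a CDI spec on the DGX Spark node:

```shell
# Option A: run a GPU container directly with Podman via CDI
# (assumes nvidia-ctk has generated /etc/cdi/nvidia.yaml on the host;
#  the image tag below is a placeholder, not a tested recommendation)
podman run --rm --device nvidia.com/gpu=all \
    nvcr.io/nvidia/pytorch:24.08-py3 nvidia-smi

# Option B: Kubernetes + NVIDIA GPU Operator — the pod declares a GPU
# request and the scheduler places it on a DGX Spark worker node
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: gradio-gpu-test   # placeholder name
spec:
  restartPolicy: Never
  containers:
  - name: app
    image: nvcr.io/nvidia/pytorch:24.08-py3   # placeholder image
    command: ["nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: 1
EOF
```

Option A keeps operations simple on two nodes; Option B gives us declarative scheduling across both machines at the cost of running a cluster.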

We are simply looking for best-practice guidance on choosing the right direction before committing to an architecture migration.

Thank you.