Creating a JAX container and deploying a Flask app on Kubernetes with GPU access

  • Our model requires an NVIDIA GPU and is integrated with JAX; it works fine on a VM.
  • We have now built a container and deployed it on Kubernetes (K8s).
  • Installing the JAX library on top of a Python base image leads to many dependency issues.
  • We are looking at the Docker images at https://github.com/NVIDIA/JAX-Toolbox/tree/main for reference.
  • We are looking for support on how to build a container image with JAX and NVIDIA GPU support and deploy it on K8s, with Flask serving requests.

Hi, I think you’re on the right track using a Docker image (e.g. ghcr.io/nvidia/jax:jax) from JAX Toolbox.

If your use case is inference, I would also recommend Triton Inference Server. You can use the Python backend with JAX to run inference.

  • Can you point me to the Dockerfile in the repository that builds the JAX image, so I can understand the versions?
  • Can you provide a sample for building the image, deploying it to K8s on an NVIDIA GPU instance, and testing it?

Here are some additional resources:

JAX Dockerfile

JAX Toolbox model server

JAX on GKE tutorial

Thank you for the references. Let me test this in my cluster; if I run into any issues with the docs, I will ask you.

I've updated my base image and included the necessary steps in my Dockerfile for my application. Here is my current Dockerfile:

FROM ghcr.io/nvidia/jax:jax

RUN apt-get update && apt-get install -y net-tools && apt-get clean && rm -rf /var/lib/apt/lists/*

# NOTE: the base image already ships a CUDA-enabled JAX build; these reinstalls
# may replace it and are likely unnecessary.
RUN pip install -U --pre jaxlib -f https://storage.googleapis.com/jax-releases/jaxlib_nightly_cuda12_releases.html && \
    pip install --upgrade "jax[cuda12_local]" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html && \
    pip install -U "jax[cuda12_pip]" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html

RUN pip install --upgrade transformers flax

# For GPU support with PyTorch
RUN pip3 install torch torchvision torchaudio

WORKDIR /app

# Install dependencies before copying the source so this layer stays cached
COPY requirements.txt .
RUN pip install -r requirements.txt

COPY . /app

CMD ["python", "app.py"]

I need to ensure that my Docker container can access the GPU. Here are my questions:

  1. Running the Image with GPU Access: How should I run my Docker image to ensure it can access the GPU? Do I need to use specific Docker runtime options?
  2. Kubernetes Setup: Do I need to install any NVIDIA drivers or perform additional setup on my Kubernetes worker nodes to enable GPU access for my container?

Any guidance or suggestions would be greatly appreciated. Thank you.
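As a rough sketch of both questions: Docker exposes GPUs through the `--gpus` flag (which needs the NVIDIA Container Toolkit on the host), and a Kubernetes pod asks for a GPU through the `nvidia.com/gpu` resource limit (which needs the NVIDIA driver and device plugin on the worker node). The image name, pod name, and port 5000 below are assumptions for illustration, not values from this thread.

```python
import json
import shlex

# Hypothetical names for illustration; replace with your own image and pod name.
IMAGE = "registry.example.com/jax-flask-app:latest"
POD_NAME = "jax-flask-gpu"


def docker_run_command(image: str, port: int = 5000) -> list[str]:
    """Build a `docker run` command that exposes all host GPUs to the container.

    `--gpus all` only works when the NVIDIA Container Toolkit is installed on the host.
    The port is an assumption (Flask's default); use whatever app.py listens on.
    """
    return ["docker", "run", "--rm", "--gpus", "all", "-p", f"{port}:{port}", image]


def gpu_pod_manifest(name: str, image: str, gpus: int = 1, port: int = 5000) -> dict:
    """Build a Pod manifest that requests GPUs via the `nvidia.com/gpu` resource.

    The pod is only schedulable on nodes where the NVIDIA device plugin
    advertises that resource.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": "app",
                    "image": image,
                    "ports": [{"containerPort": port}],
                    # GPUs are requested under limits, in whole units.
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


if __name__ == "__main__":
    # Command to try the image locally on a GPU host.
    print(shlex.join(docker_run_command(IMAGE)))
    # Manifest as JSON; kubectl accepts JSON as well as YAML.
    print(json.dumps(gpu_pod_manifest(POD_NAME, IMAGE), indent=2))
```

The JSON output can be saved to a file and passed to `kubectl apply -f`. For the service to be reachable from outside the container, the Flask app also has to listen on 0.0.0.0 rather than the default 127.0.0.1.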

OMG, does this represent the mountain I have to climb? I have been in IT my entire career and I do not understand any of this alien gorgon tech talk. What’s in the flask? What gubbernets? JAX? But I do know what Python is.

We’re diving into some cool tech stuff! First off, we’re using Flask, a web application framework in Python. It’s awesome for building web apps super fast and with minimal code.

Then, there’s Kubernetes, also known as “K8s” or “gubbernets” in some circles. This open-source platform is a lifesaver for automating deployment, scaling, and managing containerized applications. It’s like the conductor orchestrating a complex symphony of apps in production.

Now, onto the juicy part: we’ve got some Python code and JAX in the mix to run our transformer model. And guess what? Our infrastructure is sitting pretty in a Kubernetes cluster in Google Cloud Platform (GCP), with NVIDIA A100 GPUs to power things up.

Here’s the kicker: we’re itching to build a Docker image based on JAX and unleash the beast of GPU services inside Kubernetes. We’ve been tinkering around and found a neat blog that kinda explains it, but we’re still not crystal clear on how the GPU device connects to the pod. Check it out here: https://cloud.google.com/blog/products/containers-kubernetes/machine-learning-with-jax-on-kubernetes-with-nvidia-gpus

Exciting times ahead as we dive deeper into the tech!
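One way to see whether the GPU actually reached the pod is to ask JAX what it can see from inside the container. The sketch below is a minimal check under the assumption that it runs inside the pod; the helper name `describe_jax_devices` is made up for illustration, and the function falls back gracefully when JAX is not installed.

```python
def describe_jax_devices() -> dict:
    """Report the backend and devices JAX can see in this process."""
    try:
        import jax
    except ImportError:
        # JAX is missing from this environment, so nothing can be reported.
        return {"jax_installed": False, "backend": None, "devices": []}
    return {
        "jax_installed": True,
        # "gpu" when a CUDA device is usable, otherwise typically "cpu".
        "backend": jax.default_backend(),
        "devices": [str(d) for d in jax.devices()],
    }


if __name__ == "__main__":
    info = describe_jax_devices()
    print(info)
    if info["backend"] != "gpu":
        print("No GPU backend detected; check the pod's nvidia.com/gpu limit "
              "and the node's driver and device plugin setup.")
```

Saved as a script in the image, this can be run with `kubectl exec` against the running pod. If the backend comes back as CPU, the pod most likely was not granted a GPU or the image's JAX build does not match the node's driver.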

Looking for some quick help to build a Docker image and deploy a sample.

I am a recently retired I.T. manager. I have 12 years of experience as a database analyst and 3 years as a systems analyst (plus 25 years as a manager). I did this work primarily as a COBOL programmer, with SQL and a dash of various 5 generation languages. In this new world of AI I find myself starting as a raw newbie. Acronyms and terminology are an obstacle.
What is JAX? What is a Docker image? What is the POD? What I am interested in is a real time emotionally expressive anime avatar fully controlled by the chat AI. I am sure you have heard of the popular “VTubers” on YouTube. These animations are run by human actors using real time motion capture (MoCap) and real time rendering of 3D anime models carefully crafted to be expressive without too much processing overhead. I want to replace the human actors. I understand the limited space and scope above for your announcement.
Generally speaking I have been frustrated by the terminology in github, hugging face, and even the Nvidia developer forum. I am currently experimenting with Nvidia’s ChatRTX, an alpha product under development.
This is just FYI, I do not expect a response.