GB10 Platform Does Not Support Nemotron-Parse Container

The Nemotron-Parse container does not appear to support the GB10 platform, since GB10 is ARM-based and the container is not built for that architecture.

You can run it from source instead:

git lfs install
git clone https://<username>:<token with write access>@huggingface.co/nvidia/NVIDIA-Nemotron-Parse-v1.1
cd NVIDIA-Nemotron-Parse-v1.1
uv venv --python 3.12 --seed
source .venv/bin/activate
uv pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu130
uv pip install "git+https://github.com/raphaelamorim/vllm.git@nemotron_parse" 
uv pip install timm albumentations bs4 open_clip_torch
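Before running inference, it can help to sanity-check that the packages installed above are actually importable in the venv. A minimal sketch using only the standard library; the import names below are assumptions mapped from the pip package names (e.g. bs4 and open_clip):

```python
# check_env.py -- verify that the installed packages can be imported
# before attempting inference. find_spec only checks importability,
# not versions or CUDA support.
from importlib.util import find_spec


def missing_modules(modules):
    """Return the subset of module names that cannot be found."""
    return [m for m in modules if find_spec(m) is None]


if __name__ == "__main__":
    required = ["torch", "torchvision", "vllm", "timm",
                "albumentations", "bs4", "open_clip"]
    gaps = missing_modules(required)
    if gaps:
        print("Missing:", ", ".join(gaps))
    else:
        print("All required modules are importable.")
```

If anything is listed as missing, re-run the corresponding uv pip install step before proceeding.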

Create a file inference.py:

from vllm import LLM, SamplingParams
from PIL import Image


sampling_params = SamplingParams(
    temperature=0,
    top_k=1,
    repetition_penalty=1.1,
    max_tokens=9000,
    skip_special_tokens=False,
)

llm = LLM(
    model="nvidia/NVIDIA-Nemotron-Parse-v1.1",
    max_num_seqs=64,
    limit_mm_per_prompt={"image": 1},
    dtype="bfloat16",
    trust_remote_code=True,
)

image = Image.open("./equation.png")

prompts = [
    {  # Single prompt with inline multimodal data
        "prompt": "</s><s><predict_bbox><predict_classes><output_markdown>",
        "multi_modal_data": {
            "image": image
        },
    },
    {  # Explicit encoder/decoder prompt
        "encoder_prompt": {
            "prompt": "",
            "multi_modal_data": {
                "image": image
            },
        },
        "decoder_prompt": "</s><s><predict_bbox><predict_classes><output_markdown>",
    },
]

outputs = llm.generate(prompts, sampling_params)

for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    print(f"Decoder prompt: {prompt!r}, Generated text: {generated_text!r}")

Run python inference.py. If everything works, it is straightforward to build a container image from this setup.
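Since skip_special_tokens=False is set in the sampling params, the generated text will still carry the task tokens from the prompt (e.g. &lt;output_markdown&gt;, &lt;/s&gt;). A small sketch for cleaning them out of the output, assuming the token names used in the prompt above and that the markdown itself contains no such tags:

```python
import re

# Matches the special tokens seen in the decoder prompt, in either
# opening or closing form: <s>, </s>, <predict_bbox>, etc.
SPECIAL_TOKEN_RE = re.compile(
    r"</?(?:s|predict_bbox|predict_classes|output_markdown)>"
)


def strip_special_tokens(text: str) -> str:
    """Remove the Nemotron-Parse task/special tokens from raw output."""
    return SPECIAL_TOKEN_RE.sub("", text).strip()
```

For example, strip_special_tokens applied to a generation that starts with the prompt tokens leaves only the markdown body.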

Thanks! I finally solved the problem using the steps below.

docker-compose.yml

services:
  nemotron-parse2503:
    container_name: nemotron-parse2503
    working_dir: /app
    ports:
      - "8910:8000"
    build: .
    volumes:
      - ${HOME}/.cache/huggingface/:/root/.cache/huggingface/
      - ${HOME}/project/nemotron-parse:/app
    environment:
      - TZ=Asia/Taipei

    gpus: all
    ipc: host
    ulimits:
      memlock: -1
      stack: 67108864

    stdin_open: true
    tty: true
    command: bash

Dockerfile

FROM nvcr.io/nvidia/pytorch:25.03-py3

WORKDIR /app

COPY requirements.txt /tmp/requirements.txt
RUN pip install --upgrade pip \
 && pip install -r /tmp/requirements.txt

COPY . /app

requirements.txt

accelerate==1.12.0
albumentations==2.0.8
transformers==4.51.3
timm==1.0.22
