Hello NVIDIA Community,
We’re working on a project to detect exams in medical requisition images using the Gemma 3-27b-it model. Our preliminary tests with this model have shown great promise.
However, we noticed that Gemma 3-27b-it is not yet available on the NGC (NVIDIA GPU Cloud) Docker registry, and we’re aiming to apply for the USD 100k credit program through NVIDIA’s Startup Program. One of the requirements states:
“Your company has demonstrated NVIDIA technology adoption, including GPU usage and SDK/API usage on AWS.”
Here are our questions:
- Is it acceptable to spin up an AWS GPU instance (e.g., loading the model weights from Hugging Face) to run Gemma 3-27b-it and still meet the requirement for NVIDIA technology adoption?
- Does this approach count as using NVIDIA GPUs/SDKs even though we’re not pulling a container directly from the NGC registry?
- Do we need to wait for the official availability of the Gemma 3-27b-it model on the NGC registry before we can demonstrate our GPU usage on AWS for the credits?
- Alternatively, can we use NVIDIA’s SDK or REST API without necessarily using a GPU on AWS and still fulfill the requirement?
Any guidance or experiences from the community or NVIDIA representatives would be greatly appreciated.
Thank you very much!
- Yes, spinning up an AWS GPU instance (such as a g5.2xlarge, p4d, or p5) and running the Gemma 3-27b-it model on it counts as demonstrating NVIDIA GPU usage for the Startup Program requirement.
You are leveraging NVIDIA GPUs via AWS, which satisfies the GPU adoption criterion.
- Yes, it still counts.
While pulling containers from NGC is encouraged, it is not mandatory for credit eligibility.
The key requirement is active use of NVIDIA GPUs and, preferably, NVIDIA SDKs/APIs, whether your workload is built on Hugging Face models, custom code, or something else.
- No, you do not need to wait.
You can proceed now with demonstrating your use of Gemma 3-27b-it on an AWS GPU instance.
In your application, you can state that you’re using NVIDIA GPU-backed instances, even though the model isn’t yet hosted on NGC.
- Technically yes, but it is strongly preferred to demonstrate both:
Use of NVIDIA GPUs (e.g., via AWS EC2 GPU instances), and
Use of NVIDIA SDKs (e.g., TensorRT-LLM, Triton Inference Server, or DeepStream).
Using only REST APIs without showing GPU usage may weaken your case for the credits.
Additional Guidance:
Since Gemma 3-27b-it is available on NVIDIA’s build.nvidia.com platform, you can call it directly from there to set up your inference:
Quickstart snippet:
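A minimal sketch, assuming the OpenAI-compatible endpoint that build.nvidia.com exposes at https://integrate.api.nvidia.com/v1 and the model ID google/gemma-3-27b-it (confirm both on the model card, and generate an API key on build.nvidia.com first):

```python
# Minimal sketch: calling Gemma 3-27b-it through the OpenAI-compatible
# endpoint on build.nvidia.com. Endpoint URL and model ID are assumptions;
# verify them on the model card before use.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",
    api_key=os.environ["NVIDIA_API_KEY"],  # key generated on build.nvidia.com
)

completion = client.chat.completions.create(
    model="google/gemma-3-27b-it",
    messages=[{
        "role": "user",
        "content": "List the exams requested in this requisition: ...",
    }],
    temperature=0.2,
    max_tokens=512,
)
print(completion.choices[0].message.content)
```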
Or alternatively:
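A sketch of running the weights from Hugging Face directly on an AWS GPU instance; it assumes a recent transformers release with Gemma 3 support, that you have accepted the gated model license on the Hub, and enough GPU memory for the 27B weights in bf16:

```python
# Sketch: loading Gemma 3-27b-it from Hugging Face on an AWS GPU instance.
# Assumes a recent transformers version with Gemma 3 support and accepted
# access to the gated checkpoint on the Hugging Face Hub.
import torch
from transformers import pipeline

pipe = pipeline(
    "image-text-to-text",          # Gemma 3 is multimodal (image + text input)
    model="google/gemma-3-27b-it",
    device_map="auto",             # shard the 27B weights across available GPUs
    torch_dtype=torch.bfloat16,
)

messages = [{
    "role": "user",
    "content": [
        # hypothetical requisition image URL; replace with your own input
        {"type": "image", "url": "https://example.com/requisition.png"},
        {"type": "text", "text": "Which exams are requested in this requisition?"},
    ],
}]
out = pipe(text=messages, max_new_tokens=256)
print(out[0]["generated_text"][-1]["content"])
```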
You can optimize and deploy your LLM on AWS instances using NVIDIA TensorRT-LLM (TRT-LLM) for best performance:
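As a sketch, the high-level LLM API in TensorRT-LLM can build and run an optimized engine from the Hugging Face checkpoint; this assumes tensorrt_llm is installed on the instance and that your TRT-LLM release lists Gemma 3 among its supported architectures (check the supported-models list first):

```python
# Sketch: optimized inference with the TensorRT-LLM high-level LLM API.
# Assumes tensorrt_llm is installed and that your TRT-LLM release supports
# the Gemma 3 architecture; verify against the supported-models list.
from tensorrt_llm import LLM, SamplingParams

llm = LLM(model="google/gemma-3-27b-it")  # builds/loads an optimized engine
sampling = SamplingParams(temperature=0.2, max_tokens=256)

outputs = llm.generate(
    ["List the exams requested in this requisition: ..."],
    sampling,
)
for out in outputs:
    print(out.outputs[0].text)
```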
Using TRT-LLM shows stronger NVIDIA technology adoption (lower latency, higher efficiency), which may strengthen your startup credit application.