Hi,
The picture shows results from the AI Lab Benchmarks. Where are those VLM models from, and where is the test sample code?
I have tried the Ollama llava 1.6-7B model, and I get more than 3 tokens/sec on a Jetson Orin Nano 8GB Super.
But in the picture, llava 1.6-7B only reaches 0.57 tokens/sec. Why?
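For reference, one way the "more than 3 tokens/sec" figure can be reproduced: Ollama's `/api/generate` JSON response reports `eval_count` (generated tokens) and `eval_duration` (generation time in nanoseconds), from which decode throughput follows directly. A minimal sketch; the response values below are made up for illustration, not real measurements:

```python
# Compute decode throughput from the eval_count / eval_duration fields
# that Ollama's /api/generate response returns (duration is nanoseconds).

def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    """Decode throughput in tokens/sec."""
    return eval_count / (eval_duration_ns / 1e9)

# Hypothetical response values, for illustration only:
response = {"eval_count": 96, "eval_duration": 30_000_000_000}  # 30 s
rate = tokens_per_second(response["eval_count"], response["eval_duration"])
print(f"{rate:.2f} tokens/sec")  # → 3.20 tokens/sec
```

Running `ollama run <model> --verbose` prints a similar "eval rate" directly in the terminal.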
Hi,
The score is generated with MLC.
You can find the script at the link below:
Thanks.
That looks like the LLM benchmark.
But I want the one for VLMs.
Thanks
Hi,
The VLM benchmark is generated with the Hugging Face script with 4-bit quantization.
Thanks.
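To illustrate what 4-bit quantization means for the weights (this is an illustrative pure-Python sketch of symmetric 4-bit rounding, not the actual Hugging Face/bitsandbytes implementation): each weight is mapped to one of 16 signed levels, cutting memory to roughly a quarter of FP16. Different quantization schemes and runtimes is one plausible reason the benchmark number differs from an Ollama run.

```python
# Illustrative symmetric 4-bit quantization of a weight vector.
# Hypothetical helper names; not the benchmark code itself.

def quantize_4bit(weights):
    """Map each float weight to a signed 4-bit integer in [-7, 7]."""
    scale = max(abs(w) for w in weights) / 7 or 1.0
    q = [max(-7, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize_4bit(q, scale):
    """Recover approximate float weights from the 4-bit codes."""
    return [v * scale for v in q]

weights = [0.12, -0.45, 0.33, -0.07, 0.50]
q, scale = quantize_4bit(weights)
approx = dequantize_4bit(q, scale)
# Each entry of q fits in 4 bits; approx is a lossy reconstruction.
```

The reconstruction error introduced here is why quantized models trade some accuracy for memory and speed.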
This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.