I am trying to use the GPU with qwen.cpp. When I run CPU-only it works fine, but when I add the GPU, memory usage bursts and it runs out of memory. Please tell me why? Thanks.
Hello,
Thanks for visiting the NVIDIA Developer forums! Your topic will be best served in the Jetson category.
I will move this post over for visibility.
Cheers,
Tom
Hi,
We have examples for using the GPU in graphics and deep-learning use cases. Please check:
/usr/src/nvidia/graphics_demos/
/usr/src/jetson_multimedia_api/samples/04_video_dec_trt/
/usr/src/jetson_multimedia_api/samples/18_v4l2_camera_cuda_rgb/
GitHub - dusty-nv/jetson-inference: Hello AI World guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA Jetson.
NVIDIA Metropolis - NVIDIA Docs
You can check these examples and develop your use case based on them.
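For reference, here is a minimal sketch of the kind of GPU-accelerated inference the Hello AI World guide above walks through, using the jetson-inference Python bindings. The network name and image path are placeholders; adapt them to your own setup.

```python
#!/usr/bin/env python3
# Minimal image-classification sketch based on the jetson-inference
# "Hello AI World" examples. Assumes jetson-inference and jetson-utils
# are installed; the image path below is a placeholder.
import jetson_inference
import jetson_utils

# Load a TensorRT-accelerated classification network (runs on the GPU).
net = jetson_inference.imageNet("googlenet")

# Load an image into memory shared between the CPU and GPU.
img = jetson_utils.loadImage("my_image.jpg")

# Classify the image and print the top result.
class_id, confidence = net.Classify(img)
print(f"{net.GetClassDesc(class_id)} ({confidence * 100:.1f}%)")
```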
Hi,
Running a VLM requires memory to load the model.
For example, you can find some memory-usage figures for llama:
In our previous test, it required at least 8GB of memory to run Qwen-2B quantized to 4 bits.
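For a rough sense of where that 8GB goes (a back-of-envelope sketch, not an official breakdown): the 4-bit weights of a 2B-parameter model are only about 1 GB, so most of the footprint comes from the runtime rather than the weights themselves.

```python
# Back-of-envelope estimate of the raw weight size for a quantized model.
# This illustrates why the total memory use (>= 8 GB observed above for
# Qwen-2B at 4 bits) is dominated by runtime overhead -- KV cache,
# activation buffers, and the CUDA context -- rather than the weights.
def weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GB."""
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9

w = weights_gb(2.0, 4)  # ~1 GB of 4-bit weights for a 2B-parameter model
print(f"raw weights: ~{w:.1f} GB; observed total was >= 8 GB, "
      f"so overhead accounts for ~{8 - w:.0f} GB")
```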
Thanks.
Thanks.