I am new to Triton Inference Server on Jetson and am trying NVIDIA's Triton Jetson example concurrency_and_dynamic_batching on a Jetson Nano (4 GB RAM).
When converting the model's .etlt file to a TensorRT engine, the Nano reports insufficient device memory and skips the affected tactics:
[ERROR] Tactic Device request: 2168MB Available: 1536MB. Device memory is insufficient to use tactic.
[WARNING] Skipping tactic 1 due to oom error on requested size of 2168 detected for tactic 1.
[INFO] Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
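To put a number on the gap reported above, here is a minimal sketch that extracts the requested and available sizes from log lines of this form and computes the shortfall. The function name `tactic_shortfalls` and the regular expression are illustrative only, not part of TensorRT or Triton, and the pattern assumes the message format is exactly as quoted in this post.

```python
import re

# Matches the TensorRT message quoted in this post; the exact format is an
# assumption based on that single log line.
TACTIC_OOM = re.compile(r"Tactic Device request: (\d+)MB Available: (\d+)MB")


def tactic_shortfalls(log_text):
    """Return (requested_mb, available_mb, shortfall_mb) for each matching line."""
    results = []
    for match in TACTIC_OOM.finditer(log_text):
        requested = int(match.group(1))
        available = int(match.group(2))
        results.append((requested, available, requested - available))
    return results


if __name__ == "__main__":
    sample = (
        "[ERROR] Tactic Device request: 2168MB Available: 1536MB. "
        "Device memory is insufficient to use tactic."
    )
    for requested, available, shortfall in tactic_shortfalls(sample):
        print(f"requested={requested}MB available={available}MB shortfall={shortfall}MB")
```

For the line quoted here, the tactic asks for 632MB more than the 1536MB reported as available.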
I have freed up RAM by disabling the GUI.
- At idle, RAM usage = 0.5G/4.1GB (lfb 376x4MB)
- During model conversion, RAM usage = ~3G/4.1GB (lfb 87x4MB)
Attached is the device status at runtime. I am not sure whether this is caused by RAM: I tried increasing swap and freeing up more RAM, but the error message is the same and the reported available memory (1536MB) does not improve.
Has anyone successfully run this Jetson example on a Nano? Do I need to modify configs such as the batch size, and if so, how?
If not, can anyone point me to an alternative Triton example on Jetson with a smaller model? Thanks!
JetPack 4.6 [L4T 32.6.1]