"Device memory is insufficient to use tactic" warning, but there is enough memory

Description

I run inference in Docker containers. Everything works fine with nvcr.io/nvidia/tensorflow:24.01-tf2-py3 (which includes TensorRT 8.6.1). However, after switching to nvcr.io/nvidia/tensorflow:25.01-tf2-py3 (which includes TensorRT 10.8.0), I started seeing unusual log messages:

2025-03-25 14:41:40.365859: W tensorflow/compiler/tf2tensorrt/utils/trt_logger.cc:83] TF-TRT Warning: DefaultLogger Tactic Device request: 1MB Available: 40513MB. Device memory is insufficient to use tactic.

These warnings appear repeatedly, even though there seems to be plenty of available memory. What exactly is it complaining about? Is this behavior expected?
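To show how often this happens and with what numbers, the warnings could be pulled out of a saved log with something like the sketch below. It is only an illustration: the regular expression is based on the single warning line quoted above, and the helper names `parse_tactic_warnings` and `suspicious` are made up for this post, not part of TensorFlow or TensorRT.

```python
import re

# Pattern derived from the one warning line quoted above (an assumption:
# other TensorRT versions may word this message differently).
TACTIC_RE = re.compile(r"Tactic Device request: (\d+)MB Available: (\d+)MB")


def parse_tactic_warnings(lines):
    """Return (requested_mb, available_mb) tuples for every matching line."""
    results = []
    for line in lines:
        match = TACTIC_RE.search(line)
        if match:
            results.append((int(match.group(1)), int(match.group(2))))
    return results


def suspicious(pairs):
    """Keep entries where the request is smaller than the reported available memory."""
    return [(req, avail) for req, avail in pairs if req < avail]
```

Feeding it the lines of a captured log (for example `open("run.log")`, a hypothetical file name) gives a list that can be counted or summarized to show that the requests are far below the reported available memory.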

Environment

nvcr.io/nvidia/tensorflow:25.01-tf2-py3

Hi @xxHn-pro ,
You probably need to clear your CPU memory first. Check `free -m` before starting TRT.

The filesystem cache is one likely reason for high CPU memory usage. The GPU driver is unaware of this memory and cannot tell the kernel to free filesystem cache in order to make memory available to the GPU.
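If it is easier to check this from inside the Python process, here is a minimal sketch that reads the same kind of numbers from `/proc/meminfo` on Linux. The function names are hypothetical, and the buff/cache figure is only approximate, because `free` also counts reclaimable slab memory.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of values in MiB.

    Most fields are reported in kB; fields without a unit (for example
    the HugePages counters) are converted the same way, so only the
    memory fields should be relied on.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if not sep or not parts:
            continue
        fields[name.strip()] = int(parts[0]) // 1024
    return fields


def summarize(fields):
    """Pick the values that roughly correspond to the columns of `free -m`."""
    return {
        "total": fields.get("MemTotal", 0),
        "free": fields.get("MemFree", 0),
        # Approximation: `free` also adds reclaimable slab memory here.
        "buff/cache": fields.get("Buffers", 0) + fields.get("Cached", 0),
        "available": fields.get("MemAvailable", 0),
    }


def read_meminfo(path="/proc/meminfo"):
    """Read and summarize the host memory state (Linux only)."""
    with open(path) as handle:
        return summarize(parse_meminfo(handle.read()))
```

Calling `read_meminfo()` right before the TRT conversion starts would show whether a large share of memory is sitting in buff/cache at that point.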

Please let me know if you can give this a try and whether it removes the warning.

Hi, @AakankshaS

I ran the model, and the same warning appeared, so the problem is reproducible.

Then I ran the recommended command.

Singularity> free -m
              total        used        free      shared  buff/cache   available
Mem:         257598       44389      230114          15       20581      213209
Swap:         32767         132       32635

It seems that plenty of memory is available.

Then I ran the model again, and there were still tons of warnings. Is there anything else I can try?