VLA model weights occasionally overwritten when calling services on Thor

We encountered an intermittent problem. When using a service on Thor to run VLA model inference on the same input data, different outputs sometimes appear. After debugging, we found that on the first request into the service node, there is some probability that the loaded model parameters change, as if the memory holding the parameters had been overwritten. We launch the service node as follows.

    def run(self):
        # Build the FastAPI app and register predict_action as the /act handler
        self.app = FastAPI()
        self.app.post("/act")(self.predict_action)
        # Start serving; uvicorn.run blocks until the server is stopped
        uvicorn.run(self.app, host=self.args.host, port=self.args.port)
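
For completeness, the client side is a plain HTTP POST to /act. The payload below is illustrative only (our real request carries images and robot state); a minimal sketch:

    import requests

    # Hypothetical payload shape; replace <thor-ip> with the server address.
    resp = requests.post(
        "http://<thor-ip>:8000/act",
        json={"observation": [0.0, 0.1, 0.2]},
    )
    print(resp.json())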

At the same time, we found that this problem does not occur if FastAPI is not used, and it also does not occur when using FastAPI on Orin. The base environment uses the officially recommended versions; we used the Docker image nvcr.io/nvidia/pytorch:25.08-py3.
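
Our non-FastAPI control amounts to calling the model directly in-process. A minimal sketch of that determinism check, with a toy Linear layer standing in for the real VLA model:

    import torch

    # Toy stand-in for the real VLA model; the control experiment simply
    # calls the model directly, with no FastAPI/uvicorn in the loop.
    torch.manual_seed(0)
    model = torch.nn.Linear(8, 4).eval()
    x = torch.randn(1, 8)

    with torch.no_grad():
        first = model(x)
        for _ in range(10):
            # Without FastAPI, repeated calls on the same input are bit-identical.
            assert torch.equal(model(x), first)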

Reloading the model weights on the first FastAPI call temporarily works around the problem, but we would like to know what could cause the parameters to be overwritten when FastAPI is called. We can provide any additional information if required.
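
For reference, this is roughly how we detect the overwrite: hash the parameter bytes right after loading, then compare inside the first request. The helper below is our own sketch (weights_digest is not a library function), shown with a toy model in place of the real one:

    import hashlib
    import torch

    def weights_digest(model: torch.nn.Module) -> str:
        # Hash every parameter tensor's bytes; any silent overwrite of the
        # weight memory changes the digest.
        h = hashlib.sha256()
        for name, tensor in sorted(model.state_dict().items()):
            h.update(name.encode())
            # Cast to float32 so dtypes numpy lacks (e.g. bfloat16) hash cleanly.
            h.update(tensor.detach().to(torch.float32).cpu().numpy().tobytes())
        return h.hexdigest()

    model = torch.nn.Linear(4, 2)  # stand-in for the real VLA model
    baseline = weights_digest(model)
    # Re-computing the digest inside the first /act call is how we confirmed
    # the parameters had changed on Thor.
    assert weights_digest(model) == baseline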

Hi,

Have you tried the same steps on other devices, such as an x86 machine?
This sounds like a problem with FastAPI.

Without using FastAPI, does the same issue occur on Orin?

Thanks.

We suspected FastAPI as well, but it works well on Orin, whether or not FastAPI is used.

Hi,

Do you use the same FastAPI version on Thor?
If not, could you give it a try?

Thanks.

We changed to the same version and got the same results. We have run many experiments, and this problem only occurs when using FastAPI on Thor.