We are using Jetson Nano boards in the project, several dozen units. A Triton server is installed on the Jetsons to run models in TensorRT format. All of the Jetsons are identical, with the same power supplies, SD cards, housings, fans, and software.
But on some of the Jetsons, after a random amount of time (from seconds to hours), the Triton server stops working with the following error:
To give a further suggestion, could you share the detailed steps to reproduce this error in our environment?
Please also share the failure rate and roughly how long it takes to reproduce.
perf_client (7.9 MB) It turned out to be reproducible using perf_client from the release in the Triton version 2.0 repository, with the command “./perf_client -m detection” executed multiple times.
Failed to maintain requested inference load. Worker thread(s) failed to generate concurrent requests.
Thread [0] had error: pinned buffer: failed to perform CUDA copy: the launch timed out and was terminated
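In case it helps, this is roughly how the command was looped; a minimal sketch assuming the stock perf_client binary and the “detection” model name from above, with an arbitrary run count:

```bash
# Repeatedly run perf_client against the "detection" model until a run fails;
# on the affected units the pinned-buffer CUDA copy error shows up after a
# few iterations. The iteration count (50) is an arbitrary choice.
for i in $(seq 1 50); do
    echo "=== perf_client run $i ==="
    ./perf_client -m detection || break   # stop at the first failing run
done
```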
We tried to reproduce this on multiple Nanos, running the connection more than 50 times,
but could not reproduce the error you shared.
Our guess is that this error might be related to the connection pattern.
Would you mind sharing more details about how you connect to the server?
(e.g., multiple connections at the same time?)
I managed to figure out the error. A link to a similar case helped a lot: TensorRT inference context in ROS callback - #11 by ec2020. The error was caused by the combination of multithreading and CUDA. We ran the Triton server in a Docker container and had set CPU limits on the container.
For unknown reasons, on some Jetsons everything worked perfectly, while on others this error appeared after a while. Everything was fixed by completely removing the CPU limits on the container and updating the Jetsons to the latest JetPack version (4.5). Thanks for the help!
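For anyone hitting the same thing, here is a rough sketch of the change on the container side. The image tag, model path, and the specific --cpus value are placeholders for illustration, not the exact original configuration:

```bash
# Placeholder values: image tag "triton-jetson:latest", model path /opt/models,
# and the --cpus value are illustrative only.

# Before: Triton started with a CPU limit on the container; on some units this
# eventually led to "pinned buffer: failed to perform CUDA copy: the launch
# timed out and was terminated".
docker run --runtime nvidia --cpus=2 \
    -v /opt/models:/models \
    triton-jetson:latest \
    tritonserver --model-repository=/models

# After: the same command with the CPU limit removed (and the board updated
# to JetPack 4.5); the error no longer appeared.
docker run --runtime nvidia \
    -v /opt/models:/models \
    triton-jetson:latest \
    tritonserver --model-repository=/models
```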
I am facing the exact same error while serving on a Jetson. I assume it may be an issue related to the CUDA context (not exactly sure). Can you please elaborate on how you solved it? It would be a great help.