We are currently running a two-stage DeepStream pipeline with a PGIE (object detection) and an SGIE (object detection). We have observed that after a while the swap space on the TX2NX (which is 2 GB) fills up, and so does the RAM. This causes the application to slow down and in some cases stops downstream tasks such as relay actuation.
The application architecture is somewhat complex; the following Docker containers run at the same time:
- Django application for APIs
- Kafka container for message passing
- DeepStream based application
Right after starting the application, jtop looks like the following:
After several hours of running (usually 4-5), jtop looks like the following:
As you can see, both the RAM and the swap usage have gone up.
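To put numbers on this growth instead of eyeballing jtop, a small sampler that logs `MemAvailable` and `SwapFree` from `/proc/meminfo` at a fixed interval can be left running alongside the containers; the timestamps can then be correlated with container activity. This is a minimal sketch (the field names are standard `/proc/meminfo` keys; the interval is arbitrary):

```python
#!/usr/bin/env python3
"""Log memory/swap headroom periodically so growth can be plotted later."""
import re
import time

FIELDS = ("MemTotal", "MemAvailable", "SwapTotal", "SwapFree")

def parse_meminfo(text):
    """Return the selected /proc/meminfo fields as a dict of kB values."""
    values = {}
    for line in text.splitlines():
        m = re.match(r"(\w+):\s+(\d+)\s*kB", line)
        if m and m.group(1) in FIELDS:
            values[m.group(1)] = int(m.group(2))
    return values

def sample_forever(interval=60):
    """Print one line per interval: wall time, available RAM, free swap."""
    while True:
        with open("/proc/meminfo") as f:
            v = parse_meminfo(f.read())
        print(time.strftime("%H:%M:%S"),
              "avail=%d kB swap_free=%d kB" % (v["MemAvailable"], v["SwapFree"]))
        time.sleep(interval)
```

Redirecting the output to a file gives a crude but persistent record of when the 4-5 hour decline actually starts.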
We have profiled the application with several tools (valgrind, cuda-memcheck, heaptrack) and have not found any significant leaks:
- Valgrind: detects 800+ MB of leaks in libcuda.so. But according to our research (refer link), valgrind is known to report false positives with CUDA, so we didn't take this very seriously.
- cuda-memcheck: no leaks/errors
- heaptrack: no major memory leaks (detects ~6 MB of leaked memory in gstreamer, which should be OK)
We have also tried restarting individual containers to narrow down the issue, since restarting a container frees a certain amount of swap. The exact amount freed varies, with the following ranges:
- Restarting Kafka: frees 600 MB - 1.1 GB of Swap
- Restarting DeepStream: frees 500 - 1.5 GB of Swap
- Restarting the API and NGINX containers: almost no effect.
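Since restarting the Kafka and DeepStream containers reliably reclaims swap, a stopgap while the root cause is being chased could be a small watchdog that restarts them whenever free swap drops below a threshold. This is only a sketch: the container names and the 256 MB threshold are hypothetical placeholders, and it assumes the `docker` CLI is available on the host.

```python
import subprocess

# Hypothetical names of the containers that release swap when restarted.
LEAKY_CONTAINERS = ["kafka", "deepstream-app"]
SWAP_FREE_MIN_KB = 256 * 1024  # restart once free swap falls below 256 MB

def swap_free_kb(meminfo_text):
    """Extract the SwapFree value (in kB) from /proc/meminfo contents."""
    for line in meminfo_text.splitlines():
        if line.startswith("SwapFree:"):
            return int(line.split()[1])
    raise ValueError("SwapFree not found in meminfo")

def should_restart(meminfo_text, threshold_kb=SWAP_FREE_MIN_KB):
    """True when free swap has dropped below the configured threshold."""
    return swap_free_kb(meminfo_text) < threshold_kb

def restart_leaky_containers():
    """Restart the suspect containers via the docker CLI."""
    for name in LEAKY_CONTAINERS:
        subprocess.run(["docker", "restart", name], check=True)
```

This obviously does not fix the leak; it only bounds the downtime of the downstream tasks (relay actuation etc.) until the real cause is found.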
We have also tried setting the swappiness via /etc/sysctl.conf, but it didn't change anything.
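For reference, the kind of change we made (the value 10 here is illustrative; the default on most systems is 60, and lower values make the kernel prefer reclaiming page cache over swapping):

```conf
# /etc/sysctl.conf -- prefer dropping page cache over swapping out pages
vm.swappiness=10
```

The setting was applied with `sysctl -p`, and the swap growth continued regardless.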
Disabling custom processing and postprocessing
To rule out a memory leak in our custom code, we disabled all GStreamer probes and even the bbox parsing functions, so that only the bare DeepStream pipeline ran (only PGIE at this point: with bbox parsing disabled, no objects reach SGIE). We still observed a slow increase in swap usage. This strengthens the first suspicion below.
- Memory leak in DeepStream itself. I am 99% confident that there are no leaks in our custom code. For critical components like buffer conversion and creation, we make sure to unmap buffers and destroy streams appropriately.
- Write caching. I have read that Jetson devices write to swap first before flushing to disk to improve latency. We use spdlog extensively and flush logs frequently, so perhaps the heavy disk I/O is causing the Jetson to cache in swap space?
• Hardware Platform: Jetson TX2NX
• DeepStream Version: 5.1
• JetPack Version: 4.5.1
• Issue Type: bug/question