Error in printing record?

I am using Compute Sanitizer (CUDA 12.9) to monitor a process running a TensorRT inference loop that crashes after several days of runtime and causes a driver reset. While this might be a problem with the driver itself, I am trying my luck with the sanitizer. If I intentionally inject GPU-related errors (out-of-bounds memory writes and the like), the sanitizer reports them nicely, so I know it works and catches such problems. However, when I run it with my original process, after hours or days of runtime it eventually outputs:

“Error in printing record”

Huh, what kind of error can cause the sanitizer to output this very message?

Thanks.
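For context, a deliberately injected out-of-bounds write of the kind mentioned above might look like the following minimal sketch. The kernel name `write_past_end`, the file name, and the buffer size are made up for illustration; this is not the actual test code from the inference process.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread writes to the slot after its own index,
// so the last thread writes one int past the end of the allocation.
__global__ void write_past_end(int *buf, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        buf[i + 1] = i;  // deliberate bug: for i == n - 1 this is out of bounds
    }
}

int main()
{
    const int n = 256;
    int *buf = nullptr;

    if (cudaMalloc(&buf, n * sizeof(int)) != cudaSuccess) {
        std::fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    // One block of n threads; thread n - 1 performs the invalid write.
    write_past_end<<<1, n>>>(buf, n);

    // Without the sanitizer this small overrun may go unnoticed,
    // so the status printed here is not a reliable indicator by itself.
    cudaError_t err = cudaDeviceSynchronize();
    std::printf("sync status: %s\n", cudaGetErrorString(err));

    cudaFree(buf);
    return 0;
}
```

Assuming the file is saved as `oob.cu`, it can be built with `nvcc oob.cu -o oob` and run as `compute-sanitizer --tool memcheck ./oob`, which should flag the invalid global write. A check like this only confirms that the tool is attached and reporting; it says nothing about what triggers the message in the long-running process.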

This sounds like a bug on our end. Can you provide reproduction instructions so we can investigate? Thanks!

I wish I could… This is actually related to this issue, where I tried to get some help from the TRT guys. As you can read there, this problem is hard to reproduce: you need to wait days for it to happen, put multiple GPUs under pressure (so it is kind of a stress bug), and be lucky…

The reason I cross-posted here is that I hoped this very message, which we finally got from the sanitizer after a long stress run (>95 percent GPU load), could somehow narrow down the causes or point us in the right direction.