I’m running a PyTorch + TensorRT program on a Tesla T4 GPU, with CUDA 11.3 and driver 460.
It seems compute-sanitizer consumes about 8 GB of extra memory, which makes me hit an out-of-memory (OOM) error every time.
Is there any way to solve this?
Let me check with our engineering team and see if we have encountered this issue before.
Which tool are you using, e.g. memcheck, initcheck, or racecheck? Each one has a different memory overhead. The team also mentioned that newer versions of the tool include memory improvements and are compatible with 11.x toolkits. Would you be able to try the version included in CTK 11.8 and see if the behavior improves?
Hi, thanks for the reply. I just ran compute-sanitizer ./my_application without specifying any other options. I assumed that runs all of memcheck, racecheck, etc. How can I run only memcheck, or only racecheck? The standalone cuda-memcheck tool in my toolkit seems to be broken, since it fails with “Initialization failed” even for a simple program.
I may try the tool in CTK 11.8 later. It will take a while to download.
By the way, I’m using multiple contexts in my application.
You can try the memcheck tool with a command line like "compute-sanitizer --tool memcheck [sanitizer_options] app_name [app_options]".
Could you try that and let me know if it completes? The memcheck tool is one of the lower-overhead checks.
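To compare the tools one at a time, the command line above can be wrapped in a small script. The following is only a sketch in Python, not an official NVIDIA utility: the helper names build_command and run_each_tool are made up for illustration, and it assumes compute-sanitizer is on PATH. The only option it relies on is --tool with the values memcheck, racecheck, initcheck, and synccheck. As far as I know, compute-sanitizer run without --tool uses memcheck only rather than all tools.

```python
import shutil
import subprocess
import sys

# Tool names accepted by compute-sanitizer's --tool option.
TOOLS = ("memcheck", "racecheck", "initcheck", "synccheck")


def build_command(tool, app, app_args=()):
    """Return the argv list for one compute-sanitizer run with a single tool.

    Hypothetical helper for illustration; it only assembles the command.
    """
    if tool not in TOOLS:
        raise ValueError(f"unknown tool: {tool}")
    return ["compute-sanitizer", "--tool", tool, app, *app_args]


def run_each_tool(app, app_args=()):
    """Run the application once per tool and return {tool: exit code}."""
    results = {}
    for tool in TOOLS:
        proc = subprocess.run(build_command(tool, app, app_args))
        results[tool] = proc.returncode
    return results


# Only runs when an application path is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    if shutil.which("compute-sanitizer") is None:
        print("compute-sanitizer not found on PATH")
    else:
        for tool, code in run_each_tool(sys.argv[1], sys.argv[2:]).items():
            print(f"{tool}: exit code {code}")
```

Saved under any file name, it would be invoked with the application path as the first argument, followed by the application's own options. A non-zero exit code for a given tool is only a hint that the run did not complete; the tool's own output is what shows whether the cause was an out-of-memory error.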
Thanks. Run this way, memcheck completes normally for the simple program I mentioned before.
For my complex program, memcheck, racecheck, and initcheck all hit the “out of memory” problem, while synccheck completes normally. This is with CTK 11.3.
For now I have solved the problem in my own program and don’t intend to try CTK 11.8. Sorry about that.