Errors when profiling an application with multiple CPU threads


I’ve developed an application that runs multiple CPU threads, all of them launching GPU kernels in parallel.
When I try to profile the application with the CUDA Visual Profiler, the profiler behaves erratically and reports seemingly random errors (-91, -92, … -97).
The errors do not occur when the application runs only a single thread.

Is this a limitation of the profiler, or is something else wrong? The release notes do not mention any issues with multi-threaded applications.