Does the CUPTI Activity API work in a multi-threaded environment?

I have written a simple demo to capture the performance of my CUDA kernels in a multi-threaded environment.

My kernels are launched by thousands of host threads, and the buffers of my CUPTI Activity API demo fill up with records instantly.

So I'm wondering: is there a way to limit the CUPTI activity profiler to a single host thread, so that it records only the CUDA API calls and kernels launched by that one thread?