How to tell whether a multi-threaded program is running on different cores

My program has a for-loop, and each iteration carries out an expensive calculation. I used pthreads to divide the for-loop across 4 threads, so that each thread processes only 1/4 of the iterations. However, I did not see much speedup, and I am not sure why. How can I tell whether the threads were allocated to different cores and were running in parallel?

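You can check which core a thread is currently running on with the PSR field of ps. Add -L so that every thread (LWP) gets its own line, and -p with the process ID if you want to limit the output to your program:

ps -L -o pid,lwp,psr,comm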
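
The same check can be made from inside the program. Below is a minimal sketch, assuming Linux with glibc, where sched_getcpu() is available. The names worker_info, report_cpu, print_thread_cpus and MAX_THREADS are made up for illustration and are not from the original program.

```c
#define _GNU_SOURCE            /* needed for sched_getcpu() */
#include <pthread.h>
#include <sched.h>
#include <stdio.h>

#define MAX_THREADS 64         /* arbitrary limit for this sketch */

/* Hypothetical per-thread record: worker index and the core it observed. */
struct worker_info {
    int index;
    int cpu;
};

/* Each worker records the core it is running on at this instant. */
static void *report_cpu(void *arg)
{
    struct worker_info *info = arg;
    info->cpu = sched_getcpu();    /* returns -1 on failure */
    return NULL;
}

/* Start nthreads workers, wait for them, then print the observed cores.
 * Returns 0 on success, -1 on error. */
static int print_thread_cpus(int nthreads, struct worker_info *infos)
{
    pthread_t tids[MAX_THREADS];
    int started = 0;

    if (nthreads < 1 || nthreads > MAX_THREADS)
        return -1;

    for (int i = 0; i < nthreads; i++) {
        infos[i].index = i;
        infos[i].cpu = -1;
        if (pthread_create(&tids[i], NULL, report_cpu, &infos[i]) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    if (started != nthreads)
        return -1;

    for (int i = 0; i < nthreads; i++)
        printf("thread %d ran on core %d\n", infos[i].index, infos[i].cpu);
    return 0;
}
```

Compile with -pthread and call print_thread_cpus(4, infos) with an array of 4 worker_info entries. These workers finish almost immediately, so several of them may report the same core. In a real program it is more informative to call sched_getcpu() from inside the expensive loop, since the scheduler may move a thread between cores while it runs.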

If you want to bind a thread to a given core, you may read this topic.
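
As a rough sketch of what binding looks like on Linux, assuming glibc's non-portable pthread_setaffinity_np() is available (pin_self_to_core is a hypothetical helper name, not something from the linked topic):

```c
#define _GNU_SOURCE            /* needed for CPU_SET and pthread_setaffinity_np() */
#include <errno.h>
#include <pthread.h>
#include <sched.h>

/* Restrict the calling thread to a single core.
 * Returns 0 on success, or an error number (for example EINVAL if the
 * core does not exist or is not available to this process). */
static int pin_self_to_core(int core)
{
    cpu_set_t set;

    if (core < 0 || core >= CPU_SETSIZE)
        return EINVAL;

    CPU_ZERO(&set);            /* start with an empty CPU mask */
    CPU_SET(core, &set);       /* allow exactly one core */
    return pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}
```

Each worker would call pin_self_to_core() with its own core number at the start of its thread function. Pinning is mainly useful for testing here: if the threads are pinned to 4 different cores and the speedup is still poor, the bottleneck is probably somewhere other than thread placement.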

For testing I’d also make sure clocks are maxed out:

sudo ~nvidia/