I have a function that runs at about 30 Hz. It reads the Kinect color and depth frames, detects a color, and yields the 3D coordinates of the object. I put it into a pthread and created a second thread that runs an empty while loop. The rate of the first function then dropped to 15 Hz. Is that because both threads run on the same core? If so, when I use two threads in C++, can each thread run on its own core exclusively?
FYI, I have overclocked and turned on all cores.
GPU bus speed:
GPU memory speed:
Unless it is a hardware IRQ, your thread+core combinations are determined by the scheduler. Sometimes it is actually good to share a core, if that means cached data will be a hit rather than a miss. If your Kinect code runs in the kernel via a hardware IRQ, it'll only be serviced on CPU0. If the code associated with the Kinect is in user space, then any core may be used. If the user space part of your program has a data bottleneck based on what the driver can feed it, then the two will be locked together for performance (meaning it is possible the Kinect processing could go faster if the driver were to feed it faster, but I don't know; you'd need to profile it).
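If you want to override the scheduler's choice for a user space thread, Linux lets you set a CPU affinity per thread. A minimal sketch, assuming Linux with glibc (`pthread_setaffinity_np` is a non-portable GNU extension); the helper name `pin_current_thread_to_core` is made up for illustration:

```cpp
#include <cassert>
#include <pthread.h>
#include <sched.h>

// Hypothetical helper: restrict the calling thread to a single core.
// Returns 0 on success, or an errno-style code on failure (for example
// EINVAL if the process is not allowed to run on that core).
int pin_current_thread_to_core(int core_id) {
    assert(core_id >= 0 && core_id < CPU_SETSIZE);
    cpu_set_t set;
    CPU_ZERO(&set);          // start with an empty CPU mask
    CPU_SET(core_id, &set);  // allow exactly one core
    return pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}
```

Calling this at the top of each thread function with a different core number keeps the two threads apart; `sched_getcpu()` reports which core a thread is currently on. Pinning only limits where your thread may run. It does not stop the scheduler from putting other work on that core, so whether it helps is something to measure.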
Multiple threads will have the opportunity to run on multiple cores, depending on the scheduler. You can always try things like setting a higher priority, which may boost performance and reduce latency. The Jetson itself probably needs to be in performance mode before you can decide whether performance was really reduced or increased. See:
Note that CUDA really requires threading to take advantage of it, either directly or via libraries linked to it.
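On the priority suggestion: one way to raise a thread's priority on Linux is to request a real-time scheduling policy. A minimal sketch, assuming Linux and sufficient privileges (root or `CAP_SYS_NICE`); the helper name `request_fifo_priority` is made up for illustration:

```cpp
#include <cassert>
#include <cerrno>
#include <pthread.h>
#include <sched.h>

// Hypothetical helper: ask for SCHED_FIFO at the given priority for the
// calling thread. Returns 0 on success, or an errno-style code; EPERM is
// the usual result when the process lacks real-time privileges.
int request_fifo_priority(int priority) {
    assert(priority >= sched_get_priority_min(SCHED_FIFO) &&
           priority <= sched_get_priority_max(SCHED_FIFO));
    sched_param param{};
    param.sched_priority = priority;
    return pthread_setschedparam(pthread_self(), SCHED_FIFO, &param);
}

// True if the request failed only because of missing privileges.
bool needs_privileges(int rc) {
    return rc == EPERM;
}
```

Be careful with this on a thread that never blocks: a busy loop running under `SCHED_FIFO` can starve normal-priority threads on its core, so it suits the frame-processing thread better than an empty while loop.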