Jetson TK1 performance issues

I’ve been running cuBLAS matrix multiply on the TK1, and I’m seeing some strange behavior.

  1. At the default size of 320x640, the run time varies from about 2.5 ms to 4.5 ms.
  2. When I turn off the desktop and just SSH in to run the matrix multiply, it gets about 8.5x slower:
    Desktop on, matrix multiply (640x1280): ~10 ms
    Desktop off, matrix multiply (640x1280): ~85 ms
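
For anyone comparing numbers like these, here is a minimal Python sketch of the arithmetic. The helper names `slowdown` and `gemm_gflops` are made up for illustration. The GFLOP/s helper takes the inner dimension `k` as a parameter because the timings above only give two of the three GEMM dimensions, so it is not applied to them here.

```python
def slowdown(fast_ms: float, slow_ms: float) -> float:
    """Return how many times slower the slow run is than the fast run."""
    if fast_ms <= 0 or slow_ms <= 0:
        raise ValueError("timings must be positive")
    return slow_ms / fast_ms


def gemm_gflops(m: int, n: int, k: int, elapsed_ms: float) -> float:
    """Throughput of one (m x k) times (k x n) multiply in GFLOP/s.

    Counts 2*m*n*k floating-point operations: one multiply and one add
    for each term of each output element.
    """
    if elapsed_ms <= 0:
        raise ValueError("elapsed time must be positive")
    flops = 2.0 * m * n * k
    return flops / (elapsed_ms * 1e-3) / 1e9


if __name__ == "__main__":
    # Timings taken from the post: ~10 ms with the desktop on, ~85 ms headless.
    print(f"headless slowdown: {slowdown(10.0, 85.0):.1f}x")
```

Reporting a ratio or a GFLOP/s figure rather than raw milliseconds makes it easier to compare runs at different matrix sizes.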

Anybody have any idea why this might be happening?

Thank you.

The GPU clock swings from 72 to 852 MHz. I think it actually starts at 12 MHz when headless and after a reboot.
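
If you want to check the current clock before and after a run, something like the sketch below may help. Big caveat: the debugfs path is an assumption on my part (it is the node commonly cited for the TK1 GPU clock, but it is not confirmed anywhere in this thread), and reading it normally requires root. The helper names are hypothetical too.

```python
from pathlib import Path

# ASSUMPTION: commonly cited debugfs node for the TK1 GPU clock, in Hz.
# Verify it exists on your L4T release before relying on it.
ASSUMED_GPU_RATE_PATH = "/sys/kernel/debug/clock/gbus/rate"


def parse_rate_mhz(text: str) -> float:
    """Convert the text of a rate file (an integer in Hz) to MHz."""
    return int(text.strip()) / 1e6


def read_gpu_clock_mhz(path: str = ASSUMED_GPU_RATE_PATH) -> float:
    """Read the GPU clock in MHz from the given (assumed) sysfs/debugfs file."""
    return parse_rate_mhz(Path(path).read_text())


def clock_ratio(low_mhz: float, high_mhz: float) -> float:
    """Ratio between the highest and lowest observed clock."""
    if low_mhz <= 0:
        raise ValueError("low clock must be positive")
    return high_mhz / low_mhz


if __name__ == "__main__":
    # 72 MHz and 852 MHz are the two ends of the swing mentioned above.
    print(f"clock swing: {clock_ratio(72.0, 852.0):.1f}x")
    try:
        print(f"current GPU clock: {read_gpu_clock_mhz():.0f} MHz")
    except (OSError, ValueError) as err:
        print(f"could not read GPU clock: {err}")
```

The swing from 72 to 852 MHz is a factor of roughly 11.8, which is the same order as the 8.5x slowdown reported above, so a GPU sitting at a low clock during the headless run would be a plausible explanation.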

For all the reasons you can imagine, the K1’s GPU isn’t immediately pegged to its maximum speed.

For development and benchmarking purposes, though, some unsupported overrides were revealed here.

It’s behaving much better now.
On my synthetic microbenchmark, I’m getting 308 GFLOP/s.
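
To put that figure in context, here is a quick back-of-the-envelope check in Python. It assumes single precision, the K1's 192 CUDA cores, one fused multiply-add (2 FLOPs) per core per cycle, and the 852 MHz maximum clock mentioned earlier. The function names are just for illustration.

```python
CUDA_CORES = 192               # Tegra K1 GPU: a single Kepler SMX
FLOPS_PER_CORE_PER_CYCLE = 2   # one fused multiply-add counts as 2 FLOPs
MAX_CLOCK_MHZ = 852.0          # maximum GPU clock quoted in this thread


def peak_gflops(cores: int, clock_mhz: float,
                flops_per_cycle: int = FLOPS_PER_CORE_PER_CYCLE) -> float:
    """Theoretical single-precision peak in GFLOP/s."""
    return cores * flops_per_cycle * clock_mhz / 1000.0


def efficiency(measured_gflops: float, peak: float) -> float:
    """Fraction of theoretical peak actually achieved."""
    return measured_gflops / peak


if __name__ == "__main__":
    peak = peak_gflops(CUDA_CORES, MAX_CLOCK_MHZ)
    print(f"theoretical peak: {peak:.1f} GFLOP/s")
    print(f"308 GFLOP/s is {efficiency(308.0, peak):.1%} of peak")
```

Under those assumptions the peak works out to about 327 GFLOP/s, so 308 GFLOP/s is around 94% of it, which suggests the GPU really is running at full clock.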