I have been running Caffe on the TK1 and have noticed that linking against cuDNN v2 gives hardly any improvement in forward or backward pass times: it is only about 1.05x faster than without cuDNN. Is this to be expected for cuDNN on the TK1? Has anyone seen better acceleration?
Have you maximized CPU, GPU and EMC clocks?
More about setting those can be found on the wiki:
Yes, I did. The overall effect was to reduce inference latency, but with cuDNN the performance was still only about 1.05x better than without linking to the library.
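For anyone comparing numbers, a speedup figure such as 1.05x can be computed from per-iteration timings roughly as follows. This is a minimal sketch: the function name `speedup` and the timing values are placeholders made up for illustration, not measurements from the TK1.

```python
from statistics import mean


def speedup(baseline_ms, accelerated_ms):
    """Return baseline time divided by accelerated time (values > 1 mean faster)."""
    base = mean(baseline_ms)
    accel = mean(accelerated_ms)
    if accel <= 0:
        raise ValueError("accelerated timings must be positive")
    return base / accel


if __name__ == "__main__":
    # Placeholder per-iteration forward-pass times in milliseconds (not real data).
    without_cudnn = [104.0, 106.0, 105.0]
    with_cudnn = [100.0, 101.0, 99.0]
    print(f"speedup: {speedup(without_cudnn, with_cudnn):.2f}x")
```

Averaging over several iterations before dividing helps keep a single slow run from skewing the ratio.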
I’m not sure whether that result is normal, but you can compare against the reference performance numbers below, measured on other hardware configurations, to gauge the gap: