Jetson TK1 CUDA performance in multithreaded app

Daniil · September 21, 2015, 1:22pm

Hi, folks!

I have a custom-made CUDA library, which I benchmark in maximum CPU and GPU performance mode (see Jetson/Performance on eLinux.org). The benchmark is a simple performance test: call a function 100 times and take the median time (it is actually OpenCV’s perf test). The numbers I get are reasonable. I also have a multithreaded application that processes frames from a video file with this library and displays the result. However, the processing time in the application is at least twice as long as in the performance test. This happens only in maximum performance mode; with default settings, the times are the same. I have profiled GPU and memory load, and it is about 10%, so it is not a resource issue (the problem also does not occur on a PC). Could you please help me with this “maximum performance mode” problem?
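For reference, the kind of measurement described above (call a function 100 times, take the median) can be sketched roughly as follows. This is only an illustration, not OpenCV’s perf test code: `median_of` and `median_time_ms` are hypothetical helper names, and the sketch assumes the timed callable blocks until the GPU work has finished (for example by synchronizing the device inside it). Using the same helper inside a worker thread of the application would make the two numbers directly comparable.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: median of a set of samples.
// Takes the vector by value because sorting reorders it.
// Returns 0.0 for an empty input.
double median_of(std::vector<double> samples) {
    if (samples.empty()) return 0.0;
    std::sort(samples.begin(), samples.end());
    const std::size_t mid = samples.size() / 2;
    return samples.size() % 2 == 1
               ? samples[mid]
               : 0.5 * (samples[mid - 1] + samples[mid]);
}

// Hypothetical helper: calls `fn` `runs` times and returns the median
// wall-clock time of one call in milliseconds.
// Assumption: `fn` returns only after its work is complete. For CUDA code
// that means synchronizing inside `fn`; otherwise only the launch overhead
// is measured.
template <typename Fn>
double median_time_ms(Fn&& fn, int runs = 100) {
    std::vector<double> samples;
    if (runs > 0) samples.reserve(static_cast<std::size_t>(runs));
    for (int i = 0; i < runs; ++i) {
        const auto start = std::chrono::steady_clock::now();
        fn();
        const auto stop = std::chrono::steady_clock::now();
        samples.push_back(
            std::chrono::duration<double, std::milli>(stop - start).count());
    }
    return median_of(std::move(samples));
}
```

A call would look like `median_time_ms([&] { process_frame(frame); }, 100)`, where `process_frame` stands in for the library function under test (a placeholder name, not part of any real API).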
