Comparing TK1 and TX1 GPU specs with OpenCV4Tegra mog2 algorithm

Hello,
I’m trying to compare the GPU performance of the TK1 and TX1 using OpenCV4Tegra’s MOG2 algorithm, and I would like to know whether my experimental results are correct.

When I run the code below inside a video loop, I get the following results.

□ENV:
A 1-minute AVI sample video at 1920x1080@60fps, so 3600 frames are processed in total.
I calculated the average processing time per frame.

□RESULT:
Processing time using the mog2() function:
TK1: 17.7355 ms, TX1: 12.6975 ms

Upload + download time using d_frame.upload() and d_fgmask.download():
TK1: 11.6649 ms, TX1: 5.004511 ms

The GPU memory upload + download result is about what I expected,
but I thought the processing time would be far lower on TX1, since TX1’s GFLOPS figure is more than 2.5 times that of TK1.
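
For reference, the ratios from my measurements above work out to roughly 11.6649 / 5.004511 ≈ 2.3x faster for the memory transfers on TX1, but only 17.7355 / 12.6975 ≈ 1.4x faster for the MOG2 processing itself.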

Can someone help me confirm whether these results are reasonable, especially the processing time?

Thank you.

    Mat fgmask;

    gettimeofday(&t1, NULL);

    /* upload the captured frame to GPU memory */
    d_frame.upload(cap);

    gettimeofday(&t2, NULL);

    /* GPU MOG2 background subtraction */
    mog2(d_frame, d_fgmask, mog2_param.learningCoef);

    gettimeofday(&t3, NULL);

    /* download the foreground mask back to host memory */
    d_fgmask.download(fgmask);

    gettimeofday(&t4, NULL);

    /* upload time (ms) */
    elapsedTime = (t2.tv_sec - t1.tv_sec) * 1000.0;
    elapsedTime += (t2.tv_usec - t1.tv_usec) / 1000.0;
    cout << elapsedTime << ",";

    /* processing time (ms) */
    elapsedTime = (t3.tv_sec - t2.tv_sec) * 1000.0;
    elapsedTime += (t3.tv_usec - t2.tv_usec) / 1000.0;
    cout << elapsedTime << ",";

    /* download time (ms) */
    elapsedTime = (t4.tv_sec - t3.tv_sec) * 1000.0;
    elapsedTime += (t4.tv_usec - t3.tv_usec) / 1000.0;
    cout << elapsedTime << endl;
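
As an aside, here is a variant of the timing around the mog2() call that uses CUDA events instead of gettimeofday. This is only a sketch (it assumes cuda_runtime.h is included and that mog2 runs on the default CUDA stream), but cudaEventElapsedTime should report the GPU time more directly:

    cudaEvent_t ev_start, ev_stop;
    cudaEventCreate(&ev_start);
    cudaEventCreate(&ev_stop);

    /* bracket the GPU call with events on the default stream */
    cudaEventRecord(ev_start, 0);
    mog2(d_frame, d_fgmask, mog2_param.learningCoef);
    cudaEventRecord(ev_stop, 0);
    cudaEventSynchronize(ev_stop);

    /* elapsed GPU time in milliseconds */
    float gpuMs = 0.0f;
    cudaEventElapsedTime(&gpuMs, ev_start, ev_stop);
    cout << gpuMs << endl;

    cudaEventDestroy(ev_start);
    cudaEventDestroy(ev_stop);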

Hi usajpn,

When we benchmark TK1 vs. TX1, we have seen both degradation and improvement in some cases, caused by synchronization at a deep software-architecture level.

To confirm whether this is the problem, you can check the execution logs from both boards and compare the time spent on the GPU. TX1 should be faster; if it is not, the slowdown is caused by GPU architecture changes, and the code is probably sub-optimal for it. If GPU execution on TX1 is faster, you can improve the pipeline by making better use of streaming and synchronization.
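
As a rough illustration only (assuming the OpenCV 2.4-style cv::gpu API that OpenCV4Tegra is based on, and reusing the cap, d_frame, d_fgmask, mog2, and mog2_param names from your snippet), the upload, MOG2, and download can be enqueued on a cv::gpu::Stream so they overlap with CPU work:

    cv::gpu::Stream stream;
    cv::Mat fgmask;

    /* enqueue the transfers and the kernel on one stream;
       nothing blocks until waitForCompletion() is called */
    stream.enqueueUpload(cap, d_frame);
    mog2(d_frame, d_fgmask, mog2_param.learningCoef, stream);
    stream.enqueueDownload(d_fgmask, fgmask);

    /* CPU work (e.g. grabbing the next frame) can run here */
    stream.waitForCompletion();

Note that the host buffers should be page-locked (cv::gpu::CudaMem) for the enqueued copies to be truly asynchronous.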

Please see the discussion in this other thread:
https://devtalk.nvidia.com/default/topic/978067/performance-comparision-tk1-vs-tx1/

Thanks

Hello kayccc,

Thank you for your response.

As I said, TX1 is faster.
Since OpenCV4Tegra is your software,
is there any way you and your development team
can confirm whether my results are reasonable?

Thank you.

Hi usajpn,

The prebuilt OpenCV4Tegra is a CPU- and GPU-optimized version of OpenCV for the Tegra architecture, so I suppose the result is reasonable. As a reference, you could run the same code with standard OpenCV and check whether it is any slower.
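
A CPU reference with the plain BackgroundSubtractorMOG2 can also help put the GPU numbers in context. As a sketch only (assuming the OpenCV 2.4-style API, with opencv2/video/background_segm.hpp included and reusing mog2_param from your code):

    cv::BackgroundSubtractorMOG2 mog2_cpu;
    cv::Mat frame, fgmask;

    /* same MOG2 step on the CPU; timing this per frame gives a
       baseline to judge the GPU speedup against */
    mog2_cpu(frame, fgmask, mog2_param.learningCoef);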

Besides, are you running the TX1 at maximum performance? Here is a link for reference:
http://elinux.org/Jetson/TX1_Controlling_Performance

And another wiki page on OpenCV performance:
http://elinux.org/Jetson/Computer_Vision_Performance

Thanks