Performance degradation on JetPack 4.2 compared to JetPack 3.3 on TX2

Hi,

I have two TX2 boards: one runs L4T R28.2.1 (JetPack 3.3), the other L4T R32.1 (JetPack 4.2). I wrote a simple test program and ran it on both; it ran about twice as fast on L4T R28.2.1 as on L4T R32.1.

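#include <chrono>
#include <functional>
#include <iostream>
#include <string>

// Current time in whole milliseconds, read from a monotonic clock so that
// system time adjustments cannot distort the measurement.
double now_ms() {
  return std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::steady_clock::now().time_since_epoch()).count();
}

// Runs f_action once and prints the elapsed wall time next to the label.
void benchmark(const std::string &label, std::function<void()> f_action) {
  double start = now_ms();

  f_action();

  std::cout << label << " " << now_ms() - start << "ms" << std::endl;
}

int main() {
  int i = 0;
  int b = 0;
  benchmark("test cpu run", [&]() {
    // b exceeds INT_MAX well before the loop ends; signed overflow is
    // undefined behavior, and an optimizing build may simplify this loop.
    for (; i < 100000000; i++) {
      b += i;
      int c = 15;
      c = b + c + i;
    }
  });
  return 0;
}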

On R28.2.1: 315ms
On R32.1: 729ms

Both boards were set up with jetson_clocks and nvpmodel 0. The newer release should be at least as fast as the previous one. I searched but found no similar topics.
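
To double-check the clocks from inside a test program, something like the following could be used. It is only a minimal sketch: it assumes the standard Linux cpufreq sysfs layout under /sys/devices/system/cpu, and the helper names (read_first_line, cpufreq_path, print_cpu_clocks) are made up for illustration, not part of any NVIDIA tool.

```cpp
#include <cassert>
#include <fstream>
#include <iostream>
#include <string>

// Reads the first line of a file; returns an empty string if the file
// cannot be opened (for example, the core is offline or the path differs).
std::string read_first_line(const std::string &path) {
  std::ifstream in(path);
  std::string line;
  if (in) std::getline(in, line);
  return line;
}

// Builds the cpufreq sysfs path for one core (assumed standard layout).
// "entry" is a file name such as "scaling_cur_freq" or "scaling_governor".
std::string cpufreq_path(int cpu, const std::string &entry) {
  assert(cpu >= 0);
  return "/sys/devices/system/cpu/cpu" + std::to_string(cpu) + "/cpufreq/" + entry;
}

// Prints governor and current frequency (kHz) for cores 0..max_cpus-1 and
// returns how many cores were reported. Cores without cpufreq entries are skipped.
int print_cpu_clocks(int max_cpus) {
  int reported = 0;
  for (int cpu = 0; cpu < max_cpus; ++cpu) {
    std::string freq = read_first_line(cpufreq_path(cpu, "scaling_cur_freq"));
    if (freq.empty()) continue;
    std::string gov = read_first_line(cpufreq_path(cpu, "scaling_governor"));
    std::cout << "cpu" << cpu << " " << gov << " " << freq << " kHz" << std::endl;
    ++reported;
  }
  return reported;
}
```

Calling print_cpu_clocks(6) from main() right before the benchmark on each board would show whether the cores really run at the same frequency on both releases.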

Please help me to solve this!

Thinh

hello tpham,

Interesting. Could you share the test binary so that we can also give it a quick try on our side?
thanks

Hi JerryChange,

Please download the binary here: JetpackIssues.zip

Best regards,
Thinh

For me the results are:
$ g++ main.cpp
test cpu run 425ms
test cpu run 375ms
test cpu run 412ms
test cpu run 370ms
test cpu run 357ms
test cpu run 412ms
$ g++ -O3 main.cpp
test cpu run 158ms
test cpu run 147ms
test cpu run 144ms
test cpu run 131ms
test cpu run 154ms

That is on JetPack 4.2, so I get about the same speed as you do on 3.3. It is slightly slower, but there could be other factors. Check tegrastats on 4.2 and see what frequencies you get.

I cannot test 3.3, though.

Hi,

Thanks, Dalus, for your test. I found that a background process was running during my test, so that could have been the problem.
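
One way to make this kind of interference visible is to repeat the measurement and compare the fastest, median, and slowest runs. The following is a minimal sketch of that idea, not the code used for the numbers in this thread; RunStats and benchmark_repeated are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <functional>
#include <vector>

// Summary of several timed runs, all in milliseconds.
struct RunStats {
  double min_ms;
  double median_ms;  // upper middle element when the run count is even
  double max_ms;
};

// Runs f_action `runs` times, timing each run with a monotonic clock,
// and returns the minimum, median, and maximum elapsed time.
RunStats benchmark_repeated(const std::function<void()> &f_action, int runs) {
  assert(runs > 0);
  std::vector<double> ms;
  ms.reserve(runs);
  for (int r = 0; r < runs; ++r) {
    auto start = std::chrono::steady_clock::now();
    f_action();
    auto stop = std::chrono::steady_clock::now();
    ms.push_back(std::chrono::duration<double, std::milli>(stop - start).count());
  }
  std::sort(ms.begin(), ms.end());
  return RunStats{ms.front(), ms[ms.size() / 2], ms.back()};
}
```

If the maximum is far above the minimum, something else was probably competing for the CPU during some of the runs, and the minimum is likely the better number to compare between boards.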

But I still see a performance issue. After restarting both TX2s, I ran another test using the performance_gpu sample from OpenCV on both boards (same OpenCV 3.4.0 built with the same configuration, both set up with jetson_clocks and nvpmodel 0). Most of the tests are slower on JP42 than on JP33, and some are dramatically slower.

Please check this attachment; it includes the test binary and the results I got from my tests:
https://workupload.com/file/3473NjvP

It also includes tegrastats logs to confirm that both boards are in maximum performance mode.

I’m considering downgrading my JP42 TX2 to JP33 to keep my app’s behavior consistent during development, but I still hope to get some ideas for investigating this problem further.

Please help!

Hi tpham,

I ran your test binary on JetPack-3.3 and JetPack-4.2, and the results look the same:
test cpu run: about 315-318ms

Hi carelyuu,

Yes, the first simple test now gives the same result (as noted, the problem was a background app running during my test). I added another test, with results from both TX2s, that shows different performance on the two JP versions. Do you think the problem is that OpenCV is not optimized for JP42?

Well,

I tried again with another sample, dhernandez0/sgm from GitHub (Semi-Global Matching on the GPU), and got the same results on both TX2s. Probably OpenCV 3.4.0 is not optimized for the latest JetPack, so I will downgrade my TX2 to continue my work.

Thanks for supporting me.

I also experienced performance degradation on JetPack 4.2, which is how I found this post. Running YoloV2 on JetPack 3.3 I used to get 8-10 FPS; now on JetPack 4.2 I get barely 3 FPS. Similarly, running YoloV3 I used to get around 3 FPS on JetPack 3.3, while on 4.2 the memory fills up and the process dies with a segmentation fault. I have many more cases where I found this performance issue. I use OpenCV in all of them. I have tried both OpenCV 3.4.0 and OpenCV 4.0.0, and there is no major difference in performance. It would be very helpful if someone could diagnose this issue.

hello khadse.nukul,

There are some similar discussion threads about Yolo performance issues.
Please also check Topic 1060789 and Topic 1061155 for reference.

However, please try manually reducing the network resolution in the first few lines of yolov3.cfg; you might see a performance improvement.
For example:

width=416
height=416
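
As a rough guide to what such a change buys, the convolutional work of the network scales approximately with the number of input pixels. The sketch below estimates that ratio and checks that a chosen size is a multiple of 32, which YOLOv3 in Darknet expects because the network downsamples by a factor of 32. The function names are hypothetical, the 608x608 starting size in the usage note is only an assumed example, and the actual FPS gain will not match the ratio exactly.

```cpp
#include <cassert>

// True if v is usable as a YOLOv3 network width or height
// (positive and a multiple of the network's total stride of 32).
bool is_valid_yolo_dim(int v) {
  return v > 0 && v % 32 == 0;
}

// Ratio of input pixels between a new and an old network resolution.
// A value of 0.5 means roughly half the convolutional work per frame.
double pixel_ratio(int new_w, int new_h, int old_w, int old_h) {
  assert(old_w > 0 && old_h > 0);
  return (static_cast<double>(new_w) * new_h) /
         (static_cast<double>(old_w) * old_h);
}
```

For instance, pixel_ratio(416, 416, 608, 608) is a little under 0.47, so going from an assumed 608x608 down to 416x416 cuts the per-frame pixel count by more than half.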

Since JetPack-4.3 has now been publicly released, could you please upgrade to the latest JetPack release to confirm?
You should also open another discussion thread for further support.
Thanks