Hi all,
We recently noticed a large performance gap between the Denver and A57 cores on the TX2. The same application uses less CPU and finishes faster on an A57 core; on a Denver core, CPU usage is roughly double that of the A57 and the computation takes roughly three times as long.
Our test system environment:
- Jetpack 4.3 [L4T 32.3.1]
- NV Power Mode: MAXN - Type: 0
- All cores locked to 2.0 GHz
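For reference, the power mode and clock settings above can be applied with the standard L4T tools, roughly as follows (a sketch; requires root on the Jetson, and exact mode numbering can vary by L4T release):

```shell
# Select the MAXN power profile (mode 0) so all CPU clusters are enabled
sudo nvpmodel -m 0
# Query the active mode to confirm
sudo nvpmodel -q
# Lock CPU/GPU/EMC clocks at their maximums for stable benchmarking
sudo jetson_clocks
```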
Code:
#include <iostream>
#include <unistd.h>
#include <math.h>
#include <chrono>
#include <string>
template <typename T>
inline int64_t get_utc_now()
{
    auto time_now = std::chrono::time_point_cast<T>(std::chrono::high_resolution_clock::now());
    return time_now.time_since_epoch().count();
}

#define get_utc_s get_utc_now<std::chrono::seconds>
#define get_utc_ms get_utc_now<std::chrono::milliseconds>
#define get_utc_us get_utc_now<std::chrono::microseconds>
#define get_utc_ns get_utc_now<std::chrono::nanoseconds>

/* Very simple 64-bit trial-division primality test; returns the number of
   primes found so the compiler cannot optimize the whole loop away. */
int cpu_prime_event(unsigned long long max_prime)
{
    unsigned long long c;
    unsigned long long l;
    double t;
    unsigned long long n = 0;
    for (c = 3; c < max_prime; c++)
    {
        t = sqrt((double)c);
        for (l = 2; l <= t; l++)
            if (c % l == 0)
                break;
        if (l > t)
            n++;
    }
    return (int)n;
}

int main(int argc, char const *argv[])
{
    int maxprime = 200000;
    if (argc > 1)
    {
        maxprime = std::stoi(argv[1]);
    }
    int count = 1;
    while (true)
    {
        auto t1 = get_utc_us();
        auto primes = cpu_prime_event(maxprime);
        auto t2 = get_utc_us();
        std::cout << "running ---> " << count++ << ", primes: " << primes
                  << ", time: " << t2 - t1 << " us" << std::endl;
        usleep(100000);
    }
    return 0;
}
Result on A57 (cpu0):
- calculate time: ~40026 us
- CPU usage: ~28%
Result on Denver (cpu1):
- calculate time: ~134600 us
- CPU usage: ~57%
- Are there published performance benchmarks for the Denver and A57 cores?
- What differences between the two cores could explain this gap?