Slow memmove on Jetson Nano

I wrote a small C++ sample that I hope is equivalent to the Go program:

// Copy 100 MiB one byte at a time with push_back and time the loop

#include <iostream>
#include <chrono>
#include <vector>

int main() {
    // Source buffer: 100 MiB of zero-initialized bytes
    std::vector<char> v(1024*1024*100);
    // Destination starts empty, so push_back has to reallocate as it grows
    std::vector<char> v2;

    std::chrono::steady_clock::time_point begin = std::chrono::steady_clock::now();
    for (const auto a : v) {
        v2.push_back(a);
    }
    std::chrono::steady_clock::time_point end = std::chrono::steady_clock::now();
    // Report the elapsed wall-clock time of the copy loop in milliseconds
    std::cout << "Needed: " << std::chrono::duration_cast<std::chrono::milliseconds>(end - begin).count() << "[ms]" << std::endl;
    return 0;
}

It takes 6.2 s, and that is pretty consistent between runs. That is very different from the Go example.