Hello, GPU simulation only becomes faster for larger environments. It carries a fixed cost regardless of scene size, and with a small simulation that fixed cost outweighs the benefit.
If you have a single Franka, CPU simulation will most likely be the fastest, and single-threaded CPU (set the number of physics threads to 0) may well be the fastest option of all.
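To make the fixed-cost argument concrete, here is a rough toy model in Python. It assumes each backend's step time is a fixed overhead plus a per-environment cost. The function names and all of the numbers are hypothetical and chosen only for illustration; they are not measurements of any real simulator, so profile your own scene to find the real crossover.

```python
def step_time(num_envs, fixed_cost, per_env_cost):
    """Toy model: one step costs a fixed overhead plus a cost per environment."""
    return fixed_cost + per_env_cost * num_envs


def break_even_envs(cpu_fixed, cpu_per_env, gpu_fixed, gpu_per_env):
    """Smallest environment count at which the GPU model is strictly faster.

    Returns None if the GPU's per-environment cost is not lower than the
    CPU's, since this helper only covers the case where the GPU scales better.
    """
    slope = cpu_per_env - gpu_per_env
    if slope <= 0:
        return None
    extra_fixed = gpu_fixed - cpu_fixed
    # GPU wins when num_envs > extra_fixed / slope; take the next whole number.
    return max(1, int(extra_fixed // slope) + 1)


# Hypothetical costs in microseconds per step (illustrative, not measured).
CPU_FIXED, CPU_PER_ENV = 10, 50
GPU_FIXED, GPU_PER_ENV = 1000, 2

if __name__ == "__main__":
    for n in (1, 20, 21, 1000):
        cpu = step_time(n, CPU_FIXED, CPU_PER_ENV)
        gpu = step_time(n, GPU_FIXED, GPU_PER_ENV)
        print(f"{n:5d} envs: cpu={cpu} us, gpu={gpu} us")
    print("break-even:", break_even_envs(CPU_FIXED, CPU_PER_ENV, GPU_FIXED, GPU_PER_ENV))
```

With these made-up numbers a single environment is far cheaper on the CPU, and the GPU only pulls ahead from 21 environments onward. The real crossover depends on your hardware and scene, but the shape of the trade-off is the same.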