V4L2 camera stream works on 3/4 csi-ports, Argus (with GStreamer) works on 4/4 csi-ports

You are right! Your diagram hints at the correct solution. That was my mistake.
For me, a more understandable explanation came from this answer, since it describes exactly which components in my device tree I had to modify to make it work.

Interestingly, the Argus API worked totally fine despite my mistake, so anyone who sees Argus working while V4L2 fails on a port may have made the same mistake as I did.