I tried my best to apply `__restrict__` to every base pointer in my (input and) output structs, but nvcc still generates 2 loads and 2 stores (example here: Compiler Explorer).
If I change my kernel so that it does not use structs of arrays (SoAs) and instead takes `__restrict__`-decorated base pointers directly as kernel arguments, everything works as expected (1 load, 1 store) (example here: Compiler Explorer).
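For readers without the links, here is a minimal sketch of the two variants being compared. It is not the exact code from the Compiler Explorer examples: the names `InSoA`, `OutSoA`, `body_soa`, `body_flat` and the kernel body are made up for illustration, and the structs are reduced to one pointer each. The body is of the kind where aliasing matters: unless the compiler can prove `in` and `out` do not overlap, it has to write the first result to memory and read `in[i]` again.

```cpp
// Hypothetical illustration only; all names here are assumptions.
// The fallback macros let the same file build as plain C++ on the host.
#ifndef __CUDACC__
#define __host__
#define __device__
#endif

// Variant A: struct-of-arrays wrappers with restrict-qualified members.
struct InSoA  { const float* __restrict__ in; };
struct OutSoA { float* __restrict__ out; };

__host__ __device__ inline void body_soa(InSoA s, OutSoA d, int i) {
    d.out[i] = s.in[i] * 2.0f;
    d.out[i] += s.in[i];   // if aliasing is assumed, s.in[i] must be reloaded
}

// Variant B: the same work with restrict-qualified pointer parameters.
__host__ __device__ inline void body_flat(const float* __restrict__ in,
                                          float* __restrict__ out, int i) {
    out[i] = in[i] * 2.0f;
    out[i] += in[i];       // no-alias is known, so in[i] can stay in a register
}

#ifdef __CUDACC__
// Kernel wrappers: one thread per element.
__global__ void kernel_soa(InSoA s, OutSoA d, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) body_soa(s, d, i);
}

__global__ void kernel_flat(const float* __restrict__ in,
                            float* __restrict__ out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) body_flat(in, out, i);
}
#endif
```

Both variants compute the same values; the only difference is where the `__restrict__` qualifier sits, which is exactly what the question is about.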
So what do I need to change so that `__restrict__` carries over to the base pointers inside my structs? :)
Cheers,
Klaus