NVFortran OpenMP offloading to multiple GPUs

Sorry it took me a few days, but I have now tested this solution, and I am also getting good performance and correct results. This is great!

Thanks for your help!