Since the code is incomplete, I can only give my best guess at what will work for you. A full example would be helpful.
Given that, I’d say you probably want to do something like this:
    ! Add an OpenACC data region here (!$acc data plus the clauses your arrays need)
    do i = 1, times
       !$acc kernels loop async(1)
       do j = 1, pver
          ...
       end do
       !$acc kernels loop async(2)
       do k = 1, levels
          ...
       end do
       !$acc wait   ! use wait here if there is a dependency within the times loop
    end do   ! loop i
    !$acc wait      ! use wait here if there isn't a dependency within the times loop
    !$acc end data  ! close the data region opened above
Since the “times” loop has a dependency, you probably don’t want to offload it to the device. Inside a compute region, an outer sequential loop runs in “gang-redundant” mode, meaning every gang executes the same code. You’d then either have to run a single gang, which hurts performance, or run multiple gangs that each execute the inner loops redundantly.
You can also use async with different queue numbers so that the two inner loops execute concurrently on the device. Where the “wait” directive goes depends on what else happens in the times loop and whether the host needs any of the data brought back from the device.
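To make the async and wait placement concrete, here is a minimal, self-contained sketch. The arrays a and b, their sizes, and the loop bodies are made up for illustration since I don't have your actual code. It assumes the two inner loops don't depend on each other, so a single wait after the times loop is enough.

```fortran
program async_queues
   implicit none
   ! Hypothetical sizes and arrays, chosen only for illustration.
   integer, parameter :: times = 3, pver = 1000, levels = 2000
   real :: a(pver), b(levels)
   integer :: i, j, k

   a = 0.0
   b = 0.0

   ! Keep both arrays on the device for the whole times loop.
   !$acc data copy(a, b)
   do i = 1, times            ! stays on the host, runs sequentially
      ! Queue 1: work on the same queue executes in order across iterations.
      !$acc kernels loop async(1)
      do j = 1, pver
         a(j) = a(j) + 1.0
      end do

      ! Queue 2: independent of queue 1, so it may overlap with it.
      !$acc kernels loop async(2)
      do k = 1, levels
         b(k) = b(k) + 2.0
      end do
      ! If the host needed a or b here, an "!$acc wait" would go at this point.
   end do
   ! No dependency between the queues inside the times loop: wait once.
   !$acc wait
   !$acc end data

   ! Sanity check: each element was incremented "times" times.
   if (any(a /= real(times)) .or. any(b /= 2.0 * real(times))) error stop "unexpected result"
   print *, "a(1) =", a(1), " b(1) =", b(1)
end program async_queues
```

Built with OpenACC enabled (for example nvfortran -acc), the two kernels are launched on separate queues. Built without it, the directives are just comments and the program runs serially. Either way a(1) should come out as 3.0 and b(1) as 6.0.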
Hope this helps,