Illegal memory access with unified memory

I can reproduce the issue on a GTX 970 (Maxwell). It failed after 736 iterations on my first try and after 888 iterations on my second. Curiously, if I run it under compute-sanitizer, it runs for at least 3000 iterations without error.
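For anyone else trying to pin down the failing iteration, here is a minimal sketch of the kind of loop I mean. It is not the original code from this thread: the kernel `touch`, the buffer size, and the iteration cap are all placeholders I made up, so substitute the real kernel. The only point is to check for errors after every launch so the first failing iteration gets reported.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in kernel; replace with the kernel from the failing program.
__global__ void touch(int *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1;
}

int main() {
    const int n = 1 << 20;      // assumed buffer size
    const int maxIters = 3000;  // assumed iteration cap
    int *data = nullptr;

    // Allocate unified (managed) memory visible to both host and device.
    cudaError_t err = cudaMallocManaged(&data, n * sizeof(int));
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMallocManaged: %s\n", cudaGetErrorString(err));
        return 1;
    }
    for (int i = 0; i < n; ++i) data[i] = 0;  // initialize on the host

    for (int iter = 0; iter < maxIters; ++iter) {
        touch<<<(n + 255) / 256, 256>>>(data, n);

        // cudaGetLastError catches launch failures; cudaDeviceSynchronize
        // surfaces errors raised while the kernel was executing.
        err = cudaGetLastError();
        if (err == cudaSuccess) err = cudaDeviceSynchronize();
        if (err != cudaSuccess) {
            fprintf(stderr, "iteration %d: %s\n", iter, cudaGetErrorString(err));
            return 1;
        }
    }

    printf("completed %d iterations without error\n", maxIters);
    cudaFree(data);
    return 0;
}
```

Synchronizing after every launch changes the timing, so an intermittent failure may show up at a different iteration count or less often than in the original program.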

I suggest filing a bug.