I am new to Hyper-Q technology. I have a project with only one kernel. I need to partition the data and process the partitions concurrently using OpenMP or any other technology related to Hyper-Q. Most examples in the book “Professional CUDA C Programming” run a set of kernels concurrently without considering how to divide the data. Can anyone help me?
This training session presents such a code, both in the session itself (slide 11) and in the homework. A recording of the session is also available.
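For readers who want the general shape of the idea without opening the slides, here is a minimal sketch. It is not the code from the training session. It assumes a hypothetical element-wise kernel called `scale` standing in for your single kernel, and a hypothetical helper `run_partitioned` that splits the array into chunks and issues copy, kernel, and copy-back for each chunk into its own CUDA stream. On a Hyper-Q capable GPU, work in different streams is eligible to overlap; whether it actually does depends on the device and on how many resources each launch uses.

```cuda
#include <algorithm>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical stand-in for the project's single kernel: scales each element.
__global__ void scale(float *data, int n, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Hypothetical helper: splits n elements into num_chunks pieces and runs the
// same kernel on each piece in a separate stream.
// Returns the number of wrong elements, or -1 on a CUDA error.
int run_partitioned(int n, int num_chunks)
{
    float *h = nullptr, *d = nullptr;
    // Pinned host memory is needed for copies to be truly asynchronous.
    if (cudaMallocHost(&h, n * sizeof(float)) != cudaSuccess) return -1;
    if (cudaMalloc(&d, n * sizeof(float)) != cudaSuccess) {
        cudaFreeHost(h);
        return -1;
    }
    for (int i = 0; i < n; ++i) h[i] = 1.0f;

    std::vector<cudaStream_t> streams(num_chunks);
    for (auto &s : streams) cudaStreamCreate(&s);

    const int threads = 256;
    const int chunk = (n + num_chunks - 1) / num_chunks;  // ceiling division
    for (int c = 0; c < num_chunks; ++c) {
        int offset = c * chunk;
        int count = std::min(chunk, n - offset);  // last chunk may be shorter
        if (count <= 0) break;
        size_t bytes = count * sizeof(float);
        int blocks = (count + threads - 1) / threads;

        // Everything for this chunk goes into stream c, so chunks are
        // independent of each other and may overlap on the device.
        cudaMemcpyAsync(d + offset, h + offset, bytes,
                        cudaMemcpyHostToDevice, streams[c]);
        scale<<<blocks, threads, 0, streams[c]>>>(d + offset, count, 2.0f);
        cudaMemcpyAsync(h + offset, d + offset, bytes,
                        cudaMemcpyDeviceToHost, streams[c]);
    }

    // Wait for all streams to finish and pick up any asynchronous error.
    cudaError_t err = cudaDeviceSynchronize();

    int wrong = 0;
    if (err == cudaSuccess) {
        for (int i = 0; i < n; ++i)
            if (h[i] != 2.0f) ++wrong;
    }

    for (auto &s : streams) cudaStreamDestroy(s);
    cudaFree(d);
    cudaFreeHost(h);
    return err == cudaSuccess ? wrong : -1;
}

int main()
{
    int wrong = run_partitioned(1 << 20, 4);
    if (wrong < 0) {
        printf("CUDA error\n");
        return 1;
    }
    printf("mismatches: %d\n", wrong);
    return wrong == 0 ? 0 : 1;
}
```

A single host thread issues all the work here. If you prefer OpenMP, one option is to let each OpenMP thread issue one chunk into its own stream, but that is not required for the streams to run concurrently. A profiler timeline is the usual way to check whether the chunks really overlap on your GPU.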
Thanks a lot, Robert.