C++ example of running the same thread function across multiple cores with different data sets

ayewinoung · June 11, 2020, 12:28am

Can anyone point me to a tutorial or examples showing how I can port my existing C/C++ threads to CUDA cores? Also, is there a specification for how big the program code (not the data) should be for optimal porting to CUDA?

Most of the examples I have found show how to distribute the processing of a big data set, rather than how to distribute threads across cores. Sorry if I'm not making sense.
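
To make the question concrete, here is a minimal sketch of the CPU-side pattern being described, assuming standard C++ `std::thread`. It is not the actual code in question: the names `sum_slice` and `run_same_thread_on_slices` are hypothetical and only illustrate one thread function being launched several times, each time on a different slice of the data.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical worker: every thread runs this same function on its own slice
// [begin, end) of the input and writes its result to its own output slot.
void sum_slice(const std::vector<int>& data, std::size_t begin, std::size_t end,
               long long& out) {
    out = std::accumulate(data.begin() + begin, data.begin() + end, 0LL);
}

// Hypothetical launcher: starts num_threads copies of sum_slice, each on a
// different slice, and returns one partial sum per thread.
std::vector<long long> run_same_thread_on_slices(const std::vector<int>& data,
                                                 std::size_t num_threads) {
    assert(num_threads > 0);
    std::vector<long long> partial(num_threads, 0);
    std::vector<std::thread> workers;
    // Slice size, rounded up so that every element is covered.
    const std::size_t chunk = (data.size() + num_threads - 1) / num_threads;
    for (std::size_t t = 0; t < num_threads; ++t) {
        const std::size_t begin = std::min(t * chunk, data.size());
        const std::size_t end = std::min(begin + chunk, data.size());
        workers.emplace_back(sum_slice, std::cref(data), begin, end,
                             std::ref(partial[t]));
    }
    // Wait for all threads before reading the partial results.
    for (auto& w : workers) {
        w.join();
    }
    return partial;
}
```

Calling `run_same_thread_on_slices(data, 4)` on a ten-element vector launches four threads and returns four partial sums, one per slice.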

Thanks in advance.