Parallel machine learning algorithms in CUDA

Are there any resources that describe and explain machine learning algorithms implemented in CUDA C for the purposes of learning?

Check out https://developer.nvidia.com/deep-learning

I’m aware of cuDNN, but it’s highly optimized code that wasn’t written for readability. I was looking for a learning resource rather than a library to use as a black box.

I think what you are looking for is a good (scientific) book on machine/deep learning. Read that first. I searched Google and found this course page at the Technion, Israel, which also has a recommended reading list:

http://int.technion.ac.il/046195-introduction-to-machine-learning/

In case this is of any use: https://www.h2o.ai/gpu/