Migrating Python Minuit to C++ CUDA

Does anyone have an idea of where to start with porting Python Minuit code to C++ CUDA?
(Posted at the request of a colleague.)
gpuML-1-1.pdf (2.9 MB)
H_All.txt (796 Bytes)
BD_minuit_minimization_for_GPU.py (6.9 KB)