I have a MATLAB script that calls the `lsqnonlin` function many times, each call independent of the others (you can think of a simulation). Unfortunately, the function is GPU-incompatible. I am not sharing the code because only that function is relevant to the question.

Minimizing a sum of squared residuals (which is what `lsqnonlin` does) is a pretty standard problem from a mathematical point of view, and I am surprised that I cannot find such a function among the CUDA libraries. For example, cuSOLVER supports only linear algebra, but I (and many others) need *non*linear operations.
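
To be precise about the problem I mean: as far as I understand the standard formulation, each call solves a nonlinear least-squares problem of the following form. The symbols here are my own notation for illustration: $f_i$ are the components of a nonlinear residual function, $x$ is the parameter vector, and $lb$, $ub$ are optional bounds.

$$
\min_{x} \; \lVert f(x) \rVert_2^2 \;=\; \min_{x} \; \sum_{i=1}^{m} f_i(x)^2, \qquad lb \le x \le ub .
$$

In my case there are many such problems, each with its own data and each solvable independently of the others, which is why running them in parallel on a GPU seems natural.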

So I would appreciate advice on a GPU-compatible function that performs nonlinear minimization. Am I wrong, and is there in fact such a function in the CUDA libraries? If not, could you please recommend one, either as a third-party CUDA library or in another language such as Python or R? Thanks a lot in advance.