Root finding in CUDA

Hi, I am struggling to find a way to solve a simple equation on CUDA: y - atan(y) = 0. I have found cuSolver, but it does not handle implicit equations like this. Is there a library that lets you compute an equation's root in a convenient way, like the GSL One-Dimensional Root-Finding interface does, for example?

How many of these equations do you have to solve in parallel?

I have a matrix of 256 x 248 points, and at each point I need to solve this relation. The range of y is [0, 1]. Meanwhile I have found this: GitHub - szunami/optimize: Simple cuda optimization library, but it is highly inefficient, as can be seen from the plots there.
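For a problem of this shape (one independent scalar equation per matrix point, with a known bracket [0, 1]) you may not need a library at all: one thread per point, each running its own bisection, is embarrassingly parallel. Below is a minimal sketch of that idea; the kernel name solve_grid, the placeholder residual f(), the iteration count, and the launch configuration are my own assumptions, not part of any existing library.

```cpp
#include <cuda_runtime.h>

// Placeholder residual; per-point parameters would be added here.
__device__ double f(double y)
{
    return y - atan(y);
}

// One thread per matrix point, each running a fixed number of bisection steps
// on [0, 1]; assumes the root of f lies inside that interval.
__global__ void solve_grid(double* roots, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    double lo = 0.0, hi = 1.0;
    double flo = f(lo);
    for (int it = 0; it < 60; ++it) {          // ~60 halvings exhaust double precision
        double mid  = 0.5 * (lo + hi);
        double fmid = f(mid);
        if ((flo <= 0.0) == (fmid <= 0.0)) { lo = mid; flo = fmid; }
        else                               { hi = mid; }
    }
    roots[i] = 0.5 * (lo + hi);
}

int main()
{
    const int n = 256 * 248;                   // one root per matrix point
    double* d_roots = nullptr;
    cudaMalloc(&d_roots, n * sizeof(double));
    solve_grid<<<(n + 255) / 256, 256>>>(d_roots, n);
    cudaDeviceSynchronize();
    cudaFree(d_roots);
    return 0;
}
```

Bisection is slower per point than Newton, but it needs no derivative and cannot diverge as long as the root stays bracketed in [0, 1].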

I am not aware of any CUDA-accelerated library for general root finding, nor of any published papers on the topic. I am aware of a few papers looking into CUDA for parallelized root finding for polynomials of extremely high degree, but I don't know whether the researchers found a worthwhile speedup. This doesn't mean such code doesn't exist somewhere, just that I am not aware of it.

Given how long GPU computing and CUDA have been around, the lack of publications seems to indicate one or both of two scenarios: (1) there is hardly any interest in speeding up general root finding, as it is not typically a bottleneck in HPC; (2) it is very hard to achieve a meaningful speedup for general root finding on CUDA-enabled platforms.

BTW, is the equation above (y - atan(y) = 0) your actual equation? If so, unless I am missing something, the only solution over the reals is y = 0.

Thank you for clarifying the actual situation. To be precise, the equation also has some multipliers next to y that change with each iteration. I apologize for this mistake; I tried to state the question in a simple manner.
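If the multipliers enter, for example, as y - a*atan(y) = 0 with a different coefficient a per point (that exact form is only a guess, since the actual equation wasn't posted), each thread can read its own coefficient and run a few Newton steps, along the lines of:

```cpp
#include <cuda_runtime.h>

// Hypothetical per-point form: y - a*atan(y) = 0, with the coefficient a
// varying per matrix point; the actual equation was not posted.
__device__ double solve_point(double a, double y0)
{
    double y = y0;
    for (int it = 0; it < 20; ++it) {
        double fy  = y - a * atan(y);           // residual
        double dfy = 1.0 - a / (1.0 + y * y);   // derivative of the residual
        if (fabs(dfy) < 1e-12) break;           // avoid dividing by ~0
        y -= fy / dfy;                          // Newton update
    }
    return y;
}

// One thread per point; 'coeff' holds the per-point multipliers.
__global__ void solve_grid_newton(const double* coeff, double* roots, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) roots[i] = solve_point(coeff[i], 0.5 /* initial guess in [0,1] */);
}
```

Newton converges quickly near a simple root, but if the derivative 1 - a/(1 + y^2) can get close to zero on [0, 1], the bracketing bisection from the earlier sketch is the safer choice.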