Unwrap: shift phase angles kernel implementation

Hello, I’m building a GPU-accelerated digital signal processing routine, and one of the kernels I’ve been working on is a phase unwrapping kernel. It takes a source vector of phase values and, whenever the jump between two consecutive points is greater than or equal to pi, adds multiples of ±2*pi until the jump is less than pi. To be clear, I’m trying to reproduce MATLAB’s unwrap command as a CUDA kernel.
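
To make the intended behaviour concrete, here is a minimal sequential sketch in C++ of the difference, wrap, and cumulative-correction formulation that is commonly used for unwrap. The function name `unwrap_reference` is hypothetical, and I am assuming a fixed tolerance of pi and double precision. It is meant as a host-side reference to compare a kernel against, not as a confirmed copy of MATLAB's internals.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Sequential reference for 1-D phase unwrapping (angles in radians).
// Hypothetical helper: name and signature are illustrative only.
std::vector<double> unwrap_reference(const std::vector<double>& phase) {
    const double kPi = 3.14159265358979323846;
    const double kTwoPi = 2.0 * kPi;

    std::vector<double> out(phase);
    double correction = 0.0;  // running sum of the 2*pi corrections so far

    for (std::size_t i = 1; i < phase.size(); ++i) {
        const double d = phase[i] - phase[i - 1];

        // Wrap the raw difference into [-pi, pi).
        double wrapped = std::fmod(d + kPi, kTwoPi);
        if (wrapped < 0.0) wrapped += kTwoPi;
        wrapped -= kPi;

        // A positive jump that wraps to exactly -pi is kept as +pi.
        if (wrapped == -kPi && d > 0.0) wrapped = kPi;

        // Only jumps of at least pi contribute a correction.
        if (std::fabs(d) >= kPi) correction += wrapped - d;

        out[i] = phase[i] + correction;
    }
    return out;
}
```

The running `correction` is a cumulative sum over per-element terms, so on a GPU it would typically become an inclusive prefix scan rather than a loop. If the kernel works in single precision, small differences from a double-precision reference like this one are to be expected, especially for jumps that land very close to pi.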

I have written an implementation that matches the output of the MATLAB command most of the time. However, for some signals my kernel does not produce exactly the same results as MATLAB’s unwrap.

Here is a link to my implementation of an unwrap kernel on GitHub in a text file.

My question is this: are there any open-source kernel implementations of an unwrap function? Barring this, do you have any recommendations for how I might improve on my kernel above?

Thank you for any advice you can give. Unfortunately, I do not have any mentors in my current organization who can help me with this task.

Have you looked at the CuPy implementation?