How to realize 1D cross-correlation between two signals on the GPU

We are trying to implement a 1D cross-correlation between two signals of different lengths.
At the moment we are planning to use NPP's normalized cross-correlation nppiCrossCorrSame… and limit it to a single line (an image of height 1). But we are not sure whether this works as intended, or whether there is a better way to implement a 1D cross-correlation on the GPU.
I have not found a 1D correlation function in the NPP signal processing module.
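
To make clear what result we expect, here is a minimal CPU reference sketch that a GPU result could be checked against. The function name crossCorrSameNorm1D is our own (hypothetical), and the conventions are assumptions rather than NPP's documented behaviour: the output has the same length as the signal, the template is centered on each output sample, and samples outside the signal are treated as zero.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// CPU reference (hypothetical helper, not part of NPP):
// normalized 1D cross-correlation with "same"-size output.
// Assumptions: template centered on each output sample,
// out-of-range signal samples count as zero.
std::vector<float> crossCorrSameNorm1D(const std::vector<float>& signal,
                                       const std::vector<float>& tpl)
{
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(signal.size());
    const std::ptrdiff_t m = static_cast<std::ptrdiff_t>(tpl.size());
    std::vector<float> out(signal.size(), 0.0f);
    if (n == 0 || m == 0) return out;

    // Energy of the template, shared by every output sample.
    double tplEnergy = 0.0;
    for (float t : tpl) tplEnergy += static_cast<double>(t) * t;

    const std::ptrdiff_t half = m / 2;  // offset that centers the template
    for (std::ptrdiff_t i = 0; i < n; ++i) {
        double dot = 0.0;        // sum of signal * template over the window
        double winEnergy = 0.0;  // energy of the signal inside the window
        for (std::ptrdiff_t k = 0; k < m; ++k) {
            const std::ptrdiff_t j = i + k - half;
            if (j < 0 || j >= n) continue;  // zero padding
            const double s = signal[static_cast<std::size_t>(j)];
            dot += s * tpl[static_cast<std::size_t>(k)];
            winEnergy += s * s;
        }
        const double denom = std::sqrt(winEnergy * tplEnergy);
        out[static_cast<std::size_t>(i)] =
            denom > 0.0 ? static_cast<float>(dot / denom) : 0.0f;
    }
    return out;
}
```

With this definition, the output peaks at 1.0 at the position where the signal window is an exact scaled copy of the template, which is the behaviour we would like to reproduce on the GPU.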

Has anyone implemented a similar correlation? Do you have any suggestions?

Thank you very much!