Non-cryptographic CUDA hash function implementation

I am looking for a non-cryptographic hash function that returns an integer value and, if possible, CUDA code for it. I searched for GPU hash functions and found that most results are either cryptographic hash functions like these or hash tables implemented on the GPU. What I am looking for, however, is something like “djb2 hash function using CUDA”. Similar queries like these didn’t yield meaningful results.

So, is there a non-cryptographic hash function that can be parallelized using CUDA, or is there an existing implementation? Kindly point me to one if it exists.

How much data do you expect to be hashing per 32-bit hash value returned? Simple hash functions of good quality that hash a few words typically require about 20 instructions. They may exploit some instruction-level parallelism, but it does not make sense to split a single hash computation across threads.
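To make the "one hash per thread" point concrete, here is a minimal sketch, not a tuned or definitive implementation. It assumes the keys are either short byte strings (hashed with djb2, as named in the question) or single 32-bit words (mixed with the MurmurHash3 32-bit finalizer, which is my own choice of example and not something suggested above). The names `HD`, `fmix32`, and `hash_keys` are hypothetical. Parallelism comes from hashing many keys at once, with each thread computing one complete hash.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper macro: lets the same functions build with nvcc
// (host + device) or with a plain C++ compiler (host only).
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// djb2: h = h * 33 + byte, starting from 5381, truncated to 32 bits.
HD inline uint32_t djb2(const char* data, size_t len) {
    uint32_t h = 5381u;
    for (size_t i = 0; i < len; ++i) {
        h = h * 33u + static_cast<unsigned char>(data[i]);
    }
    return h;
}

// MurmurHash3 32-bit finalizer: a short integer mixer (assumed here as an
// example of a cheap hash for one-word keys).
HD inline uint32_t fmix32(uint32_t h) {
    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}

#ifdef __CUDACC__
// Hypothetical kernel: each thread hashes one 32-bit key on its own.
// No single hash computation is split across threads.
__global__ void hash_keys(const uint32_t* keys, uint32_t* out, size_t n) {
    size_t i = blockIdx.x * static_cast<size_t>(blockDim.x) + threadIdx.x;
    if (i < n) {
        out[i] = fmix32(keys[i]);
    }
}
#endif
```

With nvcc, a launch could look like `hash_keys<<<(n + 255) / 256, 256>>>(d_keys, d_out, n);`, where `d_keys` and `d_out` are assumed to be device buffers of `n` elements that you have already allocated and filled. For variable-length keys you would instead pass offsets and lengths and have each thread call `djb2` on its own key.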