LUT function for 16-bit images in NVIDIA Performance Primitives (NPP)

NPP provides the function nppiLUT_16u_C1R for 16-bit images. A 16-bit image has 65536 brightness levels, but according to the NVIDIA documentation (NVIDIA 2D Image And Signal Performance Primitives (NPP): ColorLUT) the function supports a maximum of 1024 levels. I assume there is some trick to get around this limitation, but I have not managed to find any example that uses the function. Could you please help me?
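To make the goal concrete, here is a minimal CPU sketch of the full-range lookup I would like to reproduce on the GPU. It is only a reference for the expected result, not NPP code: the name applyLut16 is hypothetical, and the table is assumed to hold one entry for each of the 65536 possible input values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an NPP function): map every 16-bit pixel through
// a full table with one entry per possible input value.
std::vector<std::uint16_t> applyLut16(const std::vector<std::uint16_t>& src,
                                      const std::vector<std::uint16_t>& lut)
{
    // Assumption: the table covers the whole 16-bit range.
    assert(lut.size() == 65536);

    std::vector<std::uint16_t> dst(src.size());
    for (std::size_t i = 0; i < src.size(); ++i) {
        dst[i] = lut[src[i]];  // direct index, no interpolation between levels
    }
    return dst;
}
```

For example, filling the table with lut[v] = 65535 - v would invert the image, which is the kind of per-value mapping I cannot express with only 1024 levels.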