nvJPEG image decoding is slow

I have a single 728*478 RGB JPEG image. Decoding it in TensorFlow takes 2.6 ms (on CPU).

But when I decode the same image with nvJPEG (on a TITAN Xp), it takes 2.7-3.8 ms, and the time varies from run to run even though I set a warmup:
./nvJPEG -i 728x748.jpg -fmt rgb -o /tmp -w 100 -t 200

I also tried batched decoding:
./nvJPEG -i 728x748.jpg -fmt rgb -o /tmp -w 100 -b 4 -pipelined -batched -t 200
But that takes 15 ms per batch (3.75 ms per image).
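
For clarity, here is a minimal sketch of the measurement pattern behind these numbers: discard some warmup calls, average over timed calls, then divide the batch time by the batch size. It is only an illustration, not the code of the nvJPEG sample. The names `time_decode`, `per_image_ms` and the `decode` callable are hypothetical, the defaults of 100 and 200 merely echo the values in the commands above, and `decode` is assumed to block until the decode has really finished (GPU work would need a synchronization before the clock is read).

```python
import time


def per_image_ms(batch_ms, batch_size):
    """Convert a per-batch latency into a per-image latency."""
    return batch_ms / batch_size


def time_decode(decode, warmup=100, iters=200, batch_size=1):
    """Return (ms per call, ms per image), averaged over `iters` timed calls.

    `decode` is a hypothetical zero-argument callable that decodes one batch
    and returns only once the result is ready.
    """
    # Warmup calls are executed but not timed.
    for _ in range(warmup):
        decode()

    start = time.perf_counter()
    for _ in range(iters):
        decode()
    elapsed_ms = (time.perf_counter() - start) * 1000.0

    per_call = elapsed_ms / iters
    return per_call, per_image_ms(per_call, batch_size)


if __name__ == "__main__":
    # The batch figure from this post: 15 ms for a batch of 4 images.
    print(per_image_ms(15.0, 4))
```

With the figures above, `per_image_ms(15.0, 4)` gives 3.75, so batching did not lower the per-image cost compared with the 2.7-3.8 ms single-image runs.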

What should I do to speed up nvJPEG? It seems that something is wrong. By the way, is nvJPEG open source?