My hobby CUDA projects revolve around this topic, so you gave me an excuse to post:
[url]https://sites.google.com/site/cudapermutations/[/url]
Yes, 64-bit integer math is slow on consumer GPUs, but both of the above problems do a great deal of 64-bit integer math and still have great performance. Those benchmarks are a bit old, and a single GTX 1080 is about 20% faster for those applications than the GTX Titan X.
Also, your CUDA-Z numbers seem a bit low:
[url]http://imgur.com/lkN6by3[/url]
Are you on Windows or Linux?