Map performance - CUDA Thrust

Are there any benchmarks for the Thrust map for CUDA? For example, how many keys can be inserted into a map, and what is the memory footprint?

I have created a similar data structure that can exploit multiple cores (x86 or SIMD cores). Compared with the STL map, it uses less memory and is also faster at inserting keys.
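For reference, this is roughly how the STL baseline for such a comparison could be measured. It is only a minimal sketch: the names `InsertResult` and `benchmark_std_map_insert` are made up for illustration and are not taken from the proof of concept, and the memory figure is a rough estimate that assumes each tree node stores the key/value pair plus about four pointer-sized words of bookkeeping (the real overhead depends on the standard library and the allocator).

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <map>
#include <random>
#include <utility>
#include <vector>

// Hypothetical helper type: result of one std::map insertion run.
struct InsertResult {
    std::size_t unique_keys;   // distinct keys actually stored
    double seconds;            // wall-clock time of the insert loop only
    std::size_t approx_bytes;  // rough footprint estimate (see assumption below)
};

// Inserts n pseudo-random 64-bit keys into a std::map and times the inserts.
InsertResult benchmark_std_map_insert(std::size_t n, std::uint64_t seed = 42) {
    // Generate the keys up front so the random generator is not timed.
    std::mt19937_64 rng(seed);
    std::vector<std::uint64_t> keys(n);
    for (auto& k : keys) k = rng();

    std::map<std::uint64_t, std::uint64_t> m;
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < n; ++i) {
        m.emplace(keys[i], i);  // a duplicate key leaves the map unchanged
    }
    const auto stop = std::chrono::steady_clock::now();

    // Assumption: payload plus ~4 pointer-sized words per red-black tree node.
    const std::size_t per_node =
        sizeof(std::pair<const std::uint64_t, std::uint64_t>) + 4 * sizeof(void*);

    return InsertResult{
        m.size(),
        std::chrono::duration<double>(stop - start).count(),
        m.size() * per_node};
}
```

Calling `benchmark_std_map_insert(1000000)` and printing `seconds` and `approx_bytes` gives a single-threaded baseline. For a fair comparison, the same key set should be fed to the other data structure, and the real memory use should be checked with a profiler rather than this estimate.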

A description of the data structure can be found here:

and a proof of concept can also be found there.