Inline PTX assembly example

I’ve been searching the forum but haven’t found many good examples of inlining PTX code into CUDA C code. Could anybody provide links or documents on this topic? Thanks a lot.

See here: http://forums.nvidia.com/index.php?showtopic=80020
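
In case a short inline example helps alongside the link, here is a minimal sketch of the GCC-style `asm()` syntax that nvcc accepts in device code. It is not taken from the linked thread; the names `ptx_add`, `add_kernel`, and `add_on_gpu` are made up for illustration, and the host helper assumes `n > 0` and a CUDA-capable device.

```cuda
#include <cuda_runtime.h>

// Hypothetical device helper: adds two ints with one inline PTX instruction.
// %0, %1, %2 refer to the operands in the order they are listed.
// "=r" marks a 32-bit register output, "r" a 32-bit register input.
__device__ int ptx_add(int a, int b)
{
    int result;
    asm("add.s32 %0, %1, %2;" : "=r"(result) : "r"(a), "r"(b));
    return result;
}

// Each thread adds one pair of elements through the PTX helper.
__global__ void add_kernel(const int *a, const int *b, int *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = ptx_add(a[i], b[i]);
    }
}

// Hypothetical host helper: copies inputs to the device, runs the kernel,
// and copies the result back. Assumes n > 0.
cudaError_t add_on_gpu(const int *a, const int *b, int *out, int n)
{
    int *d_a = nullptr, *d_b = nullptr, *d_out = nullptr;
    size_t bytes = static_cast<size_t>(n) * sizeof(int);

    cudaError_t err = cudaMalloc(&d_a, bytes);
    if (err == cudaSuccess) err = cudaMalloc(&d_b, bytes);
    if (err == cudaSuccess) err = cudaMalloc(&d_out, bytes);
    if (err == cudaSuccess) err = cudaMemcpy(d_a, a, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) err = cudaMemcpy(d_b, b, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        int threads = 128;
        int blocks = (n + threads - 1) / threads;
        add_kernel<<<blocks, threads>>>(d_a, d_b, d_out, n);
        err = cudaGetLastError();  // catches launch configuration errors
    }
    // This blocking copy also waits for the kernel to finish.
    if (err == cudaSuccess) err = cudaMemcpy(out, d_out, bytes, cudaMemcpyDeviceToHost);

    // cudaFree on a null pointer is a no-op, so cleanup is safe on any path.
    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_out);
    return err;
}
```

Call `add_on_gpu` from your own `main` and compile with nvcc. The constraint letters are the part people usually trip over: as far as I know, `r` is a 32-bit integer register, `l` a 64-bit one, `f` a 32-bit float, and `d` a double. A literal `%` in the PTX text (for example a special register such as `%laneid`) has to be written as `%%` once the statement has operands.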