CUDA link error on Windows 7

I have a CUDA project that compiles and links fine when a particular __global__ function is templatized.
But if I untemplatize the function, I get link errors complaining that the function is already defined in that file.
Both the definition and the call site are in the same file; the latter comes after the former.
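
To make the setup concrete, here is a minimal sketch of the pattern described above. The kernel names (scaleKernelT, scaleKernel) and the helper runBoth are hypothetical stand-ins, not the actual radix sort code, and the kernel body is deliberately trivial.

```cuda
#include <cuda_runtime.h>

// Hypothetical stand-ins for the real radix sort kernels.

// Templated form: this is the variant that builds and links fine.
template <typename T>
__global__ void scaleKernelT(T *data, T factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Untemplated form: same body with T fixed to float.
// In the real project, this kind of kernel is the one reported as "already defined".
__global__ void scaleKernel(float *data, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Call site, in the same file, after both definitions.
// Returns true if both launches ran and produced the expected values.
bool runBoth()
{
    const int n = 256;
    float host[n];
    for (int i = 0; i < n; ++i) host[i] = 1.0f;

    float *dev = 0;
    if (cudaMalloc((void **)&dev, n * sizeof(float)) != cudaSuccess) return false;
    cudaMemcpy(dev, host, n * sizeof(float), cudaMemcpyHostToDevice);

    scaleKernelT<float><<<(n + 127) / 128, 128>>>(dev, 2.0f, n);
    scaleKernel<<<(n + 127) / 128, 128>>>(dev, 3.0f, n);

    // The blocking copy back also waits for both kernels to finish.
    cudaError_t err = cudaMemcpy(host, dev, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dev);
    if (err != cudaSuccess) return false;

    // Each element starts at 1, is doubled, then tripled.
    for (int i = 0; i < n; ++i)
        if (host[i] != 6.0f) return false;
    return true;
}
```

Built as a single .cu file, a sketch like this should link without errors. One thing worth ruling out (an assumption, not a confirmed cause) is whether the file containing the kernel ends up in more than one translation unit, for example through an #include: a function template may be defined in several translation units, while an ordinary non-inline function may not.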

  I am trying to simplify a heavily templated version of the radix sort algorithm available here -

  Any ideas about what could be going on? I can share the VS2008 project with anyone who is interested.