Link errors with nvtxRangePop and nvtxRangePushEx

I’ve been having this problem for a while and have now had time to create a small test case that illustrates it.
Here is a source file (testnv.cpp) that shows the problem:

#include <stdio.h>  // for printf
#include <stdint.h> // for uint32_t
#include <unistd.h> // for usleep
#include <nvToolsExt.h>

// ARGB colors to cycle through for the profiler ranges
const uint32_t colors[] = { 0x00008800, 0x00000088 };
const int num_colors = sizeof(colors)/sizeof(uint32_t);

// Open a named NVTX range, colored by cid
void psPushIt(const char* name, int cid)
{
    int color_id = cid;
    color_id = color_id%num_colors;
    nvtxEventAttributes_t eventAttrib = {0};
    eventAttrib.version = NVTX_VERSION;
    eventAttrib.size = NVTX_EVENT_ATTRIB_STRUCT_SIZE;
    eventAttrib.colorType = NVTX_COLOR_ARGB;
    eventAttrib.color = colors[color_id];
    eventAttrib.messageType = NVTX_MESSAGE_TYPE_ASCII;
    eventAttrib.message.ascii = name;
    nvtxRangePushEx(&eventAttrib);
}

// Close the most recently opened NVTX range
void psPopIt()
{
    nvtxRangePop();
}

int main( int argc, char** argv )
{
    printf("test 1\n");
    psPushIt("test1",0);
    // do real work here
    usleep(1000); // pretend work
    psPopIt();
    return 0;
}

The environment is a 64-bit Dell machine running Ubuntu 14.04.3 with g++ 4.8.4 and CUDA 7.5.18.
Here is the compile command that fails:
g++ -o testnv -I/usr/local/cuda/include -L/usr/local/cuda/lib64 -lnvToolsExt testnv.cpp
/tmp/ccyuqY0W.o: In function ‘psPushIt(char const*, int)’:
testnv.cpp:(.text+0x79): undefined reference to ‘nvtxRangePushEx’
/tmp/ccyuqY0W.o: In function ‘psPopIt()’:
testnv.cpp:(.text+0x84): undefined reference to ‘nvtxRangePop’

If I build this as a shared library (.so) with all the appropriate options (but without -Wl,-z,defs), the compile and link succeed, but the code will not run unless the following environment variable is set:
LD_PRELOAD=/usr/local/cuda/lib64/libnvToolsExt.so

This problem does not occur with CUDA 7.0.
I tried this on several different machines and it only fails on CUDA 7.5.
Did something change here? How do I make this work?

-David

Your linking order is incorrect: -lnvToolsExt has to come after the source file that references it.

Try:
g++ -o testnv -I/usr/local/cuda/include testnv.cpp -L/usr/local/cuda/lib64 -lnvToolsExt

Ok, that worked.

But why did it work?
Also, what changed after 7.0 that required this specific order of parameters?

-David

The linking order rules are established by the GNU tools, not by CUDA. Did you change g++ versions?

Your original compile/link command:

g++ -o testnv -I/usr/local/cuda/include -L/usr/local/cuda/lib64 -lnvToolsExt testnv.cpp

works just fine for me with CUDA 7.5.18 on Fedora 20 with g++ 4.8.3.