Thrust::device_vector Causing a Segmentation Fault in NVTX

I’m hitting a segmentation fault when creating a thrust::device_vector in one of my programs. The fault occurs in nvtxInit.h at line 401:
entryPointStatus = init_fnptr(NVTX_VERSIONED_IDENTIFIER(nvtxGetExportTable));
It happens the first time the code tries to create a device_vector, of any type and any size, when following the best practices described in the Thrust Quick Start Guide.

I’m running this on a Dell Precision 7780 laptop with Ubuntu 22.04 and an RTX 4000 Ada Generation Laptop GPU with 12 GB of VRAM, using CUDA Toolkit 12.9 and NVTX3. I’ve also seen the issue with version 12.6 of the toolkit.

After a little more debugging, I discovered that I could take a small host vector and use that to create a device vector.

thrust::device_vector<int> foo1(1); // Causes seg fault in nvtxInit.h

// these lines work
thrust::host_vector<int> host1(10, 5);
thrust::device_vector<int> foo3 = host1;

Even with this workaround, if I try to allocate memory for a device_vector or construct one from a larger host vector, I still get a segmentation fault because nvtxGetExportTable is undefined in nvtxInit.h.

That Thrust Quick Start Guide is ancient (7 years old). Thrust has evolved quite a bit since then. If you were using CUDA 9.1, which that guide pertains to, it would probably be sensible guidance.

I don’t seem to have any trouble creating a thrust::device_vector in CUDA 12.8.1.

It may help for you to provide the shortest possible complete example that shows the error.
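
For reference, a complete example along those lines might look like the sketch below. It only combines the two cases quoted in the question; the file name repro.cu, the build command in the comment, and the cudaGetDeviceCount check are assumptions added for illustration and are not taken from the original program. Progress is printed to stderr so the last message is not lost in a buffer if the process crashes.

```cuda
// repro.cu (hypothetical file name). Assumed build command: nvcc -o repro repro.cu
#include <thrust/device_vector.h>
#include <thrust/host_vector.h>
#include <cuda_runtime.h>
#include <cstdio>

int main() {
    // Added check (not in the original post): confirm the CUDA runtime
    // can see a device before Thrust is involved at all.
    int device_count = 0;
    cudaError_t err = cudaGetDeviceCount(&device_count);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceCount failed: %s\n",
                     cudaGetErrorString(err));
        return 1;
    }
    std::fprintf(stderr, "devices found: %d\n", device_count);

    // Case reported as working: build a device_vector from a small host_vector.
    thrust::host_vector<int> host1(10, 5);
    thrust::device_vector<int> foo3 = host1;
    std::fprintf(stderr, "copy from host_vector ok, size = %zu\n", foo3.size());

    // Case reported as crashing: size-constructed device_vector.
    thrust::device_vector<int> foo1(1);
    std::fprintf(stderr, "size-constructed device_vector ok, size = %zu\n",
                 foo1.size());

    return 0;
}
```

Posting the exact compile command and the full output along with a file like this makes it much easier for someone else to try the same thing on their own setup.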