Titan V & OpenCV - Hanging on cv::cuda::CascadeClassifier::detectMultiScale

I am trying to get my software working on a Titan V but am running into a strange issue. Everything works fine on a GTX 1080 (even in the same computer), but if I swap in the Titan V (and install the appropriate drivers, etc.), my code hangs when it calls cv::cuda::CascadeClassifier::detectMultiScale.

During the hang, nvidia-smi shows that the GPU is being used, and the system feels noticeably sluggish. The call never returns: the app stays stuck there until I kill it.

I have tried CUDA 8 as well as CUDA 9, and OpenCV 3.3.1 as well as OpenCV 3.4. Other GPU ops appear to work correctly, so not everything is broken.

Here is my code:

// https://github.com/opencv/opencv/blob/master/data/haarcascades_cuda/haarcascade_frontalface_alt.xml
cv::Ptr<cv::cuda::CascadeClassifier> faceDetectorCv = cv::cuda::CascadeClassifier::create("haarcascade_frontalface_alt.xml");

// img is some valid image loaded into a Mat
auto cvGpuImg = cv::cuda::GpuMat(img);
cv::cuda::GpuMat gpuRects;

// Hangs at this call!
faceDetectorCv->detectMultiScale(cvGpuImg, gpuRects);

As I said, this works correctly on a GTX card… Attached is nvidia-bug-report.log.gz in case it is helpful.
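
In case it helps anyone reproduce this, a small watchdog can confirm which call fails to return (the upload, the classifier creation, or detectMultiScale). This is only a sketch using the C++ standard library. `finishesWithin` is a made-up helper name, not an OpenCV function, and the 10-second limit in the usage comment is an arbitrary assumption.

```cpp
#include <chrono>
#include <exception>
#include <functional>
#include <future>
#include <memory>
#include <thread>

// Hypothetical helper (not part of OpenCV): runs `step` on a detached thread
// and reports whether it returned within `timeout`.
// A step that throws still counts as "finished"; the exception is stored in
// the promise and not rethrown here.
// If the step never returns, its thread is simply abandoned, so this is only
// suitable for short-lived diagnostic runs.
bool finishesWithin(std::function<void()> step, std::chrono::milliseconds timeout) {
    auto done = std::make_shared<std::promise<void>>();
    std::future<void> finished = done->get_future();

    std::thread([done, step]() {
        try {
            step();
            done->set_value();
        } catch (...) {
            done->set_exception(std::current_exception());
        }
    }).detach();

    // A future obtained from a promise does not block in its destructor,
    // so returning here is safe even if the step is still running.
    return finished.wait_for(timeout) == std::future_status::ready;
}

// Usage sketch with the objects from the snippet above (needs OpenCV built
// with CUDA; the captured objects must outlive the abandoned thread):
//
//   bool ok = finishesWithin(
//       [&] { faceDetectorCv->detectMultiScale(cvGpuImg, gpuRects); },
//       std::chrono::seconds(10));
//   // ok == false would mean the call did not return within 10 seconds.
```

Wrapping each step separately should show whether only detectMultiScale stalls. It does not explain why, and it cannot cancel work already running on the GPU.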

Any help would be appreciated!
nvidia-bug-report.log.gz (71.4 KB)