Docker + Lua + cuDNN 7 + CUDA 9 + Tesla V100 (Volta): extremely slow "require cudnn" line

Hi all, I’ve been trying to get Lua + cuDNN 7 + CUDA 9 working with the Tesla V100 (Volta) GPU, on a p3.2xlarge instance running the Amazon Deep Learning AMI, but have run into a difficult bug. The “require cudnn” line takes a very long time to execute (~10 minutes). Comparable code on a p2.xlarge (Tesla K80) gets through that line almost instantaneously. The rest of the code executes very quickly; only the “require cudnn” line is slow.
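For anyone trying to reproduce this, here is a minimal sketch for timing the line in isolation, inside the same container. It assumes a plain Lua or LuaJIT interpreter; `time_require` is a hypothetical helper written for this post, not part of Torch or the cudnn bindings. It uses wall-clock seconds from `os.time`, which is coarse but enough for a delay measured in minutes.

```lua
-- Hypothetical helper (not part of Torch or cudnn): time a single require call.
-- pcall keeps the script alive and still reports the elapsed time if the load fails.
local function time_require(name)
  local start = os.time()                        -- wall-clock time, 1 s resolution
  local ok, err = pcall(require, name)
  local elapsed = os.difftime(os.time(), start)  -- elapsed wall-clock seconds
  return ok, elapsed, err
end

local ok, elapsed, err = time_require("cudnn")
if ok then
  print(string.format("require 'cudnn' took %.0f s", elapsed))
else
  print(string.format("require 'cudnn' failed after %.0f s: %s", elapsed, tostring(err)))
end
```

Running the same script on the p3.2xlarge and the p2.xlarge should show whether the delay is confined to the require itself rather than to anything that runs after it.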

Does anybody have any insight? I’m also going to post this issue in another section of the forum, as this one does not seem to get much foot traffic. I hope that’s okay.

A GitHub issue with the Docker build and other info is posted here: https://github.com/torch/torch7/issues/1193