I’m able to reproduce the invalid device function error. I happened to be using CUDA 11.4; it appears you are as well.
I note that if I drop the -dlto switches from the first two lines of your compilation sequence, the error disappears.
My suggestions:
- retest with the latest available CUDA toolchain
- if the problem persists, file a bug
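
When retesting, it may help to have a small check that reports the launch status explicitly, so it is clear whether a given build still hits the error. The following is only a minimal sketch, not your code: `probe_kernel` is a hypothetical stand-in for whichever kernel fails in your application, and you would build it with the same compile sequence you are testing (with and without -dlto).

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in kernel; substitute the kernel that fails in your build.
__global__ void probe_kernel(int *out) { *out = 42; }

int main() {
    int *d_out = nullptr;
    cudaError_t err = cudaMalloc(&d_out, sizeof(int));
    if (err != cudaSuccess) {
        printf("cudaMalloc failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    probe_kernel<<<1, 1>>>(d_out);

    // Launch-time failures such as cudaErrorInvalidDeviceFunction
    // are reported by cudaGetLastError() right after the launch.
    err = cudaGetLastError();
    if (err == cudaSuccess) {
        // Errors raised while the kernel runs show up at synchronization.
        err = cudaDeviceSynchronize();
    }
    printf("kernel status: %s\n", cudaGetErrorString(err));

    cudaFree(d_out);
    return err == cudaSuccess ? 0 : 1;
}
```

If the status changes between the two builds on the newer toolchain as well, that comparison is a useful thing to include in the bug report.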