Hello,
I am trying to pass a Tesla K40m through to a virtual machine (qemu-kvm hypervisor) using vfio.
I downloaded all the drivers and CUDA libraries, and I compiled all the sample files successfully. However, when I run them, they produce their output but never exit =(. For example, here is the run log of deviceQuery:
deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "Tesla K40m"
  //INFO ABOUT IT
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 7.0, CUDA Runtime Version = 7.0, NumDevs = 1, Device0 = Tesla K40m
Result = PASS
And then it just hangs; the only option is Ctrl+C. I also installed everything on the host, and there it finishes successfully without any problems. Any help will be appreciated.
dmesg on the VM shows only:
[ 1475.225692] nvidia 0000:00:08.0: irq 51 for MSI/MSI-X
dmesg on the host:
kernel: [ 2897.503162] vfio-pci 0000:02:00.0: irq 324 for MSI/MSI-X
Moreover, any access to the PCI device seems to take too long. For example, I ran nvidia-smi both in the VM and on the host and traced it with strace. Here is the output from the VM:
+------------------------------------------------------+
| NVIDIA-SMI 346.59     Driver Version: 346.59         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K40m          Off  | 0000:00:06.0     Off |                    0 |
| N/A   54C    P0    64W / 235W |     55MiB / 11519MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
% time     seconds  usecs/call     calls    errors syscall
 98.67    4.688353      275785        17           open
  1.08    0.051337        3020        17           close
  0.23    0.010722         104       103           ioctl
  0.01    0.000261          22        12           read
  0.00    0.000235           9        26           mmap
  0.00    0.000177          15        12           write
  0.00    0.000127          16         8           munmap
  0.00    0.000107          11        10           mprotect
  0.00    0.000094          19         5         1 stat
  0.00    0.000070           5        15           fstat
  0.00    0.000055           8         7         7 access
  0.00    0.000030          30         1           execve
  0.00    0.000018           5         4           fcntl
  0.00    0.000015           8         2         1 futex
  0.00    0.000013           4         3           brk
  0.00    0.000007           4         2           rt_sigaction
  0.00    0.000006           6         1           getrlimit
  0.00    0.000005           5         1           lseek
  0.00    0.000004           4         1           set_robust_list
  0.00    0.000003           3         1           rt_sigprocmask
  0.00    0.000003           3         1           arch_prctl
  0.00    0.000003           3         1           set_tid_address
100.00    4.751645                   250         9 total
Here is the output when I run nvidia-smi on the host (I detached the GPU from the VM beforehand):
+------------------------------------------------------+
| NVIDIA-SMI 346.59     Driver Version: 346.59         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K40m          Off  | 0000:02:00.0     Off |                    0 |
| N/A   48C    P0    64W / 235W |     55MiB / 11519MiB |     60%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
% time     seconds  usecs/call     calls    errors syscall
 82.25    0.571723       33631        17           open
 15.70    0.109104        6418        17           close
  1.76    0.012264         119       103           ioctl
  0.10    0.000664          44        15           read
  0.05    0.000370          14        26           mmap
  0.02    0.000155          16        10           mprotect
  0.02    0.000152          22         7         7 access
  0.02    0.000134           9        15           fstat
  0.01    0.000100          13         8           munmap
  0.01    0.000078          26         3           brk
  0.01    0.000070           6        12           write
  0.01    0.000069          17         4           fcntl
  0.01    0.000062          62         1           execve
  0.00    0.000029           6         5         1 stat
  0.00    0.000021          11         2           rt_sigaction
  0.00    0.000021          11         2         1 futex
  0.00    0.000010          10         1           rt_sigprocmask
  0.00    0.000010          10         1           getrlimit
  0.00    0.000010          10         1           arch_prctl
  0.00    0.000010          10         1           set_tid_address
  0.00    0.000009           9         1           set_robust_list
  0.00    0.000000           0         1           lseek
100.00    0.695065                   253         9 total
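To put numbers on the difference, the two summaries can be compared row by row. This is a minimal Python sketch with the totals copied by hand from the two tables above (only the three most expensive syscalls); the names vm, host, slowdown and per_call_ms are just for illustration:

```python
# (total seconds, number of calls) copied from the two strace summaries above.
vm = {"open": (4.688353, 17), "close": (0.051337, 17), "ioctl": (0.010722, 103)}
host = {"open": (0.571723, 17), "close": (0.109104, 17), "ioctl": (0.012264, 103)}


def slowdown(name):
    """Return total VM time divided by total host time for one syscall."""
    return vm[name][0] / host[name][0]


def per_call_ms(table, name):
    """Return the average milliseconds per call for one syscall."""
    seconds, calls = table[name]
    return seconds / calls * 1000.0


for name in vm:
    print(f"{name:6s} VM {per_call_ms(vm, name):8.3f} ms/call  "
          f"host {per_call_ms(host, name):8.3f} ms/call  "
          f"ratio {slowdown(name):5.2f}x")
```

Only open stands out here; close and ioctl are no slower in the VM than on the host.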
As you can see, “open” is much slower in the VM: the same 17 calls take about 4.69 s there versus 0.57 s on the host, roughly 8 times longer. I have no idea why.
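One way to check whether the delay is in the open call itself, independent of nvidia-smi, would be to time it directly. This is a rough sketch, assuming the driver's device nodes are /dev/nvidiactl and /dev/nvidia0 (the usual names on Linux, but not confirmed by the output above); time_open is a hypothetical helper:

```python
import os
import time


def time_open(path, repeats=5):
    """Return the seconds spent in each of `repeats` os.open() calls on path."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fd = os.open(path, os.O_RDWR)
        timings.append(time.perf_counter() - start)
        os.close(fd)  # closed outside the timed region
    return timings


if __name__ == "__main__":
    # Assumed device node names; adjust if the driver created different ones.
    for node in ("/dev/nvidiactl", "/dev/nvidia0"):
        try:
            worst = max(time_open(node))
        except OSError as err:
            print(f"{node}: could not open ({err})")
            continue
        print(f"{node}: slowest open took {worst * 1000:.1f} ms")
```

Running the same script in the VM and on the host should show whether every open of the device node is slow or only the first one.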
Can anybody help me? Sorry for so much text