[SOLVED] How to deploy a CUDA application to another machine?

Hello!

I have one machine on which I develop applications with the CUDA Toolkit.
It currently has the following environment:

$ lspci | grep -i nvidia
01:00.0 VGA compatible controller: NVIDIA Corporation GF108 [GeForce GT 430] (rev a1)
01:00.1 Audio device: NVIDIA Corporation GF108 High Definition Audio Controller (rev a1)
$ cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module  390.48  Thu Mar 22 00:42:57 PDT 2018
GCC version:  gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.9)
$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Tue_Jan_10_13:22:03_CST_2017
Cuda compilation tools, release 8.0, V8.0.61

To verify the environment after installation, I built the deviceQuery sample and ran it:

$ ./deviceQuery
./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce GT 430"
  CUDA Driver Version / Runtime Version          9.1 / 8.0
  CUDA Capability Major/Minor version number:    2.1
  Total amount of global memory:                 1980 MBytes (2076508160 bytes)
  ( 2) Multiprocessors, ( 48) CUDA Cores/MP:     96 CUDA Cores
  GPU Max Clock rate:                            1460 MHz (1.46 GHz)
  Memory Clock rate:                             667 Mhz
  Memory Bus Width:                              128-bit
  L2 Cache Size:                                 131072 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65535), 3D=(2048, 2048, 2048)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 32768
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  1536
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (65535, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 1 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.1, CUDA Runtime Version = 8.0, NumDevs = 1, Device0 = GeForce GT 430
Result = PASS

Then I copied the deviceQuery binary to another machine with the following environment:

$ lspci | grep -i nvidia
01:00.0 VGA compatible controller: NVIDIA Corporation Device 1b06 (rev a1)
01:00.1 Audio device: NVIDIA Corporation Device 10ef (rev a1)
02:00.0 VGA compatible controller: NVIDIA Corporation Device 1b06 (rev a1)
02:00.1 Audio device: NVIDIA Corporation Device 10ef (rev a1)
$ cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module  375.82  Wed Jul 19 21:16:49 PDT 2017
GCC version:  gcc version 6.3.0 20170516 (Debian 6.3.0-18+deb9u1)

and ran deviceQuery there, but got this error:

$ ./deviceQuery
./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 35
-> CUDA driver version is insufficient for CUDA runtime version
Result = FAIL
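
Since the error message points at the driver version, one quick check is to compare the kernel module version from /proc/driver/nvidia/version with the minimum driver version that NVIDIA's release notes list for the CUDA release in use. The following is a minimal sketch of that comparison, not part of the CUDA samples; the function names parseNvrmVersion and driverAtLeast are made up for illustration, and it only looks at the first two version components.

```cpp
#include <regex>
#include <string>

// Extracts the driver version from the "NVRM version:" line of
// /proc/driver/nvidia/version, e.g. "... Kernel Module  375.82  Wed ...".
// Returns false if the line does not contain such a version.
// (Hypothetical helper; only the first two components are parsed.)
bool parseNvrmVersion(const std::string& line, int& verMajor, int& verMinor) {
    static const std::regex pattern(R"(Kernel Module\s+(\d+)\.(\d+))");
    std::smatch match;
    if (!std::regex_search(line, match, pattern)) {
        return false;
    }
    verMajor = std::stoi(match[1].str());
    verMinor = std::stoi(match[2].str());
    return true;
}

// True if the installed driver version is at least the required one.
// Components are compared as integers, so 384.111 counts as newer than 384.81.
bool driverAtLeast(int haveMajor, int haveMinor, int needMajor, int needMinor) {
    if (haveMajor != needMajor) {
        return haveMajor > needMajor;
    }
    return haveMinor >= needMinor;
}
```

For the 375.82 module shown above, driverAtLeast(375, 82, 375, 26) returns true. The 375.26 minimum is an assumption here (it is the value NVIDIA's release notes are believed to list for CUDA 8.0 GA2 on Linux and should be verified there). If it holds, the kernel module version alone may not explain the error, and a mismatched or missing user-space driver library would be the next thing to check.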

I also copied the library folder (lib64) from /usr/local/cuda-8.0/ next to the deviceQuery binary and ran the following command:

LD_LIBRARY_PATH=lib64:$LD_LIBRARY_PATH ./deviceQuery

but I still got the same error.

So, what do I need to do to run CUDA applications on other machines, not only in the development environment?

I modified deviceQuery.cpp slightly to print the device count:

. . .
    int deviceCount = 0;
    cudaError_t error_id = cudaGetDeviceCount(&deviceCount);

    printf("Device count = %d\n\n", deviceCount);  // added from my side

    if (error_id != cudaSuccess)
    {
        printf("cudaGetDeviceCount returned %d\n-> %s\n", (int)error_id, cudaGetErrorString(error_id));
        printf("Result = FAIL\n");
        exit(EXIT_FAILURE);
    }

. . .

On my machine I get “Device count = 1”, but on the other machine I get “Device count = 0”, so that is where the problem shows up.
But I don't understand why the device count is 0, since the “lspci | grep -i nvidia” command from my previous post shows two devices.
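
One way to narrow down a device count of 0 is to look at the two integers behind the "CUDA Driver Version / Runtime Version" line, which the runtime API reports through cudaDriverGetVersion and cudaRuntimeGetVersion. Both are encoded as 1000 * major + 10 * minor, and the runtime documentation says the driver version is reported as 0 when no driver is installed. The sketch below only decodes and compares such values; it does not call CUDA itself, and the names DriverCheck, formatCudaVersion and checkDriver are illustrative, not from the sample.

```cpp
#include <string>

// Possible outcomes of comparing the driver and runtime versions.
enum class DriverCheck { NoDriverLibrary, DriverTooOld, Ok };

// Turns an encoded CUDA version (1000 * major + 10 * minor) into "major.minor",
// using the same arithmetic as the deviceQuery sample, e.g. 9010 gives "9.1".
std::string formatCudaVersion(int encoded) {
    const int majorPart = encoded / 1000;
    const int minorPart = (encoded % 100) / 10;
    return std::to_string(majorPart) + "." + std::to_string(minorPart);
}

// Classifies a driver/runtime pair. The inputs are assumed to be the values
// written by cudaDriverGetVersion and cudaRuntimeGetVersion.
DriverCheck checkDriver(int driverVersion, int runtimeVersion) {
    if (driverVersion == 0) {
        // Reported when no usable driver is installed.
        return DriverCheck::NoDriverLibrary;
    }
    if (driverVersion < runtimeVersion) {
        return DriverCheck::DriverTooOld;
    }
    return DriverCheck::Ok;
}
```

With the values from the development machine (9010 and 8000), checkDriver returns DriverCheck::Ok. Printing both integers on the failing machine next to the device count would show whether the driver library was found at all or is simply older than the runtime.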

So I ran the following commands:

$ cat /proc/driver/nvidia/gpus/0000:01:00.0/information
Model: 		 GeForce GTX 1080 Ti
IRQ:   		 136
GPU UUID: 	 GPU-????????-????-????-????-????????????
Video BIOS: 	 ??.??.??.??.??
Bus Type: 	 PCIe
DMA Size: 	 47 bits
DMA Mask: 	 0x7fffffffffff
Bus Location: 	 0000:01:00.0
Device Minor: 	 0

$ cat /proc/driver/nvidia/gpus/0000:02:00.0/information
Model: 		 GeForce GTX 1080 Ti
IRQ:   		 17
GPU UUID: 	 GPU-????????-????-????-????-????????????
Video BIOS: 	 ??.??.??.??.??
Bus Type: 	 PCIe
DMA Size: 	 36 bits
DMA Mask: 	 0xfffffffff
Bus Location: 	 0000:02:00.0
Device Minor: 	 1

Is this correct? Maybe the devices are not installed correctly? If so, how can I fix this?

And a few more commands:

$ lspci -vnn | grep VGA -A 12
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:1b06] (rev a1) (prog-if 00 [VGA controller])
	Subsystem: ASUSTeK Computer Inc. Device [1043:85f1]
	Flags: bus master, fast devsel, latency 0, IRQ 16
	Memory at ee000000 (32-bit, non-prefetchable) 
	Memory at d0000000 (64-bit, prefetchable) 
	Memory at e0000000 (64-bit, prefetchable) 
	I/O ports at e000 
	[virtual] Expansion ROM at 000c0000 [disabled] 
	Capabilities: <access denied>
	Kernel driver in use: nvidia
	Kernel modules: nvidia

01:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:10ef] (rev a1)
--
02:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:1b06] (rev a1) (prog-if 00 [VGA controller])
	Subsystem: ASUSTeK Computer Inc. Device [1043:85f1]
	Flags: bus master, fast devsel, latency 0, IRQ 17
	Memory at ec000000 (32-bit, non-prefetchable) 
	Memory at b0000000 (64-bit, prefetchable) 
	Memory at c0000000 (64-bit, prefetchable) 
	I/O ports at d000 
	Expansion ROM at ed000000 [disabled] 
	Capabilities: <access denied>
	Kernel driver in use: nvidia
	Kernel modules: nvidia

02:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:10ef] (rev a1)
$ egrep -i " connected|card detect|primary dev" /var/log/Xorg.0.log
[   414.560] (EE) NVIDIA(0): Failed to assign any connected display devices to X screen 0.

Here is the content of the Xorg.0.log file:

$ cat /var/log/Xorg.0.log
[   414.007] (--) Log file renamed from "/var/log/Xorg.pid-17094.log" to "/var/log/Xorg.0.log"
[   414.007] 
X.Org X Server 1.19.2
Release Date: 2017-03-02
[   414.007] X Protocol Version 11, Revision 0
[   414.007] Build Operating System: Linux 4.9.0-4-amd64 x86_64 Debian
[   414.007] Current Operating System: Linux debian 4.9.0-6-amd64 #1 SMP Debian 4.9.82-1+deb9u3 (2018-03-02) x86_64
[   414.007] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-4.9.0-6-amd64 root=UUID=a4a6019a-5c26-42db-a251-89e926b57f26 ro quiet
[   414.007] Build Date: 16 October 2017  08:19:45AM
[   414.007] xorg-server 2:1.19.2-1+deb9u2 (https://www.debian.org/support) 
[   414.007] Current version of pixman: 0.34.0
[   414.007] 	Before reporting problems, check http://wiki.x.org
	to make sure that you have the latest version.
[   414.007] Markers: (--) probed, (**) from config file, (==) default setting,
	(++) from command line, (!!) notice, (II) informational,
	(WW) warning, (EE) error, (NI) not implemented, (??) unknown.
[   414.007] (==) Log file: "/var/log/Xorg.0.log", Time: Thu Apr 12 09:20:36 2018
[   414.008] (==) Using system config directory "/usr/share/X11/xorg.conf.d"
[   414.008] (==) No Layout section.  Using the first Screen section.
[   414.008] (==) No screen section available. Using defaults.
[   414.008] (**) |-->Screen "Default Screen Section" (0)
[   414.008] (**) |   |-->Monitor "<default monitor>"
[   414.008] (==) No monitor specified for screen "Default Screen Section".
	Using a default monitor configuration.
[   414.008] (==) Automatically adding devices
[   414.008] (==) Automatically enabling devices
[   414.008] (==) Automatically adding GPU devices
[   414.008] (==) Max clients allowed: 256, resource mask: 0x1fffff
[   414.008] (WW) The directory "/usr/share/fonts/X11/cyrillic" does not exist.
[   414.008] 	Entry deleted from font path.
[   414.008] (==) FontPath set to:
	/usr/share/fonts/X11/misc,
	/usr/share/fonts/X11/100dpi/:unscaled,
	/usr/share/fonts/X11/75dpi/:unscaled,
	/usr/share/fonts/X11/Type1,
	/usr/share/fonts/X11/100dpi,
	/usr/share/fonts/X11/75dpi,
	built-ins
[   414.008] (==) ModulePath set to "/usr/lib/xorg/modules"
[   414.008] (II) The server relies on udev to provide the list of input devices.
	If no devices become available, reconfigure udev or disable AutoAddDevices.
[   414.008] (II) Loader magic: 0x55a89453fe00
[   414.008] (II) Module ABI versions:
[   414.008] 	X.Org ANSI C Emulation: 0.4
[   414.008] 	X.Org Video Driver: 23.0
[   414.008] 	X.Org XInput driver : 24.1
[   414.008] 	X.Org Server Extension : 10.0
[   414.008] (++) using VT number 1

[   414.009] (II) systemd-logind: took control of session /org/freedesktop/login1/session/c504
[   414.009] (II) xfree86: Adding drm device (/dev/dri/card0)
[   414.010] (II) systemd-logind: got fd for /dev/dri/card0 226:0 fd 12 paused 0
[   414.010] (II) xfree86: Adding drm device (/dev/dri/card1)
[   414.010] (II) systemd-logind: got fd for /dev/dri/card1 226:1 fd 13 paused 0
[   414.011] (--) PCI:*(0:1:0:0) 10de:1b06:1043:85f1 rev 161, Mem @ 0xee000000/16777216, 0xd0000000/268435456, 0xe0000000/33554432, I/O @ 0x0000e000/128, BIOS @ 0x????????/131072
[   414.011] (--) PCI: (0:2:0:0) 10de:1b06:1043:85f1 rev 161, Mem @ 0xec000000/16777216, 0xb0000000/268435456, 0xc0000000/33554432, I/O @ 0x0000d000/128, BIOS @ 0x????????/524288
[   414.011] (II) LoadModule: "glx"
[   414.011] (II) Loading /usr/lib/xorg/modules/linux/libglx.so
[   414.013] (II) Module glx: vendor="NVIDIA Corporation"
[   414.013] 	compiled for 4.0.2, module version = 1.0.0
[   414.013] 	Module class: X.Org Server Extension
[   414.013] (II) NVIDIA GLX Module  375.82  Wed Jul 19 20:30:13 PDT 2017
[   414.013] (II) Applying OutputClass "nvidia" to /dev/dri/card0
[   414.013] 	loading driver: nvidia
[   414.013] (II) Applying OutputClass "nvidia" to /dev/dri/card1
[   414.013] 	loading driver: nvidia
[   414.013] (==) Matched nvidia as autoconfigured driver 0
[   414.013] (==) Matched nouveau as autoconfigured driver 1
[   414.013] (==) Matched nv as autoconfigured driver 2
[   414.013] (==) Matched nvidia as autoconfigured driver 3
[   414.013] (==) Matched nouveau as autoconfigured driver 4
[   414.013] (==) Matched nv as autoconfigured driver 5
[   414.013] (==) Matched nouveau as autoconfigured driver 6
[   414.013] (==) Matched nv as autoconfigured driver 7
[   414.013] (==) Matched modesetting as autoconfigured driver 8
[   414.013] (==) Matched fbdev as autoconfigured driver 9
[   414.013] (==) Matched vesa as autoconfigured driver 10
[   414.013] (==) Assigned the driver to the xf86ConfigLayout
[   414.013] (II) LoadModule: "nvidia"
[   414.013] (II) Loading /usr/lib/xorg/modules/drivers/nvidia_drv.so
[   414.013] (II) Module nvidia: vendor="NVIDIA Corporation"
[   414.013] 	compiled for 4.0.2, module version = 1.0.0
[   414.013] 	Module class: X.Org Video Driver
[   414.013] (II) LoadModule: "nouveau"
[   414.013] (II) Loading /usr/lib/xorg/modules/drivers/nouveau_drv.so
[   414.014] (II) Module nouveau: vendor="X.Org Foundation"
[   414.014] 	compiled for 1.19.3, module version = 1.0.13
[   414.014] 	Module class: X.Org Video Driver
[   414.014] 	ABI class: X.Org Video Driver, version 23.0
[   414.014] (II) LoadModule: "nv"
[   414.014] (WW) Warning, couldn't open module nv
[   414.014] (II) UnloadModule: "nv"
[   414.014] (II) Unloading nv
[   414.014] (EE) Failed to load module "nv" (module does not exist, 0)
[   414.014] (II) LoadModule: "modesetting"
[   414.014] (II) Loading /usr/lib/xorg/modules/drivers/modesetting_drv.so
[   414.014] (II) Module modesetting: vendor="X.Org Foundation"
[   414.014] 	compiled for 1.19.2, module version = 1.19.2
[   414.014] 	Module class: X.Org Video Driver
[   414.014] 	ABI class: X.Org Video Driver, version 23.0
[   414.014] (II) LoadModule: "fbdev"
[   414.014] (II) Loading /usr/lib/xorg/modules/drivers/fbdev_drv.so
[   414.014] (II) Module fbdev: vendor="X.Org Foundation"
[   414.014] 	compiled for 1.19.0, module version = 0.4.4
[   414.014] 	Module class: X.Org Video Driver
[   414.014] 	ABI class: X.Org Video Driver, version 23.0
[   414.014] (II) LoadModule: "vesa"
[   414.014] (II) Loading /usr/lib/xorg/modules/drivers/vesa_drv.so
[   414.014] (II) Module vesa: vendor="X.Org Foundation"
[   414.014] 	compiled for 1.19.0, module version = 2.3.4
[   414.014] 	Module class: X.Org Video Driver
[   414.014] 	ABI class: X.Org Video Driver, version 23.0
[   414.014] (II) NVIDIA dlloader X Driver  375.82  Wed Jul 19 20:05:50 PDT 2017
[   414.014] (II) NVIDIA Unified Driver for all Supported NVIDIA GPUs
[   414.014] (II) NOUVEAU driver Date:   Tue Sep 20 00:31:06 2016 -0400
[   414.014] (II) NOUVEAU driver for NVIDIA chipset families :
[   414.014] 	RIVA TNT        (NV04)
[   414.014] 	RIVA TNT2       (NV05)
[   414.014] 	GeForce 256     (NV10)
[   414.014] 	GeForce 2       (NV11, NV15)
[   414.014] 	GeForce 4MX     (NV17, NV18)
[   414.014] 	GeForce 3       (NV20)
[   414.014] 	GeForce 4Ti     (NV25, NV28)
[   414.014] 	GeForce FX      (NV3x)
[   414.014] 	GeForce 6       (NV4x)
[   414.014] 	GeForce 7       (G7x)
[   414.014] 	GeForce 8       (G8x)
[   414.014] 	GeForce GTX 200 (NVA0)
[   414.014] 	GeForce GTX 400 (NVC0)
[   414.014] (II) modesetting: Driver for Modesetting Kernel Drivers: kms
[   414.014] (II) FBDEV: driver for framebuffer: fbdev
[   414.014] (II) VESA: driver for VESA chipsets: vesa
[   414.014] (II) systemd-logind: releasing fd for 226:0
[   414.014] (II) systemd-logind: releasing fd for 226:1
[   414.015] (II) Loading sub module "fb"
[   414.015] (II) LoadModule: "fb"
[   414.015] (II) Loading /usr/lib/xorg/modules/libfb.so
[   414.015] (II) Module fb: vendor="X.Org Foundation"
[   414.015] 	compiled for 1.19.2, module version = 1.0.0
[   414.015] 	ABI class: X.Org ANSI C Emulation, version 0.4
[   414.015] (II) Loading sub module "wfb"
[   414.015] (II) LoadModule: "wfb"
[   414.015] (II) Loading /usr/lib/xorg/modules/libwfb.so
[   414.015] (II) Module wfb: vendor="X.Org Foundation"
[   414.015] 	compiled for 1.19.2, module version = 1.0.0
[   414.015] 	ABI class: X.Org ANSI C Emulation, version 0.4
[   414.015] (II) Loading sub module "ramdac"
[   414.015] (II) LoadModule: "ramdac"
[   414.015] (II) Module "ramdac" already built-in
[   414.015] (EE) [drm] Failed to open DRM device for (null): -22
[   414.015] (WW) Falling back to old probe method for modesetting
[   414.015] (WW) Falling back to old probe method for fbdev
[   414.015] (II) Loading sub module "fbdevhw"
[   414.015] (II) LoadModule: "fbdevhw"
[   414.015] (II) Loading /usr/lib/xorg/modules/libfbdevhw.so
[   414.015] (II) Module fbdevhw: vendor="X.Org Foundation"
[   414.015] 	compiled for 1.19.2, module version = 0.0.2
[   414.015] 	ABI class: X.Org Video Driver, version 23.0
[   414.015] (WW) Falling back to old probe method for vesa
[   414.015] (II) NVIDIA(0): Creating default Display subsection in Screen section
	"Default Screen Section" for depth/fbbpp 24/32
[   414.015] (==) NVIDIA(0): Depth 24, (==) framebuffer bpp 32
[   414.015] (==) NVIDIA(0): RGB weight 888
[   414.015] (==) NVIDIA(0): Default visual is TrueColor
[   414.015] (==) NVIDIA(0): Using gamma correction (1.0, 1.0, 1.0)
[   414.015] (**) NVIDIA(0): Enabling 2D acceleration
[   414.559] (--) NVIDIA(0): Valid display device(s) on GPU-0 at PCI:1:0:0
[   414.559] (--) NVIDIA(0):     DFP-0 (boot)
[   414.559] (--) NVIDIA(0):     DFP-1
[   414.559] (--) NVIDIA(0):     DFP-2
[   414.559] (--) NVIDIA(0):     DFP-3
[   414.559] (--) NVIDIA(0):     DFP-4
[   414.559] (--) NVIDIA(0):     DFP-5
[   414.559] (--) NVIDIA(0):     DFP-6
[   414.559] (II) NVIDIA(0): NVIDIA GPU GeForce GTX 1080 Ti (GP102-A) at PCI:1:0:0 (GPU-0)
[   414.559] (--) NVIDIA(0): Memory: 11534336 kBytes
[   414.559] (--) NVIDIA(0): VideoBIOS: 86.02.39.00.54
[   414.559] (II) NVIDIA(0): Detected PCI Express Link width: 16X
[   414.559] (--) NVIDIA(GPU-0): DFP-0: disconnected
[   414.559] (--) NVIDIA(GPU-0): DFP-0: Internal TMDS
[   414.560] (--) NVIDIA(GPU-0): DFP-0: 330.0 MHz maximum pixel clock
[   414.560] (--) NVIDIA(GPU-0): 
[   414.560] (--) NVIDIA(GPU-0): DFP-1: disconnected
[   414.560] (--) NVIDIA(GPU-0): DFP-1: Internal TMDS
[   414.560] (--) NVIDIA(GPU-0): DFP-1: 330.0 MHz maximum pixel clock
[   414.560] (--) NVIDIA(GPU-0): 
[   414.560] (--) NVIDIA(GPU-0): DFP-2: disconnected
[   414.560] (--) NVIDIA(GPU-0): DFP-2: Internal TMDS
[   414.560] (--) NVIDIA(GPU-0): DFP-2: 330.0 MHz maximum pixel clock
[   414.560] (--) NVIDIA(GPU-0): 
[   414.560] (--) NVIDIA(GPU-0): DFP-3: disconnected
[   414.560] (--) NVIDIA(GPU-0): DFP-3: Internal DisplayPort
[   414.560] (--) NVIDIA(GPU-0): DFP-3: 1440.0 MHz maximum pixel clock
[   414.560] (--) NVIDIA(GPU-0): 
[   414.560] (--) NVIDIA(GPU-0): DFP-4: disconnected
[   414.560] (--) NVIDIA(GPU-0): DFP-4: Internal TMDS
[   414.560] (--) NVIDIA(GPU-0): DFP-4: 330.0 MHz maximum pixel clock
[   414.560] (--) NVIDIA(GPU-0): 
[   414.560] (--) NVIDIA(GPU-0): DFP-5: disconnected
[   414.560] (--) NVIDIA(GPU-0): DFP-5: Internal DisplayPort
[   414.560] (--) NVIDIA(GPU-0): DFP-5: 1440.0 MHz maximum pixel clock
[   414.560] (--) NVIDIA(GPU-0): 
[   414.560] (--) NVIDIA(GPU-0): DFP-6: disconnected
[   414.560] (--) NVIDIA(GPU-0): DFP-6: Internal TMDS
[   414.560] (--) NVIDIA(GPU-0): DFP-6: 330.0 MHz maximum pixel clock
[   414.560] (--) NVIDIA(GPU-0): 
[   414.560] (EE) NVIDIA(0): Failed to assign any connected display devices to X screen 0. 
[   414.560] (EE) NVIDIA(0):     Set AllowEmptyInitialConfiguration if you want the server
[   414.560] (EE) NVIDIA(0):     to start anyway
[   414.560] (EE) NVIDIA(0): Failing initialization of X screen 0
[   415.355] (II) UnloadModule: "nvidia"
[   415.355] (II) UnloadSubModule: "wfb"
[   415.355] (II) UnloadSubModule: "fb"
[   415.355] (EE) Screen(s) found, but none have a usable configuration.
[   415.355] (EE) 
Fatal server error:
[   415.355] (EE) no screens found(EE) 
[   415.355] (EE) 
Please consult the The X.Org Foundation support 
	 at http://wiki.x.org
 for help. 
[   415.355] (EE) Please also check the log file at "/var/log/Xorg.0.log" for additional information.
[   415.355] (EE) 
[   415.357] (EE) Server terminated with error (1). Closing log file.

And the nvidia-smi output:

$ nvidia-smi
Fri Apr 13 05:18:18 2018       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.82                 Driver Version: 375.82                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  Off  | 0000:01:00.0     Off |                  N/A |
|  0%   34C    P8     9W / 250W |     10MiB / 11170MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 108...  Off  | 0000:02:00.0     Off |                  N/A |
|  0%   30C    P8    11W / 250W |      9MiB / 11172MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0       751    G   /usr/lib/xorg/Xorg                               8MiB |
|    1       751    G   /usr/lib/xorg/Xorg                               7MiB |
+-----------------------------------------------------------------------------+

Sorry, I forgot to mention that the second machine is a remote machine that I access over ssh :-)

Here is the content of the xorg.conf file:

$ cat xorg.conf
# nvidia-xconfig: X configuration file generated by nvidia-xconfig
# nvidia-xconfig:  version 384.111  (buildd@debian)  Sun Feb 25 23:27:00 UTC 2018


Section "ServerLayout"
    Identifier     "Layout0"
    Screen      0  "Screen0"
    Screen      1  "Screen1" RightOf "Screen0"
    InputDevice    "Keyboard0" "CoreKeyboard"
    InputDevice    "Mouse0" "CorePointer"
EndSection

Section "Files"
EndSection

Section "InputDevice"

    # generated from default
    Identifier     "Mouse0"
    Driver         "mouse"
    Option         "Protocol" "auto"
    Option         "Device" "/dev/psaux"
    Option         "Emulate3Buttons" "no"
    Option         "ZAxisMapping" "4 5"
EndSection

Section "InputDevice"

    # generated from default
    Identifier     "Keyboard0"
    Driver         "kbd"
EndSection

Section "Monitor"
    Identifier     "Monitor0"
    VendorName     "Unknown"
    ModelName      "Unknown"
    HorizSync       28.0 - 33.0
    VertRefresh     43.0 - 72.0
    Option         "DPMS"
EndSection

Section "Monitor"
    Identifier     "Monitor1"
    VendorName     "Unknown"
    ModelName      "Unknown"
    HorizSync       28.0 - 33.0
    VertRefresh     43.0 - 72.0
    Option         "DPMS"
EndSection

Section "Device"
    Identifier     "Device0"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BoardName      "GeForce GTX 1080 Ti"
    BusID          "PCI:1:0:0"
EndSection

Section "Device"
    Identifier     "Device1"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BoardName      "GeForce GTX 1080 Ti"
    BusID          "PCI:2:0:0"
EndSection

Section "Screen"
    Identifier     "Screen0"
    Device         "Device0"
    Monitor        "Monitor0"
    DefaultDepth    24
    Option         "AllowEmptyInitialConfiguration" "True"
    Option         "Coolbits" "28"
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection

Section "Screen"
    Identifier     "Screen1"
    Device         "Device1"
    Monitor        "Monitor1"
    DefaultDepth    24
    Option         "AllowEmptyInitialConfiguration" "True"
    Option         "Coolbits" "28"
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection

I removed the driver that I had installed from the Debian repository and installed the one from NVIDIA (version 390.48). I also reconfigured the X server. Now the CUDA samples work.