I am trying to run a CUDA application on an AWS g3.4xlarge instance running CentOS 6
and get the errors shown in the kernel log below.
The installed driver version is 390.116 (to support CUDA 9.1).
What might the problem be?
A bug report can be downloaded from https://drive.google.com/file/d/1Hkuw1IhuWKwDZxvcuYbsPU49jKalYYkb/view?usp=sharing
[ 65.187695] EXT4-fs (xvda1): mounted filesystem with ordered data mode. Opts:
[ 229.531924] [TTM] Finalizing pool allocator
[ 229.531935] [TTM] Finalizing DMA pool allocator
[ 229.531982] [TTM] Zone kernel: Used memory at exit: 0 kiB
[ 229.531986] [TTM] Zone dma32: Used memory at exit: 0 kiB
[ 229.532425] nouveau 0000:00:1e.0: priv: HUB0: 10ecc0 ffffffff (1a40822c)
[ 229.535493] [drm] Module unloaded
[ 229.619820] ACPI: WMI: Mapper unloaded
[ 256.272426] nvidia-nvlink: Nvlink Core is being initialized, major device number 246
[ 256.273335] vgaarb: device changed decodes: PCI:0000:00:1e.0,olddecodes=io+mem,decodes=none:owns=io+mem
[ 256.273337] vgaarb: transferring owner from PCI:0000:00:1e.0 to PCI:0000:00:02.0
[ 256.273518] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 390.116 Sun Jan 27 07:21:36 PST 2019 (using threaded interrupts)
[ 256.287070] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 390.116 Sun Jan 27 06:30:32 PST 2019
[ 256.319857] nvidia-uvm: Loaded the UVM driver in 8 mode, major device number 245
[ 283.407182] nvidia 0000:00:1e.0: irq 48 for MSI/MSI-X
[ 284.006075] do_IRQ: 0.89 No irq handler for vector (irq -1)
[ 288.438542] do_IRQ: 0.89 No irq handler for vector (irq -1)
[ 288.506835] NVRM: RmInitAdapter failed! (0x12:0x45:1920)
[ 288.508268] NVRM: rm_init_adapter failed for device bearing minor number 0
[ 317.882509] nvidia 0000:00:1e.0: irq 48 for MSI/MSI-X
[ 318.677856] do_IRQ: 0.89 No irq handler for vector (irq -1)
[ 322.727928] do_IRQ: 0.89 No irq handler for vector (irq -1)
[ 322.997075] NVRM: RmInitAdapter failed! (0x12:0x45:1920)
[ 322.998506] NVRM: rm_init_adapter failed for device bearing minor number 0
[ 323.001000] nvidia 0000:00:1e.0: irq 48 for MSI/MSI-X
[ 323.795884] do_IRQ: 0.89 No irq handler for vector (irq -1)
[ 327.845946] do_IRQ: 0.89 No irq handler for vector (irq -1)
[ 328.115908] NVRM: RmInitAdapter failed! (0x12:0x45:1920)
[ 328.117342] NVRM: rm_init_adapter failed for device bearing minor number 0
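
To make the repeating pattern in the log above easier to see, here is a minimal sketch that summarizes a dmesg dump. The helper name `summarize_nvidia_log` and the keys of the returned dictionary are my own choices for illustration, not part of any NVIDIA tool, and the matched strings are simply the ones that appear in this log.

```python
import re

# Hypothetical helper: counts the driver-related events seen in a dmesg dump.
# The patterns are taken from the messages in the log above, nothing more.
def summarize_nvidia_log(log_text):
    summary = {
        "driver_version": None,   # version printed by the NVRM load message
        "rm_init_failures": 0,    # "RmInitAdapter failed" lines
        "unhandled_irqs": 0,      # "No irq handler for vector" lines
        "nouveau_seen": False,    # whether the nouveau driver logged anything
    }
    version_re = re.compile(r"NVRM: loading NVIDIA UNIX \S+ Kernel Module\s+(\S+)")
    for line in log_text.splitlines():
        match = version_re.search(line)
        if match:
            summary["driver_version"] = match.group(1)
        if "RmInitAdapter failed" in line:
            summary["rm_init_failures"] += 1
        if "No irq handler for vector" in line:
            summary["unhandled_irqs"] += 1
        if "nouveau" in line:
            summary["nouveau_seen"] = True
    return summary
```

Calling it with the text of the dmesg output (for example, read from a file saved with `dmesg > dmesg.txt`) returns the counts as a dictionary. It only tallies what is already in the log and does not diagnose the cause.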