CUDA Kernel error: "The launch timed out and was terminated"

This is an issue I’ve been trying to understand and find a workaround for on my Jetson Nano 2GB for a while now.

I’ve modified the xorg.conf file to include the option “Interactive” “0”.
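
For reference, the Device section I mean (in /etc/X11/xorg.conf, using the “Tegra0” identifier that also shows up in the log below) looks roughly like this:

Section "Device"
    Identifier  "Tegra0"
    Driver      "nvidia"
    Option      "AllowEmptyInitialConfiguration" "true"
    Option      "Interactive" "0"
EndSection
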
I’ve gone into sysctl and set the following (applied roughly as shown just below):
watchdog=0
nmi_watchdog=0
watchdog_soft=0
watchdog_thresh=0
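
The way I applied them was roughly this (I may have the exact key names slightly off; I believe the soft-lockup one is actually kernel.soft_watchdog):

sudo sysctl -w kernel.watchdog=0
sudo sysctl -w kernel.nmi_watchdog=0
sudo sysctl -w kernel.soft_watchdog=0
sudo sysctl -w kernel.watchdog_thresh=0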

I’ve attempted to remove rtkit, which comes preinstalled in the image.

I’ve attempted to run in init 3 to avoid using Xorg altogether.

I honestly do not know what else to try. In the Xorg log file I do find an error; this is the file:

Any suggestion, clarity, or direction you could provide me would be appreciated.

[ 7.107] X.Org X Server 1.19.6
Release Date: 2017-12-20
[ 7.107] X Protocol Version 11, Revision 0
[ 7.107] Build Operating System: Linux 4.15.0-124-generic aarch64 Ubuntu
[ 7.107] Current Operating System: Linux bowl 4.9.201-tegra #1 SMP PREEMPT Fri Jan 15 14:41:02 PST 2021 aarch64
[ 7.107] Kernel command line: tegraid=21.1.2.0.0 ddr_die=2048M@2048M section=256M memtype=0 vpr_resize usb_port_owner_info=0 lane_owner_info=0 emc_max_dvfs=0 touch_id=0@63 video=tegrafb no_console_suspend=1 console=ttyS0,115200n8 debug_uartport=lsport,4 earlyprintk=uart8250-32bit,0x70006000 maxcpus=4 usbcore.old_scheme_first=1 core_edp_mv=1125 core_edp_ma=4000 gpt earlycon=uart8250,mmio32,0x70006000 root=/dev/mmcblk0p1 rw rootwait rootfstype=ext4 console=ttyS0,115200n8 console=tty0 fbcon=map:0 net.ifnames=0 quiet root=/dev/mmcblk0p1 rw rootwait rootfstype=ext4 console=ttyS0,115200n8 console=tty0 fbcon=map:0 net.ifnames=0
[ 7.108] Build Date: 30 November 2020 08:05:15PM
[ 7.108] xorg-server 2:1.19.6-1ubuntu4.8 (For technical support please see
[ 7.108] Current version of pixman: 0.34.0
[ 7.108] Before reporting problems, check
to make sure that you have the latest version.
[ 7.108] Markers: (--) probed, (**) from config file, (==) default setting,
(++) from command line, (!!) notice, (II) informational,
(WW) warning, (EE) error, (NI) not implemented, (??) unknown.
[ 7.108] (==) Log file: “/var/log/Xorg.0.log”, Time: Wed Feb 24 04:29:03 2021
[ 7.145] (==) Using config file: “/etc/X11/xorg.conf”
[ 7.145] (==) Using system config directory “/usr/share/X11/xorg.conf.d”
[ 7.159] (==) No Layout section. Using the first Screen section.
[ 7.159] (==) No screen section available. Using defaults.
[ 7.159] (**) |-->Screen "Default Screen Section" (0)
[ 7.159] (**) |   |-->Monitor ""
[ 7.160] (==) No device specified for screen “Default Screen Section”.
Using the first device section listed.
[ 7.160] (**) |   |-->Device "Tegra0"
[ 7.160] (==) No monitor specified for screen “Default Screen Section”.
Using a default monitor configuration.
[ 7.160] (==) Automatically adding devices
[ 7.160] (==) Automatically enabling devices
[ 7.160] (==) Automatically adding GPU devices
[ 7.160] (==) Automatically binding GPU devices
[ 7.160] (==) Max clients allowed: 256, resource mask: 0x1fffff
[ 7.182] (WW) The directory “/usr/share/fonts/X11/cyrillic” does not exist.
[ 7.182] Entry deleted from font path.
[ 7.182] (WW) The directory “/usr/share/fonts/X11/100dpi/” does not exist.
[ 7.182] Entry deleted from font path.
[ 7.182] (WW) The directory “/usr/share/fonts/X11/75dpi/” does not exist.
[ 7.182] Entry deleted from font path.
[ 7.193] (WW) The directory “/usr/share/fonts/X11/100dpi” does not exist.
[ 7.193] Entry deleted from font path.
[ 7.193] (WW) The directory “/usr/share/fonts/X11/75dpi” does not exist.
[ 7.193] Entry deleted from font path.
[ 7.193] (==) FontPath set to:
/usr/share/fonts/X11/misc,
/usr/share/fonts/X11/Type1,
built-ins
[ 7.193] (==) ModulePath set to “/usr/lib/xorg/modules”
[ 7.193] (II) The server relies on udev to provide the list of input devices.
If no devices become available, reconfigure udev or disable AutoAddDevices.
[ 7.193] (II) Loader magic: 0x555b420010
[ 7.193] (II) Module ABI versions:
[ 7.193] X .Org ANSI C Emulation: 0.4
[ 7.193] X .Org Video Driver: 23.0
[ 7.193] X .Org XInput driver : 24.1
[ 7.193] X .Org Server Extension : 10.0
[ 7.195] (++) using VT number 7

[ 7.195] (II) systemd-logind: logind integration requires -keeptty and -keeptty was not provided, disabling logind integration
[ 7.196] (II) no primary bus or device found
[ 7.196] (WW) “dri” will not be loaded unless you’ve specified it to be loaded elsewhere.
[ 7.196] (II) “glx” will be loaded by default.
[ 7.196] (II) LoadModule: “extmod”
[ 7.196] (II) Module “extmod” already built-in
[ 7.196] (II) LoadModule: “glx”
[ 7.273] (II) Loading /usr/lib/xorg/modules/extensions/libglx.so
[ 7.394] (II) Module glx: vendor=“X .Org Foundation”
[ 7.394] compiled for 1.19.6, module version = 1.0.0
[ 7.394] ABI class: X .Org Server Extension, version 10.0
[ 7.394] (II) LoadModule: “nvidia”
[ 7.395] (II) Loading /usr/lib/xorg/modules/drivers/nvidia_drv.so
[ 7.466] (II) Module nvidia: vendor=“NVIDIA Corporation”
[ 7.466] compiled for 4.0.2, module version = 1.0.0
[ 7.466] Module class: X .Org Video Driver
[ 7.485] (II) NVIDIA dlloader X Driver 32.5.0 Release Build (integ_stage_rel) (buildbrain@mobile-u64-4415) Fri Jan 15 14:44:38 PST 2021
[ 7.485] (II) NVIDIA Unified Driver for all Supported NVIDIA GPUs
[ 7.486] (WW) Falling back to old probe method for NVIDIA
[ 7.486] (II) Loading sub module “fb”
[ 7.486] (II) LoadModule: “fb”
[ 7.487] (II) Loading /usr/lib/xorg/modules/libfb.so
[ 7.501] (II) Module fb: vendor=“X .Org Foundation”
[ 7.501] compiled for 1.19.6, module version = 1.0.0
[ 7.501] ABI class: X .Org ANSI C Emulation, version 0.4
[ 7.501] (II) Loading sub module “wfb”
[ 7.501] (II) LoadModule: “wfb”
[ 7.501] (II) Loading /usr/lib/xorg/modules/libwfb.so
[ 7.533] (II) Module wfb: vendor=“X .Org Foundation”
[ 7.533] compiled for 1.19.6, module version = 1.0.0
[ 7.533] ABI class: X .Org ANSI C Emulation, version 0.4
[ 7.533] (II) Loading sub module “ramdac”
[ 7.533] (II) LoadModule: “ramdac”
[ 7.533] (II) Module “ramdac” already built-in
[ 7.534] (WW) VGA arbiter: cannot open kernel arbiter, no multi-card support
[ 7.535] (II) NVIDIA(0): Creating default Display subsection in Screen section
“Default Screen Section” for depth/fbbpp 24/32
[ 7.535] (==) NVIDIA(0): Depth 24, (==) framebuffer bpp 32
[ 7.535] (==) NVIDIA(0): RGB weight 888
[ 7.535] (==) NVIDIA(0): Default visual is TrueColor
[ 7.535] (==) NVIDIA(0): Using gamma correction (1.0, 1.0, 1.0)
[ 7.535] (DB) xf86MergeOutputClassOptions unsupported bus type 0
[ 7.535] (**) NVIDIA(0): Option "AllowEmptyInitialConfiguration" "true"
[ 7.535] (**) NVIDIA(0): Option "Interactive" "false"
[ 7.535] (**) NVIDIA(0): Enabling 2D acceleration
[ 7.535] (II) Loading sub module “glxserver_nvidia”
[ 7.535] (II) LoadModule: “glxserver_nvidia”
[ 7.535] (II) Loading /usr/lib/xorg/modules/extensions/libglxserver_nvidia.so
[ 7.799] (II) Module glxserver_nvidia: vendor=“NVIDIA Corporation”
[ 7.799] compiled for 4.0.2, module version = 1.0.0
[ 7.799] Module class: X .Org Server Extension
[ 7.799] (II) NVIDIA GLX Module 32.5.0 Release Build (integ_stage_rel) (buildbrain@mobile-u64-4415) Fri Jan 15 14:43:01 PST 2021
[ 7.840] (–) NVIDIA(0): Valid display device(s) on GPU-0 at SoC
[ 7.840] (–) NVIDIA(0): DFP-0
[ 7.841] (II) NVIDIA(0): NVIDIA GPU NVIDIA Tegra X1 (nvgpu) (GM20B) at SoC (GPU-0)
[ 7.841] (–) NVIDIA(0): Memory: 2027380 kBytes
[ 7.841] (–) NVIDIA(0): VideoBIOS:
[ 7.841] (==) NVIDIA(0):
[ 7.841] (==) NVIDIA(0): No modes were requested; the default mode “nvidia-auto-select”
[ 7.841] (==) NVIDIA(0): will be used as the requested mode.
[ 7.841] (==) NVIDIA(0):
[ 7.841] (–) NVIDIA(0): No enabled display devices found; starting anyway because
[ 7.841] (–) NVIDIA(0): AllowEmptyInitialConfiguration is enabled
[ 7.841] (II) NVIDIA(0): Validated MetaModes:
[ 7.841] (II) NVIDIA(0): “NULL”
[ 7.841] (II) NVIDIA(0): Virtual screen size determined to be 640 x 480
[ 7.841] (WW) NVIDIA(0): Unable to get display device for DPI computation.
[ 7.841] (==) NVIDIA(0): DPI set to (75, 75); computed from built-in default
[ 7.841] (–) Depth 24 pixmap format is 32 bpp
[ 7.841] (II) NVIDIA: Reserving 6144.00 MB of virtual memory for indirect memory
[ 7.841] (II) NVIDIA: access.
[ 7.843] (EE) NVIDIA(0): Failed to allocate NVIDIA Error Handler
[ 7.844] (II) NVIDIA(0): ACPI: failed to connect to the ACPI event daemon; the daemon
[ 7.844] (II) NVIDIA(0): may not be running or the “AcpidSocketPath” X
[ 7.844] (II) NVIDIA(0): configuration option may not be set correctly. When the
[ 7.844] (II) NVIDIA(0): ACPI event daemon is available, the NVIDIA X driver will
[ 7.844] (II) NVIDIA(0): try to use it to receive ACPI event notifications. For
[ 7.844] (II) NVIDIA(0): details, please see the “ConnectToAcpid” and
[ 7.844] (II) NVIDIA(0): “AcpidSocketPath” X configuration options in Appendix B: X
[ 7.844] (II) NVIDIA(0): Config Options in the README.
[ 7.895] (II) NVIDIA(0): Setting mode “NULL”
[ 7.899] (==) NVIDIA(0): Disabling shared memory pixmaps
[ 7.899] (==) NVIDIA(0): Backing store enabled
[ 7.899] (==) NVIDIA(0): Silken mouse enabled
[ 7.899] (==) NVIDIA(0): DPMS enabled
[ 7.899] (II) Loading sub module “dri2”
[ 7.899] (II) LoadModule: “dri2”
[ 7.899] (II) Module “dri2” already built-in
[ 7.899] (II) NVIDIA(0): [DRI2] Setup complete
[ 7.899] (II) NVIDIA(0): [DRI2] VDPAU driver: nvidia
[ 7.900] (–) RandR disabled
[ 7.907] (II) SELinux: Disabled on system
[ 7.908] (II) Initializing extension GLX
[ 7.908] (II) Indirect GLX disabled.
[ 8.097] (II) config/udev: Adding input device tegra-hda HDMI/DP,pcm=3 (/dev/input/event0)
[ 8.097] (II) No input driver specified, ignoring this device.
[ 8.097] (II) This device may have been added with another device file.
[ 8.098] (II) config/udev: Adding input device gpio-keys (/dev/input/event1)
[ 8.098] (**) gpio-keys: Applying InputClass "libinput keyboard catchall"
[ 8.098] (II) LoadModule: “libinput”
[ 8.098] (II) Loading /usr/lib/xorg/modules/input/libinput_drv.so
[ 8.112] (II) Module libinput: vendor=“X .Org Foundation”
[ 8.112] compiled for 1.19.6, module version = 0.27.1
[ 8.112] Module class: X .Org XInput Driver
[ 8.112] ABI class: X .Org XInput driver, version 24.1
[ 8.113] (II) Using input driver ‘libinput’ for ‘gpio-keys’
[ 8.113] (**) gpio-keys: always reports core events
[ 8.113] (**) Option "Device" "/dev/input/event1"
[ 8.113] (**) Option "_source" "server/udev"
[ 8.114] (II) event1 - gpio-keys: is tagged by udev as: Keyboard
[ 8.114] (II) event1 - gpio-keys: device is a keyboard
[ 8.114] (II) event1 - gpio-keys: device removed
[ 8.136] (**) Option "config_info" "udev:/sys/devices/gpio-keys/input/input1/event1"
[ 8.136] (II) XINPUT: Adding extended input device "gpio-keys" (type: KEYBOARD, id 6)
[ 8.136] (**) Option "xkb_model" "pc105"
[ 8.136] (**) Option "xkb_layout" "us"
[ 8.137] (II) event1 - gpio-keys: is tagged by udev as: Keyboard
[ 8.137] (II) event1 - gpio-keys: device is a keyboard
[ 26.105] (II) config/udev: Adding input device iClever IC-BK05 Keyboard (/dev/input/event2)
[ 26.105] (**) iClever IC-BK05 Keyboard: Applying InputClass "libinput keyboard catchall"
[ 26.105] (II) Using input driver 'libinput' for 'iClever IC-BK05 Keyboard'
[ 26.105] (**) iClever IC-BK05 Keyboard: always reports core events
[ 26.105] (**) Option "Device" "/dev/input/event2"
[ 26.105] (**) Option "_source" "server/udev"
[ 26.107] (II) event2 - iClever IC-BK05 Keyboard: is tagged by udev as: Keyboard
[ 26.107] (II) event2 - iClever IC-BK05 Keyboard: device is a keyboard
[ 26.107] (II) event2 - iClever IC-BK05 Keyboard: device removed
[ 26.132] (**) Option "config_info" "udev:/sys/devices/70090000.xusb/usb1/1-3/1-3.1/1-3.1:1.0/bluetooth/hci0/hci0:1/0005:04E8:7021.0001/input/input2/event2"
[ 26.132] (II) XINPUT: Adding extended input device "iClever IC-BK05 Keyboard" (type: KEYBOARD, id 7)
[ 26.132] (**) Option "xkb_model" "pc105"
[ 26.132] (**) Option “xkb_layout” “us”
[ 26.132] (WW) Option “xkb_variant” requires a string value
[ 26.132] (WW) Option “xkb_options” requires a string value
[ 26.134] (II) event2 - iClever IC-BK05 Keyboard: is tagged by udev as: Keyboard
[ 26.134] (II) event2 - iClever IC-BK05 Keyboard: device is a keyboard

Hi,

We would like to reproduce this issue internally and take a closer look.
Could you share the steps or source code needed to reproduce the kernel timeout error?

Thanks.

Hey,

So the code I’ve been running is just dummy code to see the benefit of running on the GPU. I’ll post it below, but all it does is count from 0 up to a maxValue.

I have set up a timer object with std::chrono, and the kernel gets terminated after somewhere between 4.2 and 5.2 seconds pretty consistently. When I run the code with a reasonable maxValue it completes with no problem.

I know this is probably not how you’re meant to be doing kernel operations, but I see no reason why the code itself should be producing this error. As I understand it, it has to come from a property that doesn’t allow the GPU to run long tasks, since I keep running deviceQuery and it keeps telling me that the kernel run time limit is on.
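
For reference, a minimal sketch of how that same property can be read from code (I believe the runtime API field behind what deviceQuery reports is kernelExecTimeoutEnabled):

#include <cstdio>
#include <cuda_runtime.h>

int main(){
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, 0);   // device 0, the Nano's integrated GPU
    if(err != cudaSuccess){
        printf("cudaGetDeviceProperties failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    // 1 means the display watchdog is allowed to kill long-running kernels on this device.
    printf("kernelExecTimeoutEnabled: %d\n", prop.kernelExecTimeoutEnabled);
    return 0;
}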

code:

#include <iostream>
#include <iomanip>
#include <cuda_runtime.h>

using namespace std;

// myTimer (a small std::chrono-based stopwatch) and printWithCommas (formats a number
// with thousands separators) are my own helpers, omitted here for brevity.

__global__ void addingToMax(uint* d_in_out, uint maxValue){
    int id = threadIdx.x;

    // Each thread spins on its element in global memory until it reaches maxValue.
    while(d_in_out[id] != maxValue){
        d_in_out[id]++;
    }
}

void printError(cudaError_t sys, int line){
    cout << "Error: " << cudaGetErrorString(sys) << endl
         << "       Line: " << line << endl;
}

int main(){
    cout << "Program Initiated." << endl << endl;

    myTimer watch;
    const uint numSize = 2;
    const size_t numSizeByte = numSize * sizeof(uint);
    uint maxValue = 0;
    maxValue--; // unsigned wrap-around: this yields the largest value a uint can hold (UINT_MAX).
    int columnSize = 6; //For printing to console

    uint h_in_out[numSize]{};
    uint* d_in_out = nullptr;   // initialized so the cleanup below is safe on early-error paths
    cudaError_t sys;

    
    cout << "Starting:" << endl;
    watch.start();
    sys = cudaSetDevice(0);
    if(sys != cudaSuccess){
        printError(sys, 42);
        goto Terminated;
    }


    sys = cudaMalloc(&d_in_out, numSizeByte);    
    if(sys != cudaSuccess){
        printError(sys, 49);
        goto Terminated;
    }
    
    
    sys = cudaMemcpy(d_in_out, h_in_out, numSizeByte, cudaMemcpyHostToDevice);
    if(sys != cudaSuccess){
        printError(sys, 56);
        goto Terminated;
    }

    addingToMax<<<1, numSize>>>(d_in_out, maxValue);   // asynchronous kernel launch
    sys = cudaGetLastError();   // catches launch errors; a timeout surfaces at the next synchronizing call (the cudaMemcpy below)
    //sys = cudaSuccess;   // leftover from an earlier debugging session; it bypassed the following check and shouldn't be part of the test code
    if(sys != cudaSuccess){
        printError(sys, 63);
        goto Terminated;
    }

    sys = cudaMemcpy(h_in_out, d_in_out, numSizeByte, cudaMemcpyDeviceToHost);
    if(sys != cudaSuccess){
        printError(sys, 70);
        goto Terminated;
    }
    
Terminated:
    watch.stop();
    cout << endl;
    cout << "Count ended." << endl << endl;

    cout << "Printing values:" << endl;
    for(int i = 0; i < numSize;){
        cout << "    ";
        for(int j=0; j < columnSize && i < numSize; i++, j++){
            cout << setw(8) << h_in_out[i] << " | ";
        }
        cout << endl;
    }

    cout << "Duration: " << 
    printWithCommas(watch.durMicro());
 
    cout << endl << endl;
    cudaFree(d_in_out);   // free the device buffer (a no-op if an early error left it nullptr)
    cout << "Program Terminated." << endl;
    return 0;
}
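
For what it’s worth, a workaround I’ve been considering (untested sketch, reusing the buffers and names from the code above) is to bound the work done per launch so each kernel finishes well before the watchdog fires, and loop on the host:

__global__ void addChunk(uint* d_in_out, uint maxValue, uint chunk){
    int id = threadIdx.x;
    uint start = d_in_out[id];
    uint target = start + chunk;
    if(target < start || target > maxValue){   // clamp on wrap-around or for the final chunk
        target = maxValue;
    }
    while(d_in_out[id] != target){
        d_in_out[id]++;
    }
}

// Host side, in place of the single addingToMax launch:
bool finished = false;
while(!finished){
    addChunk<<<1, numSize>>>(d_in_out, maxValue, 10000000u);   // tune the chunk so a launch stays well under the watchdog limit
    sys = cudaDeviceSynchronize();                             // each launch now returns quickly
    if(sys != cudaSuccess){ printError(sys, __LINE__); break; }
    cudaMemcpy(h_in_out, d_in_out, numSizeByte, cudaMemcpyDeviceToHost);
    finished = true;
    for(uint i = 0; i < numSize; i++){
        if(h_in_out[i] != maxValue) finished = false;
    }
}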

Any updates?

Hi,

Could you check whether this comment resolves your question:

Thanks.