Failed to execute MI command

Hi, all

I am trying to run Nsight on a host PC to profile CUDA on a Xavier.

However, when I started a debug session, I got the following error message.

Error in final launch sequence
Failed to execute MI command:
-target-select remote 143.248.139.94:2345
Error message from debugger back end:
Can't kill process
Can't kill process

What is the meaning of this error message? How can I handle this error?

======= Debug configuration ========
Remote executable: /usr/src/tensorrt/bin/trtexec
Debugger port: 2345
Program arguments: --onnx=/usr/src/tensorrt/model/alexnet.onnx
CUDA GDB executable: cuda-gdb
CUDA GDB init file: .cuda-gdbinit
Using CUDA-GDB Remote Debugging Launcher

The version of cuda-gdb on Xavier:

nvidia@xavier:~$ cuda-gdb
NVIDIA (R) CUDA Debugger
10.2 release
Portions Copyright (C) 2007-2019 NVIDIA Corporation
GNU gdb (GDB) 7.12

The version of cuda-gdb on host PC:

NVIDIA (R) CUDA Debugger
9.1 release
Portions Copyright (C) 2007-2017 NVIDIA Corporation
GNU gdb (GDB) 7.12
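
The two banners above report different cuda-gdb releases (10.2 on the Xavier, 9.1 on the host PC). Nothing in this thread establishes whether that mismatch is related to the error, but it is easy to check. Below is a minimal sketch of such a comparison; `cuda_gdb_release` is a hypothetical helper name, and it assumes the banner contains a line of the form `X.Y release`, as in the output quoted above.

```shell
# Hypothetical helper: read a cuda-gdb banner on stdin and print its "X.Y" release number.
cuda_gdb_release() {
    sed -n -E 's/^([0-9]+\.[0-9]+) release$/\1/p' | head -n 1
}

# Demonstration using the two banners quoted above.
target_release=$(printf '%s\n' 'NVIDIA (R) CUDA Debugger' '10.2 release' | cuda_gdb_release)
host_release=$(printf '%s\n' 'NVIDIA (R) CUDA Debugger' '9.1 release' | cuda_gdb_release)

if [ "$target_release" != "$host_release" ]; then
    echo "cuda-gdb releases differ: target $target_release, host $host_release"
fi
```

On a real setup the banner text would be piped in from the debugger on each machine, for example `cuda-gdb --version | cuda_gdb_release`, assuming `--version` prints the same banner.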

I used JetPack 4.3 for installation.

Thanks!

Hi,

Do you connect to the Xavier with the root account?
Please note that you need to log in as root to get the permissions required for CUDA profiling.

Detailed steps to achieve this can be found here:

Thanks.


Thank you for the reply.

I solved the error by permitting root login.

vi /etc/ssh/sshd_config
edit the line '#PermitRootLogin prohibit-password'
to 'PermitRootLogin yes'
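
The manual edit above can also be scripted. This is a minimal sketch, not a definitive procedure: `permit_root_login` is a hypothetical helper name, it assumes GNU `sed`, and the demonstration runs on a scratch file rather than the real `/etc/ssh/sshd_config`.

```shell
# Hypothetical helper: set "PermitRootLogin yes" in an sshd_config-style file.
permit_root_login() {
    config_file="$1"
    # Rewrite any existing PermitRootLogin line, including a commented-out one.
    sed -i -E 's/^[#[:space:]]*PermitRootLogin[[:space:]].*/PermitRootLogin yes/' "$config_file"
    # Append the directive if the file had no such line at all.
    grep -q '^PermitRootLogin yes$' "$config_file" || echo 'PermitRootLogin yes' >> "$config_file"
}

# Demonstration on a scratch file that mimics the default configuration.
sample=$(mktemp)
printf '%s\n' 'Port 22' '#PermitRootLogin prohibit-password' > "$sample"
permit_root_login "$sample"
cat "$sample"
```

To apply it to the Xavier itself, the function would have to be run as root against `/etc/ssh/sshd_config`, and the SSH service restarted afterwards so the change takes effect (on Ubuntu-based images this is typically `sudo systemctl restart ssh`, which is an assumption about the setup here). Note that allowing root login over SSH weakens security, so it may be worth reverting once debugging is finished.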

However, I then ran into another error message.

CUDA Device Not Recognized
One or more CUDA devices were not recognized. As a result, metrics and events may be unavailable for those devices.
CUDA IDE
Timed out query for the list of devices. It is likely that the device is suspended on a breakpoint. Do you want to retry detecting the devices?

I checked that running a remote application works without problems, but debugging a remote application does not work and shows the error message above.

How can I handle this problem?

Any help would be greatly appreciated.

Thanks in advance.

Regards.

=========debug configuration=========
remote connection: root@xxx.xxx.xxx.xx
remote executable: /usr/src/tensorrt/bin/trtexec_debug
debugger port: 2345
program arguments: --deploy=/usr/src/tensorrt/data/caffe_model/alexnet/deploy.prototxt --output=prob --useDLACore=0
Toolkit path: /usr/local/cuda-10.2/bin
CUDA GDB executable: cuda-gdb
CUDA GDB init file: .cuda-gdbinit

There has been no update from you for a while, so we assume this is no longer an issue.
We are therefore closing this topic. If you need further support, please open a new one.
Thanks

Hi,

We will need to reproduce this issue before we can give further suggestions.
Would you mind sharing detailed steps to reproduce it with us?

Thanks.