Profiling a multithreaded application with nsys

Hi. I’m interested in profiling a multithreaded application with the nsys CLI. I want to know the time spent on the GPU (host-to-device transfers + kernel execution + device-to-host transfers). The app starts an interactive shell that asks the user for input, but when I run:

nsys profile --stats=true --trace-fork-before-exec=true ./app

the shell prompt, which normally appears automatically, never shows up. Instead, “^[[44;1R” is printed on the screen. Could the problem be that the prompt is handled by another thread that nsys cannot track? I’m wondering whether the options I’m using are correct for tracing a multithreaded application, or whether I am doing something wrong. Thanks in advance.

@afroger, can you respond to this?

Hi carbonara,

I apologize for the long delay, I didn’t notice the notification.

I don’t seem to have a problem profiling the following bash script:

$ cat /tmp/writeback.sh
#!/usr/bin/env bash

while true
do
    read -p "Input: " line
    echo "Output: $line"
done

$ nsys profile --stats=true --trace-fork-before-exec=true /tmp/writeback.sh
Input: abc
Output: abc
Input: def
Output: def
Input: ^C^C

We do have some logic to demultiplex the output and error streams, but we don’t touch the input stream. This might be a legitimate bug in the tool. @carbonara Do you by chance have a small bash script you could share that reproduces the issue?
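In case it helps narrow this down: “^[[44;1R” looks like a terminal cursor position report (ESC [ row ; column R), which a terminal sends in reply to a cursor position query. That would suggest the app's shell is querying the terminal and the reply is not being consumed as expected, though this is only a guess. Below is a minimal sketch of a diagnostic script that reports whether stdin, stdout and stderr are attached to a terminal. The script path and the `describe_fd` helper are hypothetical names, not part of nsys.

```shell
#!/usr/bin/env bash
# Hypothetical diagnostic: save as /tmp/check_tty.sh (assumed path).

# Report whether the given file descriptor is attached to a terminal.
describe_fd() {
    local fd="$1"
    if [ -t "$fd" ]; then
        echo "fd $fd: terminal"
    else
        echo "fd $fd: not a terminal"
    fi
}

# 0 = stdin, 1 = stdout, 2 = stderr
for fd in 0 1 2; do
    describe_fd "$fd"
done
```

Running it once directly and once as `nsys profile --stats=true --trace-fork-before-exec=true /tmp/check_tty.sh` and comparing the two outputs might show whether the profiled process sees different streams than the unprofiled one. An interactive shell that relies on a terminal for line editing could behave differently if one of them is a pipe.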