Hi all, I am using devtool sidecar 1.0.7 to inject nsys into my vllm container. When I specify my process name in injectionMatch, the container exits with status 1 without any logs. Did I misconfigure anything? Is the devtool injector still up to date? Also, I’m curious how this injector works: how does it instrument my running process without any notable modification to the process config in the OCI spec?
# Nsight Systems profiling configuration
profile:
  # The arguments for the Nsight Systems. The placeholders will be replaced with the actual values.
  devtoolArgs: "profile --start-later false -o /home/auto_{PROCESS_NAME}_%{POD_FULLNAME}_%{CONTAINER_NAME}_{TIMESTAMP}_{UID}.nsys-rep"
  # The regex to match applications to profile.
  injectionMatch: "^(?!.*nsys( |$)).*\\bvllm.*$"
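To see which commands the injectionMatch pattern above would select, here is a minimal Python sketch. It rests on two assumptions that the config does not confirm: that the injector matches the pattern against the full command line, and that its regex engine treats this pattern the same way Python's `re` module does. The helper name `would_inject` is made up for illustration.

```python
import re

# The pattern as a regex engine sees it once YAML has unescaped "\\b" to "\b".
INJECTION_MATCH = r"^(?!.*nsys( |$)).*\bvllm.*$"


def would_inject(cmdline: str) -> bool:
    """Return True if cmdline matches the injectionMatch pattern.

    Assumption: the injector tests the whole command line; it may
    instead test only the executable name or path.
    """
    return re.search(INJECTION_MATCH, cmdline) is not None


if __name__ == "__main__":
    samples = [
        "python -m vllm.entrypoints.api_server --host=0.0.0.0 --port=7080",
        "cat vllm",
        "nsys profile python -m vllm.entrypoints.api_server",
        "sleep infinity",
    ]
    for cmd in samples:
        print(f"{would_inject(cmd)!s:5}  {cmd}")
```

Under these assumptions the pattern selects any command containing the word vllm unless the command line also contains nsys followed by a space or the end of the line, which is why even "cat vllm" is picked up.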
I kept the same injectionMatch and started the container with an infinite sleep. This time, I exec'd into the container and ran vllm manually. The behavior is the same: the process exits with status 1.
root@meta-deployment-8f6479bd8-sr29v:/tmp# python -m vllm.entrypoints.api_server --host=0.0.0.0 --port=7080 --swap-space=16 ......
root@meta-deployment-8f6479bd8-sr29v:/tmp# echo $?
1
# Found that a file was created under /tmp
root@meta-deployment-8f6479bd8-sr29v:/tmp# cat devtool-injection-k8s_auto__c4893ea6
PROCESS_ID=1004
PROCESS_NAME=python
#EOF
Update: I thought it was a problem with vllm, so I simply tried a command that contains “vllm” but is not supposed to use the GPU, such as “cat vllm”.
From the strace log, I found that the exec does work and that the injector library is loaded afterwards. After that, however, dynamic loading appears to be messed up. I’m not an expert, so I’m attaching the log for an NVIDIA expert to interpret: nsight_debug_output_cat_vllm.log (420.7 KB)
[pid 1705954] openat(AT_FDCWD, "/mnt/nv/bin/libPreRunProcessInjector.so", O_RDONLY|O_CLOEXEC) = 3
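For anyone who wants to narrow down the log before reading all of it, here is a rough Python sketch that pulls the shared-library openat calls out of strace output and flags the ones that failed. It assumes the log lines follow the usual strace format shown above. The function names are made up for illustration, and the ".so" substring test is a crude filter rather than a precise one.

```python
import re

# Matches strace lines of the form:
#   [pid N] openat(AT_FDCWD, "/path/lib.so", FLAGS) = 3
#   openat(AT_FDCWD, "/path/lib.so", FLAGS) = -1 ENOENT (No such file or directory)
OPENAT_RE = re.compile(
    r'^(?:\[pid\s+(?P<pid>\d+)\]\s+)?'
    r'openat\([^,]+,\s*"(?P<path>[^"]+)",[^)]*\)'
    r'\s*=\s*(?P<ret>-?\d+)(?:\s+(?P<errno>E[A-Z0-9]+))?'
)


def shared_object_opens(lines):
    """Return (pid, path, return_value, errno) for each openat on a .so path."""
    results = []
    for line in lines:
        m = OPENAT_RE.match(line.strip())
        if m and ".so" in m.group("path"):  # crude shared-library filter
            results.append(
                (m.group("pid"), m.group("path"), int(m.group("ret")), m.group("errno"))
            )
    return results


def failed_opens(lines):
    """Keep only the opens that returned a negative value (for example ENOENT)."""
    return [r for r in shared_object_opens(lines) if r[2] < 0]


if __name__ == "__main__":
    # The single line quoted in this post; a real run would pass the log file.
    sample = [
        '[pid 1705954] openat(AT_FDCWD, "/mnt/nv/bin/libPreRunProcessInjector.so", '
        'O_RDONLY|O_CLOEXEC) = 3',
    ]
    for pid, path, ret, errno in shared_object_opens(sample):
        print(pid, path, ret, errno)
```

To run it on the attached log, pass an open file handle, for example `failed_opens(open("nsight_debug_output_cat_vllm.log"))`. A failed open is not necessarily the cause of the exit, since the dynamic loader normally probes several search paths before it finds a library.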