Profiling with multiple users

Hi!
We recently ran into a problem when different users profile on the same machine. Nsight Systems uses the same directory in /tmp for all users, which leads to ownership problems when different users want to profile their applications. Our workaround is to run “chown -R <user> /tmp/nvidia” or to delete the directory before profiling. I think it would be better to have a separate directory for each user, or to make the temporary directory user-configurable.
Thank you for your attention!
Robin

Setup:
Host:
Windows 10 1909
Nsight Systems 2020.2.1
Target:
Ubuntu 18.04.4 LTS
CUDA 10.2
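
To illustrate the per-user directory suggested in the post above, here is a minimal sketch. It is not how Nsight Systems lays out its files; the function name per_user_dir and the directory name nsight_systems_<uid> are hypothetical and chosen only for this example.

```python
import os
import tempfile


def per_user_dir(base=None):
    """Create (if needed) and return a scratch directory private to the current user."""
    base = base or tempfile.gettempdir()
    # Hypothetical naming scheme: one subdirectory per numeric user ID.
    path = os.path.join(base, f"nsight_systems_{os.getuid()}")
    os.makedirs(path, mode=0o700, exist_ok=True)
    # Refuse a directory that another user created first.
    if os.stat(path).st_uid != os.getuid():
        raise PermissionError(f"{path} is owned by another user")
    # makedirs leaves an existing directory untouched, so set the mode explicitly.
    os.chmod(path, 0o700)
    return path
```

Because each user gets a directory they own, no chown or manual cleanup is needed between profiling sessions of different users.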

I will have to check with other engineers; I was under the impression that this had been replaced with /tmp/[uid]. We will also be adding the ability to set the temporary directory explicitly on the command line.

That being said, you can set the TMPDIR environment variable and Nsys will respect it.
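
As a rough sketch of that suggestion for a local command-line run, the snippet below builds an environment with TMPDIR pointing at a user-owned directory and passes it to nsys profile. The helper names, the default output name "report", and the idea of wrapping the call in Python are assumptions made for illustration only.

```python
import os
import subprocess


def nsys_command_and_env(app_args, tmpdir, output="report"):
    """Build the nsys command line and an environment with TMPDIR overridden."""
    env = dict(os.environ)
    env["TMPDIR"] = tmpdir  # the variable the reply above refers to
    cmd = ["nsys", "profile", "-o", output, *app_args]
    return cmd, env


def run_profile(app_args, tmpdir):
    """Create the temporary directory if needed, then launch the profiler."""
    os.makedirs(tmpdir, mode=0o700, exist_ok=True)
    cmd, env = nsys_command_and_env(app_args, tmpdir)
    return subprocess.run(cmd, env=env, check=True)
```

A call such as run_profile(["./my_app"], os.path.expanduser("~/nsys_tmp")) would then profile a placeholder application ./my_app. This only covers a local CLI run; whether the variable is honored depends on the Nsight Systems version and collection mode.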

Unfortunately, Nsight Systems does not respect the TMPDIR environment variable when profiling remote targets. Neither setting TMPDIR in the project settings nor exporting TMPDIR in .bashrc/.bash_profile seems to help.
This is the error shown in Nsight Systems where it still uses the default /tmp directory:

AnalysisService: failed to start event source Trace: /home/devtools/teamCityBuildAgent/work/20a3cfcd1c25021d/QuadD/Target/quadd_d/quadd_d/jni/EventSource/Trace.cpp(277): Throw in function void {anonymous}::CheckPermissions(const boost::filesystem::path&)
Dynamic exception type: boost::wrapexcept
std::exception::what: NoPermissionException
[QuadDCommon::tag_file_name*] = /tmp/nvidia/nsight_systems/injection_files
[QuadDCommon::tag_error_text*] = Insufficient permissions for injection data directory

Hopefully the changes you mentioned will be available with the next release.
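
To see which user owns the directory named in the error and whether the current user can write to it, a small diagnostic along these lines may help. It is only an inspection aid and makes no claim about what CheckPermissions does internally; the function name describe_dir is hypothetical.

```python
import os
import pwd
import stat


def describe_dir(path):
    """Report owner, permission bits, and whether the current user can write to path."""
    st = os.stat(path)
    try:
        owner = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        owner = str(st.st_uid)  # UID without a passwd entry
    return {
        "owner": owner,
        "mode": oct(stat.S_IMODE(st.st_mode)),
        # Creating files in a directory needs both write and search permission.
        "writable_by_me": os.access(path, os.W_OK | os.X_OK),
    }
```

On the target, describe_dir("/tmp/nvidia/nsight_systems/injection_files") would show whether the directory was left behind by another user's session.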

We’ve found that Nsight Systems 2020.3.2 does not respect TMPDIR either. Hoping it is respected in a more recent version @hwilper.
Also, the above post of @rkobus actually uncovers another issue with checking for directory/file permissions. It appears that the code that creates the /tmp/nvidia/nsight_systems/injection_files directory also sets certain permissions that include one or more world bits. On our systems, enabling any world permissions is not permitted by security requirements, and such bits are squashed by a system-wide user-level umask. Therefore, it seems to consistently fail this permissions test.

I am surprised to hear that TMPDIR isn’t working for you. I’ll see if we can get the world bits issue resolved.
@liuyis can you please look into the permissions issue?

It appears that the code that creates the /tmp/nvidia/nsight_systems/injection_files directory also sets certain permissions that include one or more world bits.

That’s correct. The /tmp/nvidia and /tmp/nvidia/nsight_systems directories are shared by all users in both GUI and CLI collections; the /tmp/nvidia/nsight_systems/injection_files directory is also shared by all users in GUI collections. We therefore set the permissions to 777 so that every user has access.

On our systems, enabling any world permissions is not permitted by security requirements, and such bits are squashed by a system-wide user-level umask. Therefore, it seems to consistently fail this permissions test.

That’s an unexpected scenario for us; we will need to figure out how to support it. What is the specific umask on your systems? What are the resulting permissions if you manually execute chmod 777 on a directory?

BTW, we’ve had several releases after 2020.3.2. It would be good if you could try our latest 2021.1 release: https://developer.nvidia.com/gameworksdownload#?dn=nsight-systems-2021-1 and see what happens.

The umask on all of the nodes of our clusters is 0007, to comply with security policy. I think the most effective way for us would be to honor the $TMPDIR environment variable, as Nsight Compute does.
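
For context on how a umask of 0007 interacts with a requested mode of 777, the following sketch compares the mode a directory gets at creation time with the mode after an explicit chmod. It assumes a typical Linux filesystem without default ACLs or additional policy enforcement; the function name and the directory name used inside the temporary directory are illustrative only.

```python
import os
import stat
import tempfile


def modes_under_umask(mask=0o007):
    """Return (mode after mkdir with 0o777, mode after explicit chmod 0o777)."""
    old = os.umask(mask)
    try:
        with tempfile.TemporaryDirectory() as base:
            path = os.path.join(base, "injection_files")
            os.mkdir(path, 0o777)  # the creation mode is filtered by the umask
            created = stat.S_IMODE(os.stat(path).st_mode)
            os.chmod(path, 0o777)  # an explicit chmod is not filtered by the umask
            changed = stat.S_IMODE(os.stat(path).st_mode)
            return created, changed
    finally:
        os.umask(old)  # restore the caller's umask
```

Under these assumptions the world bits are dropped at creation (770) but an explicit chmod can still set them (777), so a check that expects 777 fails whenever the directory was only created, not chmod-ed, under such a umask. A site policy may additionally forbid or revert the chmod, which this sketch does not model.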

Thanks for the feedback, we will look into it.

Hi @rkobus @areuther1, I would like to give an update on this issue. We have made an improvement to make sure Nsight Systems honors TMPDIR. It will be available in our next release (2021.3); please try it when it is out and feel free to share any further questions. Thanks!

Thank you for getting this fix into the latest version, @liuyis. We are looking forward to downloading and using it.