CUDA does not work for non-root users

Environment:

OS: CentOS Linux release 7.5.1804 (Core)
DRIVER: NVIDIA-Linux-x86_64-410.93.run
CUDA: 10.1 (installed by yum: yum install cuda)
GPU: Quadro P4000
SELinux: Disabled

I installed CUDA, and root can run the samples installed by cuda-install-samples-10.1.sh.
However, some of the samples do not work for non-root users.

Non-root users can compile the samples and can also run 1_Utilities/deviceQuery correctly.
But 1_Utilities/bandwidthTest does not work:

$ ./bandwidthTest
[CUDA Bandwidth Test] - Starting...
Running on...

 Device 0: Quadro P4000
 Quick Mode

It always stops here. When root runs it, it finishes correctly:

# ./bandwidthTest
[CUDA Bandwidth Test] - Starting...
Running on...

 Device 0: Quadro P4000
 Quick Mode

 Host to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)        Bandwidth(GB/s)
   32000000                     12.4

 Device to Host Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)        Bandwidth(GB/s)
   32000000                     13.2

 Device to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)        Bandwidth(GB/s)
   32000000                     195.2

Result = PASS

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

Some threads point to the permissions of /dev/nvidia* as a possible cause, but they seem OK:

$ ls -l /dev/nvidia*
crw-rw-rw- 1 root root 195,   0 Mar  7 15:06 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 Mar  7 15:06 /dev/nvidiactl
crw-rw-rw- 1 root root 195, 254 Mar  7 15:06 /dev/nvidia-modeset
crw-rw-rw- 1 root root 236,   0 Mar  7 15:06 /dev/nvidia-uvm
crw-rw-rw- 1 root root 236,   1 Mar  7 15:06 /dev/nvidia-uvm-tools

Any ideas on how to solve this?

Probably some SELinux setting. Try switching it to permissive mode or something.

Oops, forget it. I just noticed that you already have it disabled, sorry about that.

This is fixed now.

The problem was that our system sets a ulimit of 8000000 on virtual memory.
It seems CUDA cannot run when the virtual memory ulimit is set to such a small value,
even though it runs fine on systems that have only a small amount of physical and virtual memory (i.e. less than 8GB).

After removing this limit, non-root users can run CUDA programs correctly.
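
For anyone hitting the same symptom, here is a minimal sketch of how to check the limit from the affected user's shell. It assumes a POSIX-style shell where ulimit -v reports the virtual memory limit in kilobytes. check_vm_limit is a hypothetical helper name used only for illustration; it is not part of CUDA or the NVIDIA driver.

```shell
# check_vm_limit: hypothetical helper, not part of CUDA or the driver.
# Prints whether a virtual memory limit is in effect.
# With no argument it inspects the current shell's soft limit (ulimit -S -v);
# an explicit value can be passed to see how a given setting would be reported.
check_vm_limit() {
    vm_limit="${1:-$(ulimit -S -v)}"
    if [ "$vm_limit" = "unlimited" ]; then
        echo "virtual memory: unlimited"
    else
        echo "virtual memory limited to ${vm_limit} KB"
    fi
}

# Report the limit that applies to programs started from this shell.
check_vm_limit
```

If this reports a limit, running `ulimit -S -v unlimited` in the same shell lifts the soft limit for that session, provided the hard limit (`ulimit -H -v`) allows it. Where the limit is configured system-wide depends on the setup; on many Linux systems it is /etc/security/limits.conf or a shell profile script, but that is an assumption, since the thread does not say where it was set.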