Slow OpenCL startup on Linux

Hello,

I am experiencing a very slow startup using OpenCL on Linux (Ubuntu 15.04). My GPU is a Tesla C2075 and I installed the latest available driver. Here are some time measurements for the API calls.

clGetPlatformIDs 2306ms
clGetPlatformIDs 0ms
clGetDeviceIDs 0ms
clCreateContext 62ms
clCreateCommandQueue 0ms
clGetPlatformInfo 0ms
clGetPlatformInfo 0ms
clGetPlatformInfo 0ms
clCreateProgramWithSource 0ms
clCreateBuffer 0ms
clEnqueueWriteBuffer 1ms
clSetKernelArg 0ms
clCreateBuffer 0ms
clEnqueueWriteBuffer 1ms
clSetKernelArg 0ms
clCreateBuffer 0ms
clSetKernelArg 0ms
clEnqueueNDRangeKernel 0ms
clEnqueueMarker 0ms
clSetEventCallback 0ms
clReleaseEvent 0ms
clReleaseEvent 0ms
clReleaseEvent 0ms
clReleaseEvent 0ms
clReleaseEvent 0ms

In the past, the OpenCL startup overhead was never significant. Any idea where I went wrong?

Thanks
Raphael

The first call that uses the GPU is usually a long one. It forces the GPU runtime to complete its initialization, so very often the first call will experience a long delay.
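One rough way to see this initialization cost from the outside is to time the same program twice in a row. The sketch below is only illustrative: `time_ms` is a helper name made up here, `./my_opencl_app` is a hypothetical placeholder for your own binary, and it assumes GNU `date` (for `%N`). If both runs are equally slow, it may mean the driver is re-initializing for every process.

```shell
#!/bin/sh
# Print the wall-clock time of a command in milliseconds.
# Assumes GNU date, which supports %N (nanoseconds).
time_ms() {
    start=$(date +%s%N)
    "$@" >/dev/null 2>&1
    end=$(date +%s%N)
    echo $(( (end - start) / 1000000 ))
}

# ./my_opencl_app is a hypothetical placeholder for your own program.
if [ -x ./my_opencl_app ]; then
    echo "first run:  $(time_ms ./my_opencl_app) ms"
    echo "second run: $(time_ms ./my_opencl_app) ms"
fi
```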

You might be able to improve things somewhat by setting persistence mode on the GPU using nvidia-smi.

Thanks for the tip!

I tried enabling persistence (nvidia-smi -i 0 -pm ENABLED), but although the command returns with success, the persistence mode is still disabled (nvidia-smi -i 0 -q). The alternative seems to be the persistence daemon, which I can start, but it doesn’t keep running afterwards (ps aux).

While this may indeed be the problem, I have no clue how to solve it (short of installing X and reinstalling the drivers).
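The check described above (enable, then query) can be scripted so the actual state is read back instead of trusting the "Enabled" message. This is a minimal sketch: `persistence_mode` is a helper name invented here, and it assumes the `nvidia-smi -q` output contains a line of the form `Persistence Mode : Enabled` or `Disabled`, which may differ between driver versions.

```shell
#!/bin/sh
# Read `nvidia-smi -q` output on stdin and print the persistence mode value.
# Assumes a line like "    Persistence Mode                : Enabled".
persistence_mode() {
    awk -F': *' '/Persistence Mode/ { print $2; exit }'
}

# Only query the GPU if nvidia-smi is actually installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    state=$(nvidia-smi -q -i 0 | persistence_mode)
    echo "GPU 0 persistence mode: ${state:-unknown}"
fi
```

If this prints Disabled right after nvidia-smi reported success, the setting did not stick, which matches the behaviour described here.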

I have a Tesla C2075, albeit running on CentOS 6.2, not Ubuntu.

I had no trouble setting the persistence mode, although you have to have root privilege. Did you try it as root?

After I did:

sudo nvidia-smi -i 0 -pm 1

This is what I saw:

$ nvidia-smi
Fri Sep  4 07:41:45 2015
+------------------------------------------------------+
| NVIDIA-SMI 346.46     Driver Version: 346.46         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla C2075         On   | 0000:03:00.0     Off |                    0 |
| 30%   56C    P1     0W / 225W |      9MiB /  5375MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  NVS 310             Off  | 0000:04:00.0     N/A |                  N/A |
| 30%   41C    P0    N/A /  N/A |      3MiB /   511MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla K40c          Off  | 0000:82:00.0     Off |                    0 |
| 23%   36C    P0    65W / 235W |     23MiB / 11519MiB |     99%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    1              C   Not Supported                                         |
+-----------------------------------------------------------------------------+

Yes, I did that as admin. Do you have X installed?

root@computer:~# nvidia-smi -i 0 -pm 1
Enabled persistence mode for GPU 0000:03:00.0.
All done.
root@computer:~# nvidia-smi
Sun Sep  6 16:05:31 2015       
+------------------------------------------------------+                       
| NVIDIA-SMI 346.59     Driver Version: 346.59         |                       
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla C2075         Off  | 0000:03:00.0     Off |                    0 |
| 30%   39C    P0     0W / 225W |      9MiB /  5375MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

I’m not running X on that particular node.

What is the result of the following command, run as root, on your machine:

dmesg |grep NVRM

?

That does not return a lot:

root@computer:~# dmesg | grep NVRM
[    9.373540] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  346.59  Tue Mar 31 14:10:31 PDT 2015
root@computer:~#

Did you remove the nouveau driver from your machine?

If not, try the following, as root:

echo -e "blacklist nouveau\noptions nouveau modeset=0"  > /etc/modprobe.d/disable-nouveau.conf
update-initramfs -u

Then reboot and repeat the persistence mode attempt.

If that doesn’t help, I’m stumped.

It doesn’t show up in lsmod, so I guess it is disabled. Installing the downloaded nvidia driver also requires disabling nouveau. The installer offers to create a blacklist file with the content you suggested.
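For anyone repeating the lsmod check, it can be made exact rather than eyeballed. The sketch below assumes the usual lsmod layout with the module name in the first column; `module_loaded` is a helper name made up for this example.

```shell
#!/bin/sh
# Read `lsmod` output on stdin; succeed if the named module appears
# in the first column (exact match, so "nouveau" does not match "nouveau2").
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}

# Only run the check where lsmod is available.
if command -v lsmod >/dev/null 2>&1; then
    if lsmod | module_loaded nouveau; then
        echo "nouveau is loaded"
    else
        echo "nouveau is not loaded"
    fi
fi
```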

Thanks for your advice! If I stumble upon a solution, I will post it here for future reference.

Found a solution. I think I was using the drivers from the Ubuntu repositories before. Switching to the driver from the NVIDIA website made it work. The version is slightly different (346.89 instead of 346.59); maybe that did it.

$ sudo apt-get remove nvidia-opencl-dev nvidia-opencl-icd-346 nvidia-346 nvidia-346-uvm nvidia-modprobe
$ sudo ./NVIDIA-Linux-x86_64-346.89.run
$ sudo nvidia-smi -i 0 -pm ENABLED
Enabled persistence mode for GPU 0000:03:00.0.
All done.
$ sudo nvidia-smi
Mon Sep  7 00:27:31 2015       
+------------------------------------------------------+                       
| NVIDIA-SMI 346.89     Driver Version: 346.89         |                       
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla C2075         On   | 0000:03:00.0     Off |                    0 |
| 30%   40C   P12    26W / 225W |     10MiB /  5375MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+