Hi, here are the results from repeating the kernel execution with and without gdm. The issue persists.
input: 600 MB, elapsed time: 72.2054 seconds, throughput = 8.30963 MB/s
> sudo systemctl stop gdm
input: 600 MB, elapsed time: 75.3617 seconds, throughput = 7.96161 MB/s
> sudo systemctl start gdm
input: 600 MB, elapsed time: 71.0216 seconds, throughput = 8.44813 MB/s
> sudo systemctl stop gdm
input: 600 MB, elapsed time: 75.928 seconds, throughput = 7.90222 MB/s
> sudo systemctl start gdm
input: 600 MB, elapsed time: 70.9346 seconds, throughput = 8.4585 MB/s
> sudo systemctl stop gdm
input: 600 MB, elapsed time: 76.5986 seconds, throughput = 7.83304 MB/s
> sudo systemctl start gdm
input: 600 MB, elapsed time: 71.1352 seconds, throughput = 8.43464 MB/s
> sudo systemctl stop gdm
input: 600 MB, elapsed time: 75.0668 seconds, throughput = 7.99288 MB/s
> sudo systemctl start gdm
input: 600 MB, elapsed time: 71.036 seconds, throughput = 8.44642 MB/s
> sudo systemctl stop gdm
input: 600 MB, elapsed time: 75.837 seconds, throughput = 7.91171 MB/s
Here are more details about the execution:
The program being tested is a GPU regex-matching program with a single kernel. It is memory-intensive and can saturate the GPU's SMs. The timing mechanism is as follows:
cudaEvent_t start, stop;
cudaEventCreate(&start);
cudaEventCreate(&stop);

cudaEventRecord(start, 0);          // record start event on the default stream
kernel<<<grid, block>>>(args);      // single regex-matching kernel launch
cudaEventRecord(stop, 0);           // record stop event after the kernel
cudaEventSynchronize(stop);         // wait for the stop event to complete

float milliseconds = 0;
cudaEventElapsedTime(&milliseconds, start, stop);  // elapsed GPU time in ms
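For reference, the throughput figures above follow directly from this measurement. A minimal sketch of the conversion, assuming a hypothetical variable inputMB (600 in the runs above) holds the input size and milliseconds comes from cudaEventElapsedTime:

// Sketch only: how the reported numbers are presumably derived,
// not the exact code from the program under test.
double seconds = milliseconds / 1000.0;            // cudaEventElapsedTime returns ms
double throughputMBps = inputMB / seconds;         // e.g. 600 MB / 72.2 s ≈ 8.31 MB/s
printf("input: %d MB, elapsed time: %g seconds, throughput = %g MB/s\n",
       inputMB, seconds, throughputMBps);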
The observed behavior seems counterintuitive: I initially expected that stopping processes that consume GPU resources (such as gdm) would improve performance, yet the kernel is consistently slower when gdm is stopped.