Hello, I am seeing a regression when moving from CUDA 7.5 to 8.0.
Once a process has called cuInit(), any child process forked afterwards gets CUDA_ERROR_NOT_INITIALIZED from its own cuInit(). This never happened with CUDA 7.5, but with CUDA 8.0 it fails every time.
Has anybody else seen a similar problem?
Below is the code to reproduce it:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>   /* wait() */
#include <cuda.h>

/* Print a message to stderr and terminate the calling process. */
#define elog(FORMAT,...) \
    do { \
        fprintf(stderr, FORMAT "\n", ##__VA_ARGS__); \
        exit(1); \
    } while(0)

/* Runs in the forked child: initialize the driver again, then get device 0. */
static int child_proc(void)
{
    CUdevice device;
    CUresult rc;

    rc = cuInit(0);
    if (rc != CUDA_SUCCESS)
        elog("pid=%d failed on cuInit: %ld", (int)getpid(), (long)rc);
    rc = cuDeviceGet(&device, 0);
    if (rc != CUDA_SUCCESS)
        elog("cuDeviceGet failed: %ld", (long)rc);
    return 0;
}

int main(int argc, char *argv[])
{
    CUresult rc;
    pid_t child;
    int status;

    /* general initialization process */
    rc = cuInit(0);
    if (rc != CUDA_SUCCESS)
        elog("parent: failed on cuInit: %ld", (long)rc);

    /* connection accept, then fork a backend process */
    child = fork();
    if (child == 0)
        return child_proc();
    else if (child > 0)
        wait(&status);
    else
        elog("failed on fork(2): %m");   /* %m is a glibc extension */
    return 0;
}
Execution example:
[kaigai@ayu ~]$ ./a.out
pid=10550 failed on cuInit: 3
This shows that cuInit() succeeds in the parent process but fails in the child process with error 3 (CUDA_ERROR_NOT_INITIALIZED).
Skipping cuInit() in the child is not an option either: if I comment out the cuInit() call on the child side, the subsequent cuDeviceGet() fails.
Forking worker processes after initialization is a very common pattern in server-type software, and it worked with the CUDA driver API on CUDA 7.5. What is the reason for this behavior?
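For comparison, here is a minimal sketch of the opposite arrangement, where the parent never touches the driver and each forked backend initializes it on its own. I have not confirmed that this avoids the CUDA 8.0 error; it only assumes that driver state set up before fork() is what the child trips over. driver_init() is a hypothetical stand-in for cuInit(0) so that the sketch builds without the CUDA toolkit, and backend_main() and spawn_backend() are illustrative names, not part of any API.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for cuInit(0); returns 0 on success.
 * Replace the body with the real driver call when building against CUDA. */
static int driver_init(void)
{
    return 0;
}

/* Work done by one backend process. The driver is initialized here,
 * after fork(), so no driver state is inherited from the parent. */
static int backend_main(void)
{
    if (driver_init() != 0)
        return 1;
    /* ... device work would go here ... */
    return 0;
}

/* Fork one backend and return its exit code, or -1 if fork/waitpid failed
 * or the child did not exit normally. The parent never calls driver_init(). */
static int spawn_backend(void)
{
    int status;
    pid_t child = fork();

    if (child < 0)
        return -1;
    if (child == 0)
        _exit(backend_main());          /* child: leave without running parent cleanup */
    if (waitpid(child, &status, 0) != child)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A server would call spawn_backend() (or a non-blocking variant of it) once per accepted connection. Of course this only helps if the parent does not need the GPU itself, which is not the case in my real application.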
Software versions:
CUDA installation: 8.0.44 (Linux; runfile)
NVIDIA driver: 367.55