dkms problem while installing CUDA 8 on RHEL 7.3 ppc64le

Hello,

I tried to install CUDA 8 on RHEL 7.3 ppc64le (POWER8), but encountered the following problem. Any tips?
I searched this forum and found a suggestion to install dkms, but dkms does not seem to be available on ppc64le.

[root@sys-84907 files]# yum install cuda.ppc64le
Loaded plugins: product-id, search-disabled-repos, subscription-manager
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
Resolving Dependencies
--> Running transaction check
---> Package cuda.ppc64le 0:8.0.54-1 will be installed
--> Processing Dependency: cuda-8-0 >= 8.0.54 for package: cuda-8.0.54-1.ppc64le
--> Running transaction check
---> Package cuda-8-0.ppc64le 0:8.0.54-1 will be installed
--> Processing Dependency: cuda-toolkit-8-0 >= 8.0.54 for package: cuda-8-0-8.0.54-1.ppc64le

--> Running transaction check
---> Package nvidia-kmod.ppc64le 1:361.107-2.el7 will be installed
--> Processing Dependency: dkms for package: 1:nvidia-kmod-361.107-2.el7.ppc64le
---> Package python-javapackages.noarch 0:3.4.1-11.el7 will be installed
--> Finished Dependency Resolution
Error: Package: 1:nvidia-kmod-361.107-2.el7.ppc64le (cuda-8-0-local-ga2)
Requires: dkms

You could try using --skip-broken to work around the problem
You could try running: rpm -Va --nofiles --nodigest

Any comment would be appreciated.

Partially solved. I found a dkms rpm for Fedora Rawhide ppc64le on rpmfind.net… I see a non-fatal dkms error while installing cuda, shown below. Still, it seems to work!

wget ftp://fr2.rpmfind.net/linux/fedora-secondary/development/rawhide/Everything/ppc64le/os/Packages/d/dkms-2.3-1.20161202gitde1dca9.fc26.noarch.rpm

rpm -Uvh dkms-2.3-1.20161202gitde1dca9.fc26.noarch.rpm

yum install cuda

This now works, but with the following non-fatal error:


Installing : 1:nvidia-kmod-361.107-2.el7.ppc64le 26/41

Creating symlink /var/lib/dkms/nvidia/361.107/source ->
/usr/src/nvidia-361.107

DKMS: add completed.
Error! echo
Your kernel headers for kernel 3.10.0-514.el7.ppc64le cannot be found at
/lib/modules/3.10.0-514.el7.ppc64le/build or /lib/modules/3.10.0-514.el7.ppc64le/source.
Error! echo
Your kernel headers for kernel 3.10.0-514.el7.ppc64le cannot be found at
/lib/modules/3.10.0-514.el7.ppc64le/build or /lib/modules/3.10.0-514.el7.ppc64le/source.
warning: %post(nvidia-kmod-1:361.107-2.el7.ppc64le) scriptlet failed, exit status 1
Non-fatal POSTIN scriptlet failure in rpm package 1:nvidia-kmod-361.107-2.el7.ppc64le
Installing : 1:xorg-x11-drv-nvidia-361.107-1.el7.ppc64le 27/41

Installed:
cuda.ppc64le 0:8.0.54-1

Dependency Installed:
cuda-8-0.ppc64le 0:8.0.54-1 cuda-command-line-tools-8-0.ppc64le 0:8.0.54-1
cuda-core-8-0.ppc64le 0:8.0.54-1 cuda-cublas-8-0.ppc64le 0:8.0.54-1
cuda-cublas-dev-8-0.ppc64le 0:8.0.54-1 cuda-cudart-8-0.ppc64le 0:8.0.54-1
cuda-cudart-dev-8-0.ppc64le 0:8.0.54-1 cuda-cufft-8-0.ppc64le 0:8.0.54-1
cuda-cufft-dev-8-0.ppc64le 0:8.0.54-1 cuda-curand-8-0.ppc64le 0:8.0.54-1
cuda-curand-dev-8-0.ppc64le 0:8.0.54-1 cuda-cusolver-8-0.ppc64le 0:8.0.54-1
cuda-cusolver-dev-8-0.ppc64le 0:8.0.54-1 cuda-cusparse-8-0.ppc64le 0:8.0.54-1
cuda-cusparse-dev-8-0.ppc64le 0:8.0.54-1 cuda-demo-suite-8-0.ppc64le 0:8.0.54-1
cuda-documentation-8-0.ppc64le 0:8.0.54-1 cuda-driver-dev-8-0.ppc64le 0:8.0.54-1
cuda-drivers.ppc64le 0:361.107-1 cuda-license-8-0.ppc64le 0:8.0.54-1
cuda-misc-headers-8-0.ppc64le 0:8.0.54-1 cuda-npp-8-0.ppc64le 0:8.0.54-1
cuda-npp-dev-8-0.ppc64le 0:8.0.54-1 cuda-nvgraph-8-0.ppc64le 0:8.0.54-1
cuda-nvgraph-dev-8-0.ppc64le 0:8.0.54-1 cuda-nvml-dev-8-0.ppc64le 0:8.0.54-1
cuda-nvrtc-8-0.ppc64le 0:8.0.54-1 cuda-nvrtc-dev-8-0.ppc64le 0:8.0.54-1
cuda-runtime-8-0.ppc64le 0:8.0.54-1 cuda-samples-8-0.ppc64le 0:8.0.54-1
cuda-toolkit-8-0.ppc64le 0:8.0.54-1 cuda-visual-tools-8-0.ppc64le 0:8.0.54-1
nvidia-kmod.ppc64le 1:361.107-2.el7 xorg-x11-drv-nvidia.ppc64le 1:361.107-1.el7
xorg-x11-drv-nvidia-devel.ppc64le 1:361.107-1.el7 xorg-x11-drv-nvidia-libs.ppc64le 1:361.107-1.el7

Complete!
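
In the log above, DKMS reports "add completed" and then fails on the missing kernel headers, so the nvidia module may be registered with DKMS without having been built. A minimal sketch for checking that, assuming `dkms status` prints one line per module that starts with the module name and ends in a state such as "added" or "installed" (the helper name `dkms_module_installed` is hypothetical, and the exact output format may differ between dkms versions):

```shell
# Hypothetical helper: reads `dkms status` output on stdin and succeeds
# only if some line for the given module reports the "installed" state.
# Assumed line shapes (may vary by dkms version):
#   nvidia, 361.107: added
#   nvidia, 361.107, 3.10.0-514.el7.ppc64le, ppc64le: installed
dkms_module_installed() {
    module="$1"
    grep -Eq "^${module}[,/].*: installed"
}
```

Usage would be something like `dkms status | dkms_module_installed nvidia && echo "nvidia module is built"`. If it prints nothing, the module is probably still only in the "added" state.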

Use the dkms from EPEL.

yum install epel-release
yum install dkms

Download the RPM from the NVIDIA website, then:

rpm -i cuda-repo-rhel7-8-0-local-ga2-8.0.54-1.ppc64le.rpm
yum clean expire-cache
yum install cuda

For an example walkthrough, see:

https://sthbrx.github.io/blog/2017/02/01/namd-on-nvlink/

Your current error seems to be caused by the kernel-devel package not being installed.
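
A rough sketch of how to confirm this, based on the two paths named in the error message. The function name `check_kernel_headers` and its optional second argument are hypothetical, and the suggested package name assumes the usual RHEL convention that kernel-devel carries the same version string as the kernel:

```shell
# Hypothetical helper: report whether the kernel build tree named in the
# DKMS error message is present.
#   $1  kernel release (defaults to the running kernel)
#   $2  modules root   (defaults to /lib/modules)
check_kernel_headers() {
    kver="${1:-$(uname -r)}"
    root="${2:-/lib/modules}"
    # -e follows symlinks, so a dangling "build" link counts as missing.
    if [ -e "$root/$kver/build" ] || [ -e "$root/$kver/source" ]; then
        echo "found: headers for $kver"
        return 0
    fi
    echo "missing: headers for $kver; try: yum install kernel-devel-$kver"
    return 1
}
```

Running `check_kernel_headers` with no arguments checks the running kernel. Once the headers are in place, something like `dkms install -m nvidia -v 361.107 -k "$(uname -r)"` should rebuild the module that the failed scriptlet skipped. The module name and version are taken from the log earlier in this thread.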
