Continuing the discussion from Standard nVidia CUDA tests fail with dual RTX 4090 Linux box:
We see the same issue on a machine with 2x RTX 4090, Driver Version: 525.85.12, CUDA Version: 12.0.
We first noticed it when running a distribution test with TensorFlow.
Both tests reported above fail in the same way here
(see NVIDIA/cuda-samples.git on GitHub; new users are only allowed one link):
- Samples/0_Introduction/simpleP2P - Test failed!
- Samples/0_Introduction/simpleIPC - Verification mismatch at 0: 1 != 0…
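For anyone who wants to re-check both samples after changing a driver or BIOS setting, here is a rough sketch of a small runner. It only looks for the two failure strings quoted above plus a non-zero exit code, so treat a "PASS" as "no known failure marker seen", not as proof that P2P works. The checkout path and the helper names (`classify_output`, `run_sample`) are my own assumptions, not part of cuda-samples.

```python
import subprocess
from pathlib import Path

# Failure markers copied from the two results reported above.
FAILURE_MARKERS = ("Test failed!", "Verification mismatch")


def classify_output(output: str, returncode: int = 0) -> str:
    """Return "FAIL" on a non-zero exit code or a known failure marker, else "PASS"."""
    if returncode != 0:
        return "FAIL"
    if any(marker in output for marker in FAILURE_MARKERS):
        return "FAIL"
    return "PASS"


def run_sample(binary: Path, timeout_s: int = 120) -> str:
    """Run one sample binary and classify its combined stdout and stderr."""
    result = subprocess.run(
        [str(binary)], capture_output=True, text=True, timeout=timeout_s
    )
    return classify_output(result.stdout + result.stderr, result.returncode)


if __name__ == "__main__":
    # Hypothetical location of an already built checkout; adjust as needed.
    samples_root = Path("cuda-samples/Samples/0_Introduction")
    for name in ("simpleP2P", "simpleIPC"):
        binary = samples_root / name / name
        if binary.exists():
            print(f"{name}: {run_sample(binary)}")
        else:
            print(f"{name}: binary not found at {binary}")
```

Running the script from the directory that contains the checkout prints one line per sample; if a binary is missing it says so instead of failing.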
# Machine:
# https://rog.asus.com/nl/motherboards/rog-zenith/rog-zenith-ii-extreme-alpha-model/
$ uname -a
Linux senor0lunlx0163 5.15.0-60-generic #66~20.04.1-Ubuntu SMP Wed Jan 25 09:41:30 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.04.5 LTS
Release: 20.04
Codename: focal
$ nvidia-smi
Fri Feb 10 16:12:49 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.85.12    Driver Version: 525.85.12    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  Off |
|  0%   51C    P8    19W / 450W |    232MiB / 24564MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce ...  Off  | 00000000:23:00.0 Off |                  Off |
|  0%   51C    P8    25W / 450W |     10MiB / 24564MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      2060      G   /usr/lib/xorg/Xorg                 35MiB |
|    0   N/A  N/A     12762      G   /usr/lib/xorg/Xorg                 98MiB |
|    0   N/A  N/A     12979      G   /usr/bin/gnome-shell               63MiB |
|    0   N/A  N/A     14647      G   ...106669053894826474,131072       16MiB |
|    1   N/A  N/A      2060      G   /usr/lib/xorg/Xorg                  4MiB |
|    1   N/A  N/A     12762      G   /usr/lib/xorg/Xorg                  4MiB |
+-----------------------------------------------------------------------------+
$ cat motherboard.info
# dmidecode 3.2
Getting SMBIOS data from sysfs.
SMBIOS 3.2.0 present.

Handle 0x0002, DMI type 2, 15 bytes
Base Board Information
	Manufacturer: ASUSTeK COMPUTER INC.
	Product Name: ROG ZENITH II EXTREME ALPHA
	Version: Rev 1.xx
	Serial Number: 210788043200148
	Asset Tag: Default string
	Features:
		Board is a hosting board
		Board is removable
		Board is replaceable
	Location In Chassis: Default string
	Chassis Handle: 0x0003
	Type: Motherboard
	Contained Object Handles: 0

Handle 0x003C, DMI type 10, 6 bytes
On Board Device Information
	Type: Video
	Status: Enabled
	Description: To Be Filled By O.E.M.

Handle 0x0042, DMI type 41, 11 bytes
Onboard Device
	Reference Designation: Onboard IGD
	Type: Video
	Status: Enabled
	Type Instance: 1
	Bus Address: 0000:00:02.0

Handle 0x0043, DMI type 41, 11 bytes
Onboard Device
	Reference Designation: Onboard LAN
	Type: Ethernet
	Status: Enabled
	Type Instance: 1
	Bus Address: 0000:00:19.0

Handle 0x0044, DMI type 41, 11 bytes
Onboard Device
	Reference Designation: Onboard 1394
	Type: Other
	Status: Enabled
	Type Instance: 1
	Bus Address: 0000:03:1c.2