The log from nvidia-bug-report.sh and the installation log are attached to support reference 220710-000470.
I can post them here too if you like, but they are quite long and would probably just clutter the post.
My intention is to use the CUDNN backend for OpenCV. OpenCV compiles without any problems and works up to the point where I try to use the CUDA backend.
OpenCV throws the following error, which in my opinion is a symptom of a lower-level issue (also visible from the issues stated above):
terminate called after throwing an instance of 'cv::dnn::cuda4dnn::csl::CUDAException'
what(): OpenCV(4.6.0-dev) /home/lrmts/Downloads/OpenCV/opencv-4.x/modules/dnn/src/cuda4dnn/csl/memory.hpp:54: error: (-217:Gpu API call) unknown error in function 'ManagedPtr'
Any input on where the problem might be is highly appreciated.
nvidia-uvm isn’t loaded. Please either put it in the list of modules to load on boot, install nvidia-modprobe so normal users can load it, or run deviceQuery once as root to load it.
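As a rough sketch of the first option: the helper below checks whether the module shows up in /proc/modules (the kernel lists it as nvidia_uvm, with an underscore), and the commented lines show one way it might be added to the boot-time module list. The function name module_loaded and the file name nvidia-uvm.conf are made up for illustration, and the boot-time part assumes the system reads /etc/modules-load.d/ via systemd, as recent Ubuntu releases do.

```shell
# Sketch: report whether a kernel module is currently loaded.
# module_loaded is a hypothetical helper, not part of any NVIDIA tool.
#   $1: module name (dashes or underscores)
#   $2: module list file (optional, defaults to /proc/modules)
module_loaded() {
    # The kernel reports nvidia-uvm as nvidia_uvm, so normalise dashes first.
    ml_name=$(printf '%s' "$1" | tr '-' '_')
    ml_list=${2:-/proc/modules}
    # The first column of /proc/modules is the module name.
    awk -v m="$ml_name" '$1 == m { found = 1 } END { exit !found }' "$ml_list"
}

# Usage (not run automatically):
#   module_loaded nvidia-uvm && echo "loaded" || echo "not loaded"
#
# To request loading on every boot (assumes systemd's modules-load.d mechanism):
#   echo nvidia-uvm | sudo tee /etc/modules-load.d/nvidia-uvm.conf
```

Note that the boot-time entry only asks the kernel to load the module; if loading is rejected for another reason, it will still fail at boot.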
$ sudo modinfo nvidia-uvm
[sudo] password for lrmts:
filename: /lib/modules/5.15.0-41-generic/updates/dkms/nvidia-uvm.ko
supported: external
license: Dual MIT/GPL
srcversion: 47ABA39EF6732B7F0C672A2
depends: nvidia
retpoline: Y
name: nvidia_uvm
vermagic: 5.15.0-41-generic SMP mod_unload modversions
sig_id: PKCS#7
signer: ubuntu Secure Boot Module Signature key
sig_key: 7B:90:F6:84:8E:3F:B4:11:FA:44:80:25:D8:10:52:9C:D3:46:4A:1A
sig_hashalgo: sha512
signature: 19:03:54:BD:61:2A:66:5A:DD:05:0B:07:83:F8:E4:9D:A0:78:F3:C6:
6E:AE:B3:23:8C:37:BA:3A:AE:D0:02:C1:A7:40:53:B4:F3:F7:A1:50:
E4:6B:A0:FC:EE:21:80:65:82:90:6B:B9:DE:08:0F:F0:57:B4:E1:A2:
B8:A7:CE:83:E9:57:DF:F8:5E:CB:D9:B8:7D:18:2F:45:99:FF:B3:F2:
40:E4:80:F5:F9:55:E6:A6:44:44:13:1F:CC:27:E3:3C:8E:A3:3A:11:
76:39:FC:4F:CB:F8:BC:EC:12:61:3F:5F:9A:F8:29:B5:62:E4:91:C6:
9E:8A:58:30:C4:D5:AE:FE:E5:71:3C:7F:3B:8C:A1:9D:A5:6C:1E:D6:
AA:35:08:10:B7:4F:D1:3F:E6:0A:DC:B9:27:F9:23:86:5C:93:FD:45:
C8:6E:6D:5C:8E:8D:67:61:BA:FA:F9:93:6D:2D:EA:DD:DA:15:B6:0C:
2C:75:28:F3:57:94:87:32:B0:43:D0:9A:0B:71:63:6C:94:62:38:D6:
7B:0B:88:69:9B:DE:79:41:1C:EC:B8:B1:27:52:2B:AB:7B:41:7D:FF:
EA:EF:34:68:22:32:CF:49:CF:F8:70:11:70:FE:2B:58:26:AA:49:21:
F7:08:21:A5:37:DE:7B:D8:D2:31:0A:9E:7B:4C:3E:EE
parm: uvm_ats_mode:Set to 0 to disable ATS (Address Translation Services). Any other value is ignored. Has no effect unless the platform supports ATS. (int)
parm: uvm_perf_prefetch_enable:uint
parm: uvm_perf_prefetch_threshold:uint
parm: uvm_perf_prefetch_min_faults:uint
parm: uvm_perf_thrashing_enable:uint
parm: uvm_perf_thrashing_threshold:uint
parm: uvm_perf_thrashing_pin_threshold:uint
parm: uvm_perf_thrashing_lapse_usec:uint
parm: uvm_perf_thrashing_nap:uint
parm: uvm_perf_thrashing_epoch:uint
parm: uvm_perf_thrashing_pin:uint
parm: uvm_perf_thrashing_max_resets:uint
parm: uvm_perf_map_remote_on_native_atomics_fault:uint
parm: uvm_disable_hmm:Force-disable HMM functionality in the UVM driver. Default: false (i.e, HMM is potentially enabled). Ignored if HMM is not supported in the driver, or if ATS settings conflict with HMM. (bool)
parm: uvm_perf_migrate_cpu_preunmap_enable:int
parm: uvm_perf_migrate_cpu_preunmap_block_order:uint
parm: uvm_global_oversubscription:Enable (1) or disable (0) global oversubscription support. (int)
parm: uvm_perf_pma_batch_nonpinned_order:uint
parm: uvm_cpu_chunk_allocation_sizes:OR'ed value of all CPU chunk allocation sizes. (uint)
parm: uvm_leak_checker:Enable uvm memory leak checking. 0 = disabled, 1 = count total bytes allocated and freed, 2 = per-allocation origin tracking. (int)
parm: uvm_force_prefetch_fault_support:uint
parm: uvm_debug_enable_push_desc:Enable push description tracking (uint)
parm: uvm_debug_enable_push_acquire_info:Enable push acquire information tracking (uint)
parm: uvm_page_table_location:Set the location for UVM-allocated page tables. Choices are: vid, sys. (charp)
parm: uvm_perf_access_counter_mimc_migration_enable:Whether MIMC access counters will trigger migrations.Valid values: <= -1 (default policy), 0 (off), >= 1 (on) (int)
parm: uvm_perf_access_counter_momc_migration_enable:Whether MOMC access counters will trigger migrations.Valid values: <= -1 (default policy), 0 (off), >= 1 (on) (int)
parm: uvm_perf_access_counter_batch_count:uint
parm: uvm_perf_access_counter_granularity:Size of the physical memory region tracked by each counter. Valid values asof Volta: 64k, 2m, 16m, 16g (charp)
parm: uvm_perf_access_counter_threshold:Number of remote accesses on a region required to trigger a notification.Valid values: [1, 65535] (uint)
parm: uvm_perf_reenable_prefetch_faults_lapse_msec:uint
parm: uvm_perf_fault_batch_count:uint
parm: uvm_perf_fault_replay_policy:uint
parm: uvm_perf_fault_replay_update_put_ratio:uint
parm: uvm_perf_fault_max_batches_per_service:uint
parm: uvm_perf_fault_max_throttle_per_service:uint
parm: uvm_perf_fault_coalesce:uint
parm: uvm_fault_force_sysmem:Force (1) using sysmem storage for pages that faulted. Default: 0. (int)
parm: uvm_perf_map_remote_on_eviction:int
parm: uvm_exp_gpu_cache_peermem:Force caching for mappings to peer memory. This is an experimental parameter that may cause correctness issues if used. (uint)
parm: uvm_exp_gpu_cache_sysmem:Force caching for mappings to system memory. This is an experimental parameter that may cause correctness issues if used. (uint)
parm: uvm_channel_num_gpfifo_entries:uint
parm: uvm_channel_gpfifo_loc:charp
parm: uvm_channel_gpput_loc:charp
parm: uvm_channel_pushbuffer_loc:charp
parm: uvm_enable_va_space_mm:Set to 0 to disable UVM from using mmu_notifiers to create an association between a UVM VA space and a process. This will also disable pageable memory access via either ATS or HMM. (int)
parm: uvm_enable_debug_procfs:Enable debug procfs entries in /proc/driver/nvidia-uvm (int)
parm: uvm_peer_copy:Choose the addressing mode for peer copying, options: phys [default] or virt. Valid for Ampere+ GPUs. (charp)
parm: uvm_debug_prints:Enable uvm debug prints. (int)
parm: uvm_enable_builtin_tests:Enable the UVM built-in tests. (This is a security risk) (int)
and
$ sudo modprobe nvidia-uvm
modprobe: ERROR: could not insert 'nvidia_uvm': Operation not permitted
This looks like a permission issue to me. I could understand it if I were running this as a non-root user, but with sudo I don’t really understand it…
I’m also a bit puzzled: if the signing key were invalid, the other nvidia modules shouldn’t load either. Please check modinfo nvidia and compare the key fingerprints to make sure the same key was used.
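One possible way to do that comparison without eyeballing the hex, as a minimal sketch: the helper below pulls the sig_key line out of modinfo output fed to it on stdin. The name sig_key_of is made up for illustration, and the sketch assumes modinfo prints a single "sig_key:" line per module, as in the output posted above.

```shell
# Sketch: extract the sig_key fingerprint from modinfo output read on stdin.
# sig_key_of is a hypothetical helper name, not an existing tool.
sig_key_of() {
    # Print the value following the "sig_key:" label and stop at the first match.
    awk '$1 == "sig_key:" { print $2; exit }'
}

# Usage (not run automatically; needs the modules to be installed):
#   a=$(modinfo nvidia | sig_key_of)
#   b=$(modinfo nvidia-uvm | sig_key_of)
#   if [ -n "$a" ] && [ "$a" = "$b" ]; then
#       echo "same signing key"
#   else
#       echo "keys differ or are missing"
#   fi
```

Matching fingerprints would only show that both modules were signed with the same key; it does not by itself prove that the signature on nvidia-uvm verifies.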
The modules are auto-signed by dkms with the key created when Ubuntu was initially installed, so there is nothing you could have done wrong. It may be better to report this to the Ubuntu bug tracker; I can’t really think of a reason for the uvm module being invalid. I’d expect that if modinfo displays the key and the keys are the same, it should work.