Rivermax SDK example code fails to run (cuMemSetAccess failed)

Hi experts,

I am running the Rivermax SDK example code, but it fails with "cuMemSetAccess failed".

CODE: media_sender.exe(Release-CUDA)
GPU: NVIDIA RTX A4000
CUDA: 12.1
Driver: 531.14
Windows: 10 Pro 21H2

RUN CMD: .\media_sender.exe -c 1 -a 2 -s .\sdps_samples\sdp_2110-20_narrow_gap_1080p50fps.txt -g 0 -r --max_gpu_freq

Logs as follows:

#############################################

Rivermax SDK version: 1.30.16

Media sender version: 1.30.16

#############################################
Set env variable CUDA_DEVICE_ORDER=PCI_BUS_ID
gpu_device_id = 0
Writing log to default location: C:\Users\Administrator\AppData\Local\Temp\rivermax_0605_095514_5904.log
Created log file: C:\Users\Administrator\AppData\Local\Temp\rivermax_0605_095514_5904.log
[23-06-05 09:55:14.856135] Tid: 003972 info [InitLogger:92] Logger started
[23-06-05 09:55:14.856164] Tid: 003972 info [rmax_init:610] starting Rivermax: SDK version 1.30.16
[23-06-05 09:55:14.857025] Tid: 003972 debug [Clock:31]
[23-06-05 09:55:14.857044] Tid: 003972 debug [SysClock:41]
[23-06-05 09:55:14.857062] Tid: 003972 debug [rivermax_get_user_env:151] parsed env RIVERMAX_DISABLE_VIDEO_GROUPING to the value true
[23-06-05 09:55:14.857085] Tid: 003972 debug [rivermax_get_user_env:151] parsed env RIVERMAX_VIDEO_PACE_INTERVAL to the value 1000000
[23-06-05 09:55:14.857287] Tid: 003972 debug [rivermax_get_user_env:151] parsed env RIVERMAX_OUT_STREAM_SIZE_IN_PKTS to the value 32768
[23-06-05 09:55:14.857309] Tid: 003972 debug [rivermax_get_user_env:151] parsed env RIVERMAX_HEADER_STRIDE_SIZE to the value 64
[23-06-05 09:55:14.857329] Tid: 003972 debug [rivermax_get_user_env:151] parsed env RIVERMAX_DISABLE_FLOW_ID to the value false
[23-06-05 09:55:14.857348] Tid: 003972 debug [rivermax_get_user_env:151] parsed env RIVERMAX_SDP_PARSER_ENABLE_LOGGING to the value true
[23-06-05 09:55:14.857368] Tid: 003972 debug [rivermax_get_user_env:151] parsed env RIVERMAX_ENABLE_PTP_HW_RT_CLOCK to the value false
[23-06-05 09:55:14.857503] Tid: 003972 debug [rivermax_get_user_env:151] parsed env RIVERMAX_ENABLE_CUDA to the value true
[23-06-05 09:55:14.857526] Tid: 003972 debug [rivermax_get_user_env:151] parsed env RIVERMAX_ENABLE_STATISTICS to the value false
[23-06-05 09:55:14.857547] Tid: 003972 debug [rivermax_get_user_env:151] parsed env RIVERMAX_ENABLE_API_VERIFICATION to the value false
[23-06-05 09:55:14.857567] Tid: 003972 debug [rivermax_get_user_env:151] parsed env RIVERMAX_DISABLE_AUDIO_BUFFERING to the value false
[23-06-05 09:55:14.857785] Tid: 003972 debug [rivermax_get_user_env:151] parsed env RIVERMAX_SESSION_MAP_SIZE to the value 2000
[23-06-05 09:55:14.857819] Tid: 003972 debug [rivermax_get_user_env:151] parsed env RIVERMAX_SESSION_MAP_SIZE to the value 2000
[23-06-05 09:55:14.858010] Tid: 003972 debug [EventHandlerManager:125]
[23-06-05 09:55:14.858224] Tid: 003972 info [EventHandlerManager:132] will wakeup before frame begin event in 2000000 ns
[23-06-05 09:55:14.858244] Tid: 003972 debug [EventHandlerManagerHigh:259]
[23-06-05 09:55:14.858262] Tid: 003972 debug [start_thread:341] Starting internal thread
[23-06-05 09:55:14.858301] Tid: 003972 debug [rivermax_set_thread_affinity:719] successfully set thread affinity using cpu mask: 0x2, previous mask: 0xff
[23-06-05 09:55:14.858329] Tid: 003972 debug [start_thread:344] Started event handler thread
[23-06-05 09:55:14.858355] Tid: 003972 debug [init_globals:249] Time now is 1685930114858355500
[23-06-05 09:55:14.858947] Tid: 001968 info [print_thread_info:117] High priority internal thread: PID = 5904, thread ID = 1968
[23-06-05 09:55:14.859747] Tid: 003972 debug [load_provider:69] dpcp[0] = 6008005000000 ‘Mellanox ConnectX-6 Dx Adapter’
[23-06-05 09:55:14.859959] Tid: 003972 debug [load_provider:69] dpcp[1] = 6008006000000 ‘Mellanox ConnectX-6 Dx Adapter #2
[23-06-05 09:55:14.860177] Tid: 003972 info [init:37] DPCP/DevX provider was loaded
[23-06-05 09:55:14.919350] Tid: 003972 debug [getAdapterInfo:473] Adapter 以太网 2 vlanId 0 len 6 MAC 04:3f:72:a4:99:94
[23-06-05 09:55:14.925612] Tid: 003972 debug [getAdapterInfo:511] LUID 6008005000000 0x15b3/0x101d dpcp_adapter 0x1d173dc5cf0 opened true ret 0
[23-06-05 09:55:14.925961] Tid: 003972 debug [getAdapterInfo:528] IP: 192.168.5.114 VLAN_ID: 0 Serial number: MT2035X03236
[23-06-05 09:55:14.926026] Tid: 003972 debug [getAdapterInfo:530] MTU: 1500 TXlinkSpeed: 100 Gbps RXLinkSpeed:100 Gbps
[23-06-05 09:55:14.926326] Tid: 003972 info [getAdapterInfo:535] Device with IP addr: 192.168.5.114 was added to Device Collection [1]
[23-06-05 09:55:14.926563] Tid: 003972 warning [getAdapterInfo:430] Adapter 以太网 3 luidIdx 0x8006 is not Up
[23-06-05 09:55:14.929496] Tid: 003972 debug [getAdapterInfo:473] Adapter 以太网 3 vlanId 0 len 6 MAC 04:3f:72:a4:99:95
[23-06-05 09:55:14.933520] Tid: 003972 debug [getAdapterInfo:511] LUID 6008006000000 0x15b3/0x101d dpcp_adapter 0x1d173e121e0 opened true ret 0
[23-06-05 09:55:14.934033] Tid: 003972 debug [getAdapterInfo:528] IP: 169.254.42.137 VLAN_ID: 0 Serial number: MT2035X03236
[23-06-05 09:55:14.934059] Tid: 003972 debug [getAdapterInfo:530] MTU: 1500 TXlinkSpeed: 18446744073 Gbps RXLinkSpeed:18446744073 Gbps
[23-06-05 09:55:14.934267] Tid: 003972 info [getAdapterInfo:535] Device with IP addr: 169.254.42.137 was added to Device Collection [2]
[23-06-05 09:55:14.936009] Tid: 003972 debug [getAdapterInfo:473] Adapter 以太网 vlanId 0 len 6 MAC 04:d4:c4:06:38:25
[23-06-05 09:55:14.936036] Tid: 003972 debug [getAdapterInfo:515] DPCP device with LUID 6008001000000 not found!
[23-06-05 09:55:14.936236] Tid: 003972 info [~winDevice:97] ~winDevice DTOR
[23-06-05 09:55:14.938938] Tid: 003972 debug [getAdapterInfo:473] Adapter Loopback Pseudo-Interface 1 vlanId 0 len 0 MAC 00:00:00:00:00:00
[23-06-05 09:55:14.943206] Tid: 003972 debug [GetPhysicalAdapterByMAC:358] No physical device found, bypassing
[23-06-05 09:55:14.943267] Tid: 003972 debug [getAdapterInfo:486] Physical adapter GUID wasn’t found, bypassing
[23-06-05 09:55:14.943536] Tid: 003972 info [~winDevice:97] ~winDevice DTOR
[23-06-05 09:55:14.969048] Tid: 003972 info [license_validate_v4:446] Licensed to: Beijing Enlightv Co., Ltd (N/A), evaluation period expires in 25 days
[23-06-05 09:55:14.969122] Tid: 003972 info [info_product:466] Rivermax license version: 4.1
[23-06-05 09:55:14.969834] Tid: 003972 info [license_validate:516] Rivermax license id 827d7712-80a4-1938-6474-902c070f7f24, revision 1
[23-06-05 09:55:14.970095] Tid: 003972 info [rmax_init:638] Statistics disabled
[23-06-05 09:55:14.970294] Tid: 003972 info [cuda_enable_etbl:362] Starting Cuda init
[23-06-05 09:55:14.970509] Tid: 003972 info [cuda_enable_etbl:396] Cuda init Done
List of supported devices:
Device with interface name: 以太网 3, IP addresses: [ 169.254.42.137 ], MAC address: 04:3f:72:a4:99:95, device_id: 4125, serial number: MT2035X03236
Device with interface name: 以太网 2, IP addresses: [ 192.168.5.114 ], MAC address: 04:3f:72:a4:99:94, device_id: 4125, serial number: MT2035X03236
[23-06-05 09:55:14.972290] Tid: 003972 debug [Clock:31]
[23-06-05 09:55:14.972649] Tid: 003972 debug [ExternalClock:66]
[23-06-05 09:55:14.972669] Tid: 003972 debug [~SysClock:46]
[23-06-05 09:55:14.972683] Tid: 003972 debug [~Clock:36]
[23-06-05 09:55:14.972871] Tid: 003972 debug [rmx_use_user_clock_v1:324] Using user time handler
TX Thread: 0 Mask: 0x4
CUDA memory allocation on GPU - cuMemCreate
RDMA is supported and enabled, status
cuMemSetAccess failed status = 999
CUDA cudaFreeMmap 304600000
CUDA memory free finished with status 1
Failed to allocate GPU memory.
Failed to allocate memory on GPU id 0
Failed to allocate memory with size: 62914560
thread 0 initialization failed
[23-06-05 09:55:15.060593] Tid: 003972 info [rmax_cleanup:689] Cleanup called
[23-06-05 09:55:15.060819] Tid: 003972 debug [~EventHandlerManagerHigh:265]
[23-06-05 09:55:15.061004] Tid: 003972 debug [~EventHandlerManager:139]
[23-06-05 09:55:15.061203] Tid: 003972 debug [free_evh_resources:148]
[23-06-05 09:55:15.061479] Tid: 003972 debug [stop_thread:163] event handler thread stopped
[23-06-05 09:55:15.061606] Tid: 003972 debug [free_evh_resources:152] Thread stopped
[23-06-05 09:55:15.061908] Tid: 003972 debug [~ExternalClock:71]
[23-06-05 09:55:15.062005] Tid: 003972 debug [~Clock:36]
[23-06-05 09:55:15.062205] Tid: 003972 debug [~DeviceCollection:28] ~DeviceCollection()
[23-06-05 09:55:15.062406] Tid: 003972 info [~winDevice:97] ~winDevice DTOR
[23-06-05 09:55:15.062918] Tid: 003972 info [~winDevice:97] ~winDevice DTOR
[23-06-05 09:55:15.063158] Tid: 003972 info [~RiverLogger:105] logger closing
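
For readers decoding the log above, here is a minimal Python sketch. It assumes the status printed next to cuMemSetAccess is the CUresult returned by the CUDA driver API call; the helper names are hypothetical and not part of the Rivermax SDK. The table lists only a few CUresult values from cuda.h.

```python
# A few CUresult values from the CUDA driver API header (cuda.h).
# Assumption: the "status = 999" in the log is a CUresult.
CU_RESULT_NAMES = {
    0: "CUDA_SUCCESS",
    1: "CUDA_ERROR_INVALID_VALUE",
    2: "CUDA_ERROR_OUT_OF_MEMORY",
    999: "CUDA_ERROR_UNKNOWN",
}


def describe_cu_result(code: int) -> str:
    """Map a numeric CUresult to its symbolic name (hypothetical helper)."""
    return CU_RESULT_NAMES.get(code, f"unlisted CUresult {code}")


def bytes_to_mib(size_in_bytes: int) -> float:
    """Convert a byte count to MiB."""
    return size_in_bytes / (1024 * 1024)


# Values taken from the log: the failing status and the requested size.
print(describe_cu_result(999))
print(bytes_to_mib(62914560))
```

CUDA_ERROR_UNKNOWN says little on its own, so the status code alone does not identify the root cause; it only shows that the failure happened while setting access on the newly created allocation.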

Dear @wanglx

One possible cause is a BAR size that is too small. You can try the following steps:

  • Download the DisplayModeSelector application from one of the following links:

https://developer.nvidia.com/nvidia-display-mode-selector-tool-home
https://apps.nvidia.com/pid/contentlibraries/detail?id=1066046

  • Increase the BAR size configured in the VBIOS

Read the “NVIDIA Display Mode Selector Tool User Guide” for the exact commands to increase the BAR size to 8 GB: https://developer.nvidia.com/sites/default/files/akamai/NVIDIA_Display_Mode_Selector_Tool_User_Guide.pdf

.\displaymodeselector.exe --gpumode
A warning message appears. Press the Y key to continue.
Select mode 2 (physical_display_enabled_8GB_bar1)

  • Reboot

If it doesn’t work, I advise you to open a new case in Enterprise Support for detailed inspection.

Good luck,
Vladislav

Dear vkhomyakov

Thanks for your reply!

I got the following output from displaymodeselector.exe. Does it mean the RTX A4000 is not supported?


Are you sure you want to continue?
Press ‘y’ to confirm (any other key to abort):
y
Select a number:
<0> physical_display_enabled_256MB_bar1
<1> physical_display_disabled
<2> physical_display_enabled_8GB_bar1

Select a number (ESC to quit):
2
Specifed GPU Mode “physical_display_enabled_8GB_bar1”

Update GPU Mode of all adapters to “physical_display_enabled_8GB_bar1”?
Press ‘y’ to confirm or ‘n’ to choose adapters or any other key to abort:
y

Updating GPU Mode of all eligible adapters to “physical_display_enabled_8GB_bar1”

NVIDIA RTX A4000 (10DE,24B0,10DE,14AD) S:00,B:01,D:00,F:00

Specified GPU mode not supported on this device 0x24B0.

Dear @wanglx

Yes, I’ve double-checked the Display Mode Selector Tool documentation and can confirm the following:
The Display Mode Selector tool is intended only for the NVIDIA L40, NVIDIA RTX 6000 Ada, NVIDIA A40, NVIDIA RTX A5000, NVIDIA RTX A5500, and NVIDIA RTX A6000. It should not be used with any other GPU.

However, GPUDirect on Windows is supported on the RTX A4000.
Could you please paste the output of the nvidia-smi command here?

Would you mind opening a case in Enterprise Support for further debugging with Rivermax Application Engineering?

Regards,
Vladislav
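
As a sketch of how the BAR1 size could be read from nvidia-smi: the Python below parses the "BAR1 Memory Usage" section printed by `nvidia-smi -q -d MEMORY`. The function names are hypothetical, the section layout is assumed from typical nvidia-smi versions, and the sample text is illustrative only, not output from the system in this thread.

```python
import re
import subprocess


def parse_bar1_total_mib(smi_text):
    """Return the BAR1 'Total' value in MiB from nvidia-smi text, or None if absent."""
    match = re.search(r"BAR1 Memory Usage\s+Total\s*:\s*(\d+)\s*MiB", smi_text)
    return int(match.group(1)) if match else None


def query_bar1_total_mib():
    """Run `nvidia-smi -q -d MEMORY` and parse the first BAR1 total it reports."""
    result = subprocess.run(
        ["nvidia-smi", "-q", "-d", "MEMORY"],
        capture_output=True, text=True, check=True,
    )
    return parse_bar1_total_mib(result.stdout)


# Illustrative sample (assumed layout, made-up values), used so the parser
# can be tried on a machine without an NVIDIA GPU.
SAMPLE = """
    FB Memory Usage
        Total                             : 16376 MiB
    BAR1 Memory Usage
        Total                             : 256 MiB
        Used                              : 2 MiB
        Free                              : 254 MiB
"""

print(parse_bar1_total_mib(SAMPLE))
# On a machine with the NVIDIA driver installed: print(query_bar1_total_mib())
```

Running the query before and after a BAR-related change (VBIOS mode or system BIOS setting) is one way to see whether the BAR1 total actually changed.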

Dear vkhomyakov

The issue has been resolved.

I changed the BIOS settings to enable Resizable BAR, and it now works fine for me.

Thanks
