Rivermax SDK example code fails with "CQE error"

Hi experts,

I am running the Rivermax SDK example code, but it fails with a “CQE error”.

CODE: media_sender.exe(Release-CUDA)
GPU: NVIDIA RTX A4000
CUDA: 12.1
Driver: 531.14
Windows: 10 Pro 21H2

RUN CMD: .\media_sender.exe -c 1 -a 2 -s .\sdps_samples\sdp_2110-20_narrow_gap_1080p50fps.txt -g 0 -r --max_gpu_freq

Logs as follows:

PS D:\wlx\Rivermax\tests> .\media_sender.exe -c 1 -a 2 -s .\sdps_samples\sdp_2110-20_narrow_gap_1080p50fps.txt -g 0 -r --max_gpu_freq
#############################################

Rivermax SDK version: 1.30.16
Media sender version: 1.30.16
#############################################
Set env variable CUDA_DEVICE_ORDER=PCI_BUS_ID
gpu_device_id = 0
Writing log to default location: C:\Users\enlightv\AppData\Local\Temp\rivermax_0606_181623_3736.log
Created log file: C:\Users\enlightv\AppData\Local\Temp\rivermax_0606_181623_3736.log
[23-06-06 18:16:23.411369] Tid: 001016 info [InitLogger:92] Logger started
[23-06-06 18:16:23.411436] Tid: 001016 info [rmax_init:610] starting Rivermax: SDK version 1.30.16
[23-06-06 18:16:23.412164] Tid: 001016 debug [Clock:31]
[23-06-06 18:16:23.412218] Tid: 001016 debug [SysClock:41]
[23-06-06 18:16:23.412272] Tid: 001016 debug [rivermax_get_user_env:151] parsed env RIVERMAX_DISABLE_VIDEO_GROUPING to the value true
[23-06-06 18:16:23.412327] Tid: 001016 debug [rivermax_get_user_env:151] parsed env RIVERMAX_VIDEO_PACE_INTERVAL to the value 1000000
[23-06-06 18:16:23.412402] Tid: 001016 debug [rivermax_get_user_env:151] parsed env RIVERMAX_OUT_STREAM_SIZE_IN_PKTS to the value 32768
[23-06-06 18:16:23.412478] Tid: 001016 debug [rivermax_get_user_env:151] parsed env RIVERMAX_HEADER_STRIDE_SIZE to the value 64
[23-06-06 18:16:23.412542] Tid: 001016 debug [rivermax_get_user_env:151] parsed env RIVERMAX_DISABLE_FLOW_ID to the value false
[23-06-06 18:16:23.412615] Tid: 001016 debug [rivermax_get_user_env:151] parsed env RIVERMAX_SDP_PARSER_ENABLE_LOGGING to the value true
[23-06-06 18:16:23.412672] Tid: 001016 debug [rivermax_get_user_env:151] parsed env RIVERMAX_ENABLE_PTP_HW_RT_CLOCK to the value false
[23-06-06 18:16:23.412740] Tid: 001016 debug [rivermax_get_user_env:151] parsed env RIVERMAX_ENABLE_CUDA to the value true
[23-06-06 18:16:23.412796] Tid: 001016 debug [rivermax_get_user_env:151] parsed env RIVERMAX_ENABLE_STATISTICS to the value false
[23-06-06 18:16:23.412854] Tid: 001016 debug [rivermax_get_user_env:151] parsed env RIVERMAX_ENABLE_API_VERIFICATION to the value false
[23-06-06 18:16:23.412909] Tid: 001016 debug [rivermax_get_user_env:151] parsed env RIVERMAX_DISABLE_AUDIO_BUFFERING to the value false
[23-06-06 18:16:23.412968] Tid: 001016 debug [rivermax_get_user_env:151] parsed env RIVERMAX_SESSION_MAP_SIZE to the value 2000
[23-06-06 18:16:23.413027] Tid: 001016 debug [rivermax_get_user_env:151] parsed env RIVERMAX_SESSION_MAP_SIZE to the value 2000
[23-06-06 18:16:23.413266] Tid: 001016 debug [EventHandlerManager:125]
[23-06-06 18:16:23.413293] Tid: 001016 info [EventHandlerManager:132] will wakeup before frame begin event in 2000000 ns
[23-06-06 18:16:23.413332] Tid: 001016 debug [EventHandlerManagerHigh:259]
[23-06-06 18:16:23.413381] Tid: 001016 debug [start_thread:341] Starting internal thread
[23-06-06 18:16:23.413462] Tid: 001016 debug [rivermax_set_thread_affinity:719] successfully set thread affinity using cpu mask: 0x2, previous mask: 0xff
[23-06-06 18:16:23.413499] Tid: 001016 debug [start_thread:344] Started event handler thread
[23-06-06 18:16:23.413564] Tid: 001016 debug [init_globals:249] Time now is 1686046583413563900
[23-06-06 18:16:23.413732] Tid: 006880 info [print_thread_info:117] High priority internal thread: PID = 3736, thread ID = 6880
[23-06-06 18:16:23.414238] Tid: 001016 debug [load_provider:69] dpcp[0] = 6008006000000 ‘Mellanox ConnectX-6 Dx Adapter’
[23-06-06 18:16:23.414264] Tid: 001016 debug [load_provider:69] dpcp[1] = 6008007000000 ‘Mellanox ConnectX-6 Dx Adapter #2’
[23-06-06 18:16:23.414289] Tid: 001016 info [init:37] DPCP/DevX provider was loaded
[23-06-06 18:16:23.461965] Tid: 001016 debug [getAdapterInfo:473] Adapter 以太网 3 vlanId 0 len 6 MAC 04:3f:72:a4:99:90
[23-06-06 18:16:23.465444] Tid: 001016 debug [getAdapterInfo:511] LUID 6008006000000 0x15b3/0x101d dpcp_adapter 0x2051b5b1be0 opened true ret 0
[23-06-06 18:16:23.465551] Tid: 001016 debug [getAdapterInfo:528] IP: 192.168.5.44 VLAN_ID: 0 Serial number: MT2035X03235
[23-06-06 18:16:23.465608] Tid: 001016 debug [getAdapterInfo:530] MTU: 1500 TXlinkSpeed: 100 Gbps RXLinkSpeed:100 Gbps
[23-06-06 18:16:23.465669] Tid: 001016 info [getAdapterInfo:535] Device with IP addr: 192.168.5.44 was added to Device Collection [1]
[23-06-06 18:16:23.465724] Tid: 001016 warning [getAdapterInfo:430] Adapter 以太网 4 luidIdx 0x8007 is not Up
[23-06-06 18:16:23.468125] Tid: 001016 debug [getAdapterInfo:473] Adapter 以太网 4 vlanId 0 len 6 MAC 04:3f:72:a4:99:91
[23-06-06 18:16:23.471336] Tid: 001016 debug [getAdapterInfo:511] LUID 6008007000000 0x15b3/0x101d dpcp_adapter 0x2051b5b1cc0 opened true ret 0
[23-06-06 18:16:23.471519] Tid: 001016 debug [getAdapterInfo:528] IP: 169.254.90.23 VLAN_ID: 0 Serial number: MT2035X03235
[23-06-06 18:16:23.471575] Tid: 001016 debug [getAdapterInfo:530] MTU: 1500 TXlinkSpeed: 18446744073 Gbps RXLinkSpeed:18446744073 Gbps
[23-06-06 18:16:23.471633] Tid: 001016 info [getAdapterInfo:535] Device with IP addr: 169.254.90.23 was added to Device Collection [2]
[23-06-06 18:16:23.474342] Tid: 001016 debug [getAdapterInfo:473] Adapter 以太网 2 vlanId 0 len 6 MAC d4:5d:64:d2:c1:49
[23-06-06 18:16:23.474400] Tid: 001016 debug [getAdapterInfo:515] DPCP device with LUID 6008005000000 not found!
[23-06-06 18:16:23.474454] Tid: 001016 info [~winDevice:97] ~winDevice DTOR
[23-06-06 18:16:23.477058] Tid: 001016 debug [getAdapterInfo:473] Adapter Loopback Pseudo-Interface 1 vlanId 0 len 0 MAC 00:00:00:00:00:00
[23-06-06 18:16:23.481534] Tid: 001016 debug [GetPhysicalAdapterByMAC:358] No physical device found, bypassing
[23-06-06 18:16:23.481591] Tid: 001016 debug [getAdapterInfo:486] Physical adapter GUID wasn’t found, bypassing
[23-06-06 18:16:23.481648] Tid: 001016 info [~winDevice:97] ~winDevice DTOR
[23-06-06 18:16:23.505503] Tid: 001016 info [license_validate_v4:446] Licensed to: Beijing Enlightv Co., Ltd (N/A), evaluation period expires in 24 days
[23-06-06 18:16:23.505603] Tid: 001016 info [info_product:466] Rivermax license version: 4.1
[23-06-06 18:16:23.506273] Tid: 001016 info [license_validate:516] Rivermax license id 827d7712-80a4-1938-6474-902c070f7f24, revision 1
[23-06-06 18:16:23.506337] Tid: 001016 info [rmax_init:638] Statistics disabled
[23-06-06 18:16:23.506408] Tid: 001016 info [cuda_enable_etbl:362] Starting Cuda init
[23-06-06 18:16:23.506499] Tid: 001016 info [cuda_enable_etbl:396] Cuda init Done
List of supported devices:
Device with interface name: 以太网 3, IP addresses: [ 192.168.5.44 ], MAC address: 04:3f:72:a4:99:90, device_id: 4125, serial number: MT2035X03235
Device with interface name: 以太网 4, IP addresses: [ 169.254.90.23 ], MAC address: 04:3f:72:a4:99:91, device_id: 4125, serial number: MT2035X03235
[23-06-06 18:16:23.508224] Tid: 001016 debug [Clock:31]
[23-06-06 18:16:23.508254] Tid: 001016 debug [ExternalClock:66]
[23-06-06 18:16:23.508275] Tid: 001016 debug [~SysClock:46]
[23-06-06 18:16:23.508298] Tid: 001016 debug [~Clock:36]
[23-06-06 18:16:23.508321] Tid: 001016 debug [rmx_use_user_clock_v1:324] Using user time handler
TX Thread: 0 Mask: 0x4
@@@cudaAllocateMmap:0 size:25165824 align:0
CUDA memory allocation on GPU - cuMemCreate
RDMA is supported and enabled, status
CUDA memory allocation on GPU - cuMemCreate Done
GPU allocation succeeded, GPU id = 0 ,size = 25165824
Note: Allocation using huge pages size requested 1105920 is smaller then one page size: 2097152
Allocated 2097152 bytes using Large Pages
sdp for stream 0 is:
v=0
o=- 1443716955 1443716955 IN IP4 192.168.5.44
s=SMPTE ST2110-20 narrow gap 1080p50
t=0 0
m=video 2000 RTP/AVP 96
c=IN IP4 224.1.1.1/64
a=source-filter: incl IN IP4 224.1.1.1 192.168.5.44
a=rtpmap:96 raw/90000
a=fmtp:96 sampling=YCbCr-4:2:2; width=1920; height=1080; exactframerate=50; depth=10; TCS=SDR; colorimetry=BT709; PM=2110GPM; SSN=ST2110-20:2017; TP=2110TPN; TSMODE=SAMP; TSDELAY=0
a=mediaclk:direct=0
a=ts-refclk:localmac=40-a3-6b-a0-2b-d2

[23-06-06 18:16:23.581104] Tid: 001016 info [init_large_pages:35] huge pages are supported with page size 2097152
[23-06-06 18:16:23.581143] Tid: 001016 debug [hugePageAlloc:67] allocted 2097152 memory at 0x20532a00000 factor 1 allocSize 2097152
[23-06-06 18:16:23.581172] Tid: 001016 debug [rivermax_get_user_env:151] parsed env RIVERMAX_ENABLE_MP_WQE to the value false
[23-06-06 18:16:23.581196] Tid: 001016 debug [SessionTX:86] MP_WQE disabled for session
[23-06-06 18:16:23.581230] Tid: 001016 info [sdp_parse:594] trying to parse using smpte2110…
[23-06-06 18:16:23.581280] Tid: 001016 info [sdp_parse:610] sdp parsed successfully
[23-06-06 18:16:23.581305] Tid: 001016 info [license_assert_device:719] Validating Rivermax license for device with local ip 192.168.5.44
[23-06-06 18:16:23.581328] Tid: 001016 info [is_sn_matched:144] No serial number restriction
[23-06-06 18:16:23.581349] Tid: 001016 debug [session_tx_initialization:1001] got 4 blocks, 16 stride in chunk, 4320 packets per frame, network_len 46
[23-06-06 18:16:23.581373] Tid: 001016 debug [session_tx_initialization:1024] processing block 0 with 4320 packets
[23-06-06 18:16:23.581393] Tid: 001016 debug [session_tx_initialization:1026] processing application header
[23-06-06 18:16:23.581420] Tid: 001016 debug [session_tx_initialization:1024] processing block 1 with 4320 packets
[23-06-06 18:16:23.581440] Tid: 001016 debug [session_tx_initialization:1026] processing application header
[23-06-06 18:16:23.581467] Tid: 001016 debug [session_tx_initialization:1024] processing block 2 with 4320 packets
[23-06-06 18:16:23.581486] Tid: 001016 debug [session_tx_initialization:1026] processing application header
[23-06-06 18:16:23.581514] Tid: 001016 debug [session_tx_initialization:1024] processing block 3 with 4320 packets
[23-06-06 18:16:23.581535] Tid: 001016 debug [session_tx_initialization:1026] processing application header
[23-06-06 18:16:23.581629] Tid: 001016 debug [session_tx_initialization:1088] fix intv is every 50 frames
[23-06-06 18:16:23.581653] Tid: 001016 info [session_tx_initialization:1166] Detected ST2110-20 video stream
[23-06-06 18:16:23.581678] Tid: 001016 debug [init:85] MP_WQE disabled for ring
[23-06-06 18:16:23.581699] Tid: 001016 info [init:20] do open: true
[23-06-06 18:16:23.581950] Tid: 001016 debug [init:344] cpu_vec 0x0 eqn 7
[23-06-06 18:16:23.581981] Tid: 001016 debug [init:354] Reserved MKey created lkey=0x700 addr=0x2051b5c9e28
[23-06-06 18:16:23.582002] Tid: 001016 debug [init:362] Adapter frequency (khz) 1000000
[23-06-06 18:16:23.582021] Tid: 001016 debug [init:366] DPP supported is enabled
[23-06-06 18:16:23.582040] Tid: 001016 debug [calculate:259] rate 2271600 pps 216000, DI 0.000000, burst size 1262 inter burst gap 4444.44 accurate ibg 4444.44 active_time 0.96 , inter packet_gap 4444.44
[23-06-06 18:16:23.582100] Tid: 001016 debug [create_comp_channel:163] created completion channel: 0x205280e6150 with handle 0x32c
[23-06-06 18:16:23.583416] Tid: 001016 debug [create_cq:207] created CQ sz 32768 cqn 0x434
[23-06-06 18:16:23.583438] Tid: 001016 debug [get_dv_cq:305] CQ id 0x434
[23-06-06 18:16:23.622416] Tid: 001016 debug [create_pp_sq:1056] created packet pacing SQ 0x20519802d50 state SQ_RDY status 0 wqe 0x20532e16000 stride num/sz 32768/64 sq 4350
[23-06-06 18:16:23.622522] Tid: 001016 debug [create_cq_sq:316] got prm cq buf 0x20532c06000 sq buf 0x20532e16000
[23-06-06 18:16:23.623257] Tid: 001016 debug [SenderSG:53] SQ num 0x10fe buf 0x20532e16000 stride 64 cnt 32768 dummyInt 0 extra_dummy 0
[23-06-06 18:16:23.623337] Tid: 001016 debug [Mlx5Poll:34] cq num 0x434 cqe size 64 cq size 32768 cqn 1076 dbrec 0x20527f8ff40
[23-06-06 18:16:23.623914] Tid: 001016 debug [bind:180] called bind to ip 192.168.5.44 port 52894
[23-06-06 18:16:23.624036] Tid: 001016 debug [fill_net_header:103] Final DstMAC=01:00:5e:01:01:01 vlan_id=0
[23-06-06 18:16:23.624096] Tid: 001016 debug [fill_net_header:120] DSCP=0
[23-06-06 18:16:23.624157] Tid: 001016 debug [fill_net_header:122] ECN=0
[23-06-06 18:16:23.624213] Tid: 001016 debug [fill_headers:188] Resolved Src: 192.168.5.44 to SrcMAC=04:3f:72:a4:99:90 Dst: 224.1.1.1 to DstMAC=01:00:5e:01:01:01 VLANId=0
[23-06-06 18:16:23.624294] Tid: 001016 debug [prepare_headers:119] network len 42 max_usr_hdr 64 stride length is 128
[23-06-06 18:16:23.624370] Tid: 001016 debug [hugePageAlloc:67] allocted 4194304 memory at 0x20535000000 factor 2 allocSize 4194304
[23-06-06 18:16:23.624917] Tid: 001016 debug [create_direct_mkey:487] map sz = 2 lkey 0x4948
[23-06-06 18:16:23.624975] Tid: 001016 debug [prepare_headers:140] done preparing raw network header in address 0x20535000000 with size 2211840 lkey 0x4948 total header allocated 17280
[23-06-06 18:16:23.625052] Tid: 001016 debug [session_tx_initialization:1415] calculated 180 DI in gap, one extra dummy every 1.7053e-13 frames
[23-06-06 18:16:23.625124] Tid: 001016 debug [ChunkMgr:44] creating chunkmgr mem_block_array_len: 4 m_chunk_size_in_stride: 16 data_stride_size: 1280, app header 64
[23-06-06 18:16:23.625479] Tid: 001016 debug [create_direct_mkey:487] map sz = 3 lkey 0x4a49
[23-06-06 18:16:23.626055] Tid: 001016 debug [create_direct_mkey:487] map sz = 4 lkey 0x4b4a
[23-06-06 18:16:23.626749] Tid: 001016 debug [create_direct_mkey:487] map sz = 5 lkey 0x4c4b
[23-06-06 18:16:23.627514] Tid: 001016 debug [create_direct_mkey:487] map sz = 6 lkey 0x4d4c
[23-06-06 18:16:23.627799] Tid: 001016 info [disable_mp_wqe:104] MP WQE disabled for ring 0x2052836b2b0
[23-06-06 18:16:23.627864] Tid: 001016 debug [ChunkMgr:254] MP_WQE disabled for ring 0
[23-06-06 18:16:23.627892] Tid: 001016 debug [add_tx_session_to_map:36] created new TX session with id 0
[23-06-06 18:16:23.627967] Tid: 001016 debug [add_tx_session_to_map:40] adding session 0 to map with period 2e+07
Stream ID: 0
Source: 192.168.5.44:52894
Destination: 224.1.1.1:2000
Successfully set thread affinity using cpu mask: 0x4, previous mask: 0xff
running 1 streams, each mem_block using 270 chunks, each frame has 270 chunks, each chunk has 16 strides, sending 4320 packets per frame, 50 frames per second, frame/field duration: 20000 [us]
running scenario with: chunks in frame: 270 chunks in mem_block: 270 strides in chunk : 16 first commit in ms: 1010.941440
[23-06-06 18:16:24.653137] Tid: 009628 error [poll:55] idx 1 wqe id 0 CQE error, vendor syndrome=0x51, HW syndrome=0x2, HW syndrome type=0x0 syndrome=0x4
[23-06-06 18:16:24.653264] Tid: 009628 error [poll:57] send_code 0xfe wqe_cnt 0 user_idx 0x434
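
For anyone triaging a longer log of this kind, the syndrome fields in the error line above can be extracted with a small script. This is only a minimal sketch: the regular expression is inferred from the single error line in this log, and the helper name `parse_cqe_errors` is hypothetical and not part of the Rivermax SDK. It does not interpret the syndrome values; it only collects them so they can be quoted in a support case.

```python
import re

# Field layout inferred from the one "CQE error" line in the log above;
# this is an assumption, not a documented Rivermax log format.
CQE_RE = re.compile(
    r"\[(?P<ts>[^\]]+)\]\s+Tid:\s+(?P<tid>\d+)\s+error\s+\[poll:\d+\]\s+"
    r"idx\s+(?P<idx>\d+)\s+wqe id\s+(?P<wqe_id>\d+)\s+CQE error,\s+"
    r"vendor syndrome=(?P<vendor>0x[0-9a-fA-F]+),\s+"
    r"HW syndrome=(?P<hw>0x[0-9a-fA-F]+),\s+"
    r"HW syndrome type=(?P<hw_type>0x[0-9a-fA-F]+)\s+"
    r"syndrome=(?P<syndrome>0x[0-9a-fA-F]+)"
)


def parse_cqe_errors(log_text):
    """Return one dict per 'CQE error' line found in log_text (hypothetical helper)."""
    errors = []
    for m in CQE_RE.finditer(log_text):
        errors.append({
            "timestamp": m.group("ts"),
            "thread_id": int(m.group("tid")),
            "idx": int(m.group("idx")),
            "wqe_id": int(m.group("wqe_id")),
            # Hex fields are converted to integers for easy comparison.
            "vendor_syndrome": int(m.group("vendor"), 16),
            "hw_syndrome": int(m.group("hw"), 16),
            "hw_syndrome_type": int(m.group("hw_type"), 16),
            "syndrome": int(m.group("syndrome"), 16),
        })
    return errors


# The error line copied from the log in this post.
SAMPLE = (
    "[23-06-06 18:16:24.653137] Tid: 009628 error [poll:55] idx 1 wqe id 0 "
    "CQE error, vendor syndrome=0x51, HW syndrome=0x2, "
    "HW syndrome type=0x0 syndrome=0x4"
)

if __name__ == "__main__":
    for err in parse_cqe_errors(SAMPLE):
        print(err)
```

To use it on a full log file, read the file named in the "Writing log to default location" line and pass its contents to `parse_cqe_errors`.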

Dear @wanglx

Thanks for your post.

It looks like a GPU problem. Unfortunately, I can’t provide more details right now, but I’ll follow up once I have updates.

Alternatively, you can open a case with Enterprise Support, and we’ll do our best to track and debug the issue.

Regards,
Vladislav