SPDM session failure in CC mode on RHEL 9.4

I set up a guest VM running RHEL 9.4 and installed the 550 driver, but the SPDM session between the driver and the GPU fails to establish:

[ 6668.213028] NVRM: spdmGetCertificates_GH100: SPDM failed with status 0x80020000
[ 6668.213036] NVRM: spdmStart_IMPL: SPDM: Certificate retrieval failed!
[ 6668.213037] NVRM: spdmStart_IMPL: SPDM: Session establishment failed!
[ 6668.213040] NVRM: nvCheckOkFailedNoLog: Check failed: Failure: Generic Error [NV_ERR_GENERIC] (0x0000FFFF) returned from spdmStart(pGpu, pConfCompute->pSpdm) @ conf_compute.c:298
[ 6668.213041] NVRM: _kgspEstablishSpdmSession: SPDM handshake with Responder failed.
[ 6668.213042] NVRM: _kgspEstablishSpdmSession: Failed to establish session with SPDM Responder!
[ 6668.213046] NVRM: kfspDumpDebugState_GH100: FSP microcode v4.76
[ 6668.213047] NVRM: kfspDumpDebugState_GH100: GPU 0000:01:00
[ 6668.213049] NVRM: kfspDumpDebugState_GH100: NV_PFSP_FALCON_COMMON_SCRATCH_GROUP_2(0) = 0x0
[ 6668.213051] NVRM: kfspDumpDebugState_GH100: NV_PFSP_FALCON_COMMON_SCRATCH_GROUP_2(1) = 0x0
[ 6668.213053] NVRM: kfspDumpDebugState_GH100: NV_PFSP_FALCON_COMMON_SCRATCH_GROUP_2(2) = 0x0
[ 6668.213055] NVRM: kfspDumpDebugState_GH100: NV_PFSP_FALCON_COMMON_SCRATCH_GROUP_2(3) = 0x0
[ 6668.213057] NVRM: _kgspEstablishSpdmSession: NV_PGSP_FALCON_MAILBOX0 = 0xfafafafa
[ 6668.213059] NVRM: _kgspEstablishSpdmSession: NV_PGSP_FALCON_MAILBOX1 = 0x0
[ 6668.213061] NVRM: nvAssertOkFailedNoLog: Assertion failed: Failure: Generic Error [NV_ERR_GENERIC] (0x0000FFFF) returned from _kgspEstablishSpdmSession(pGpu, pKernelGsp, pCC) @ kernel_gsp_gh100.c:792
[ 6668.213504] NVRM: RmInitAdapter: Cannot initialize GSP firmware RM
[ 6668.565589] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x62:0xffff:1784)
[ 6668.569123] NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0

RHEL 9.4 ships with a 5.14 Linux kernel. In the driver’s code, it seems that setting USE_LKCA to 1 requires crypto/internal/ecc.h, which was not moved to that location until 5.16. Is this what causes the 0x80020000 error with the 5.14 kernel? Thanks.

Is this the problem that causes the 0x80020000 error with 5.14 kernel?

Yes, and upcoming releases will have more informative error messages. This one is rather indirect.

Thanks for the answer. Could you let me know whether the next release will support RHEL 9.4 with a 5.14 kernel?

with a 5.14 kernel?

It will not. AMD SEV-SNP requires at least kernel 5.19, and Intel TDX requires 6.x.
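
For anyone who wants to check a guest before installing the driver, here is a minimal Python sketch that compares the running kernel release against the minimums mentioned in this thread. The function names and the table are hypothetical and are not part of the NVIDIA driver. "6.x" is assumed to mean 6.0 or later. Distribution kernels can backport features, so a version comparison is only a rough first check.

```python
import platform
import re

# Minimum (major, minor) kernel versions as stated in this thread.
# Assumption: "6.x" for Intel TDX is read as 6.0 or later.
MIN_KERNEL = {
    "crypto/internal/ecc.h (USE_LKCA)": (5, 16),
    "AMD SEV-SNP guest": (5, 19),
    "Intel TDX guest": (6, 0),
}


def parse_release(release):
    """Return (major, minor) from a release string like '5.14.0-427.el9_4.x86_64'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def unmet_requirements(release):
    """List the entries in MIN_KERNEL whose minimum version is above this release."""
    version = parse_release(release)
    return [name for name, minimum in MIN_KERNEL.items() if version < minimum]


if __name__ == "__main__":
    try:
        release = platform.release()
        missing = unmet_requirements(release)
    except ValueError as err:
        print(f"Could not check kernel version: {err}")
    else:
        if missing:
            print(f"Kernel {release} is older than required for: {', '.join(missing)}")
        else:
            print(f"Kernel {release} meets the version minimums listed in this thread")
```

On a stock RHEL 9.4 guest (5.14 kernel), all three entries would be reported as unmet, which matches the answers above.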