Segmentation fault when compiling a simple kernel

SparkHu · May 29, 2024, 3:38am

Code:

#include <cuda_runtime.h>
#include <cstdint>
__global__
void d_kernel(uint32_t *inout){
    uint32_t v = inout[threadIdx.x];
    uint32_t v1 = v;
    for (uint32_t i = 0; i < 10000000; i++) {
        v *= v;
        v1 = v1 + v;
    }
    inout[threadIdx.x] = v + v1;
}

int main() {
    d_kernel<<<1,1>>>(nullptr);
}

Compiler:
nvcc: NVIDIA (R) Cuda compiler driver
Built on Wed_Nov_22_10:17:15_PST_2023
Cuda compilation tools, release 12.3, V12.3.107
Build cuda_12.3.r12.3/compiler.33567101_0

OS:
Ubuntu 22.04.4 LTS

Full command:
nvcc -o main -arch=sm_86 XX.cu
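
For anyone who wants to check the kernel's output once a toolkit compiles it, here is a minimal host-side sketch of the same loop in plain C++. It does not reproduce the compiler crash; it only mirrors the arithmetic so device results can be compared against a CPU value. The name `kernel_body_reference` and the `iterations` parameter are assumptions introduced for illustration and are not part of the original code.

```cpp
#include <cstdint>

// Host-side mirror of the loop body in d_kernel, for comparing results on the CPU.
// `kernel_body_reference` is a hypothetical helper name, not from the original post.
// uint32_t arithmetic wraps modulo 2^32, so the overflow below is well defined.
uint32_t kernel_body_reference(uint32_t input, uint32_t iterations = 10000000) {
    uint32_t v = input;
    uint32_t v1 = v;
    for (uint32_t i = 0; i < iterations; i++) {
        v *= v;        // repeated squaring, wrapping modulo 2^32
        v1 = v1 + v;   // running sum of the squared values
    }
    return v + v1;
}
```

As a quick check, an input of 2 squares up to 65536 and then wraps to 0 on the fifth iteration, so `kernel_body_reference(2)` returns 65814.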

Robert_Crovella · May 29, 2024, 3:10pm

I suggest:

1. Retest on the latest CUDA (12.5, currently).
2. If it still fails, file a bug.

Yuki_Ni · May 30, 2024, 6:38am

This has been reported as bug ID 4675651.

[Public] Hi xxxx

Thanks for filing a bug ticket. I can reproduce this in house. Our compiler engineering team will investigate the issue, and we will keep you informed.

Best,
Yuki

Yuki_Ni · June 14, 2024, 2:35am

[Public] Hi

Credit to our compiler engineering team: the issue is fixed and verified in house. This fix will be part of the next second CUDA 12.x release. Thanks again for reporting this to us.

Best,
Yuki
