I have the following compute shader:
#version 460 core
#extension GL_NV_shader_sm_builtins : require
layout(set = 0, binding = 0) uniform writeonly image2D image;
layout(push_constant) uniform Settings
{
    layout(offset = 0) uint width;
    layout(offset = 4) uint height;
} settings;
layout(local_size_x = 32, local_size_y = 32, local_size_z = 1) in;
void main()
{
    const uvec3 xyz = gl_GlobalInvocationID;
    const uint width = settings.width;
    const uint height = settings.height;
    const uint x = xyz.x;
    const uint y = xyz.y;
    if (x >= width)
    {
        return;
    }
    if (y >= height)
    {
        return;
    }
    const uint smid = gl_SMIDNV;
    const uint sm_count = gl_SMCountNV;
    const uint warp_id = gl_WarpIDNV;
    const uint warps_per_sm = gl_WarpsPerSMNV;
    const float sm_vis = float(smid) / float(sm_count - 1U);
    const float warp_vis = float(warp_id) / float(warps_per_sm - 1U);
    imageStore(image, ivec2(x, y), vec4(sm_vis, warp_vis, 0, 0));
}
When I run it on an RTX 5070 Ti, it sometimes runs and produces invalid values, but most of the time it crashes with the following error:
Exception thrown at 0x00007FFECCF90769 (nvgpucomp64.dll) in app.exe: 0xC0000005: Access violation writing location 0x000002B5D418BDD4.
When it does not crash and produces wrong output, only the first image channel is affected (the one that sm_vis is written to). I managed to run the app under Nsight Graphics: when it does not crash, the second channel (the one with warp_vis) is filled entirely with zeros, while the first has values ranging from 0 to 1. However, when I change imageStore(image, ivec2(x, y), vec4(sm_vis, warp_vis, 0, 0)); to imageStore(image, ivec2(x, y), vec4(warp_vis, sm_vis, 0, 0)); only the second channel has correct values (greenish output) and the first one is full of zeros.
When I use only one channel, the app never crashes and produces the correct result (visualizing either warps or SMs).
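For comparison, the single-channel case can be sketched roughly as below. This is an illustrative reconstruction rather than the exact code that was run: it assumes that "one channel" means reading only the SM built-ins and writing zeros to the remaining components. The warp-only version would presumably swap in gl_WarpIDNV and gl_WarpsPerSMNV in the same way.

```glsl
#version 460 core
#extension GL_NV_shader_sm_builtins : require

layout(set = 0, binding = 0) uniform writeonly image2D image;

layout(push_constant) uniform Settings
{
    layout(offset = 0) uint width;
    layout(offset = 4) uint height;
} settings;

layout(local_size_x = 32, local_size_y = 32, local_size_z = 1) in;

void main()
{
    const uvec2 xy = gl_GlobalInvocationID.xy;

    // Skip invocations that fall outside the image.
    if (xy.x >= settings.width || xy.y >= settings.height)
    {
        return;
    }

    // Assumption: only the SM built-ins are referenced here;
    // gl_WarpIDNV and gl_WarpsPerSMNV do not appear in this variant.
    const float sm_vis = float(gl_SMIDNV) / float(gl_SMCountNV - 1U);

    // Write the SM visualization to the first channel only.
    imageStore(image, ivec2(xy), vec4(sm_vis, 0.0, 0.0, 0.0));
}
```

The only intended difference from the failing shader is that a single pair of SM built-ins is read, so comparing the two might help narrow down whether combining both pairs of built-ins in one shader is what triggers the problem.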
The backtrace shows that the crash comes from vkCreateComputePipelines, and after that call the stack trace looks like:
The same shader works perfectly fine on an RTX 3060.
