Jetson TK1 - Disable ECC on Kepler

How can I disable ECC on the Tegra K1’s Kepler GPU? My research indicates that nvidia-smi is not supported on this platform.

Can anyone help or point me to some resources? I’m still not able to figure out this problem.

Hi MoonJP,

This is the first time we have seen this requirement ("disable ECC on the Tegra K1’s Kepler GPU").
Could you share more information about the purpose and use case?

Thanks

The application is high-performance embedded computing for aerospace. For certain cases, up-time matters more to my application than correct output. I want to disable the error-correcting code (ECC) to improve system reliability by reducing GPU crashes. This sounds strange, but the phenomenon is well documented. The ECC implemented on the GK20A local/shared memories is single-error correction, double-error detection (SECDED). In the event of a single-bit error, ECC works well. In the event of a double-bit error, however, the error is detected but cannot be corrected, and the kernel and the GPU crash. See http://www4.ncsu.edu/~dtiwari2/Papers/2015_HPCA_Tiwari_GPU_Reliability.pdf

Hi MoonJP,

After checking further internally, the TK1 GPU (GK20A) does not have ECC support.

You can confirm this by querying the device: call cudaDeviceGetAttribute() and read the value of cudaDevAttrEccEnabled.

cudaDevAttrEccEnabled : 1 if error correction is enabled on the device, 0 if error correction is disabled or not supported by the device.
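
For reference, the query could look roughly like the sketch below. It is only a minimal illustration, not tested on a TK1: the helper names eccLabel, queryEcc and reportEcc are made up for this example, and device index 0 is assumed to be the integrated GPU. Only cudaDeviceGetAttribute(), cudaDevAttrEccEnabled and cudaGetErrorString() come from the CUDA runtime API.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: turn the attribute value into a readable label.
// A value of 0 cannot distinguish "disabled" from "not supported".
static const char* eccLabel(int eccEnabled) {
    return eccEnabled ? "enabled" : "disabled or not supported";
}

// Hypothetical helper: read cudaDevAttrEccEnabled for one device.
// Returns the CUDA error code; *eccEnabled is written on success.
static cudaError_t queryEcc(int device, int* eccEnabled) {
    return cudaDeviceGetAttribute(eccEnabled, cudaDevAttrEccEnabled, device);
}

// Hypothetical helper: print the ECC status of `device`.
// Returns 0 on success and 1 if the query failed.
static int reportEcc(int device) {
    int eccEnabled = 0;
    cudaError_t err = queryEcc(device, &eccEnabled);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaDeviceGetAttribute failed: %s\n",
                     cudaGetErrorString(err));
        return 1;
    }
    std::printf("Device %d: ECC %s (cudaDevAttrEccEnabled = %d)\n",
                device, eccLabel(eccEnabled), eccEnabled);
    return 0;
}
```

Calling reportEcc(0) from main() and building with nvcc should be enough to try it. If the GK20A indeed has no ECC support, the attribute is expected to read 0, which the sketch reports as "disabled or not supported".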

Thanks