CUDA Debugger detected HW exception


While debugging my CUDA code, I ran into the following error for the first time:

    CUDA Debugger detected HW exception on 1 warps.  First warp:
    blockIdx = {7,0,0}
    threadIdx = {288,0,0}
    Exception = Out of range Address
    PC = 0x002f5c08
    FunctionRelativePC = _Z24gpuDewarpAndRescaleNaivePhy6float46float2+001208

In the CUDA Info window, the Status column shows “Exception” and the Exception details column shows “OutOfRangeAddress”. Of course, the exact blockIdx and threadIdx differ between runs.

The exceptions occur during texture sampling at this line:

    char val = 255.0f * tex2D<float>(texObj, xCoord, yCoord) + 0.5f;

The texture is created with the following properties:

    resDesc.resType = cudaResourceTypePitch2D;
    resDesc.res.pitch2D.devPtr = buffer;
    resDesc.res.pitch2D.desc.f = cudaChannelFormatKindUnsigned;
    resDesc.res.pitch2D.desc.x = 8;
    resDesc.res.pitch2D.width = width;
    resDesc.res.pitch2D.height = height;
    resDesc.res.pitch2D.pitchInBytes = pitch;

    // texture
    cudaTextureDesc texDesc;
    memset(&texDesc, 0, sizeof(texDesc));
    texDesc.addressMode[0] = cudaAddressModeClamp;
    texDesc.addressMode[1] = cudaAddressModeClamp;
    texDesc.filterMode = cudaFilterModeLinear;
    texDesc.readMode = cudaReadModeNormalizedFloat;
    texDesc.normalizedCoords = false;
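For anyone who wants to try the same setup in isolation, here is a minimal, self-contained sketch built around the same descriptor settings. It is not my actual kernel: the `CUDA_CHECK` macro, `sampleKernel`, `roundTripMismatches`, the test pattern, and the 64x32 size are all hypothetical names and values chosen for illustration. It checks every runtime call, samples each texel at its center, and counts how many values fail to round-trip.

```cuda
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper macro: abort with a message on any CUDA runtime error.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,   \
                         cudaGetErrorString(err_));                   \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

// Hypothetical kernel: sample every texel center and store it as 8 bits.
__global__ void sampleKernel(cudaTextureObject_t texObj, unsigned char* out,
                             int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;
    // With unnormalized coordinates, texel (x, y) is centered at (x+0.5, y+0.5).
    float v = tex2D<float>(texObj, x + 0.5f, y + 0.5f);
    out[y * width + x] = static_cast<unsigned char>(255.0f * v + 0.5f);
}

// Returns the number of texels that do not survive the round trip.
int roundTripMismatches(int width, int height)
{
    std::vector<unsigned char> host(width * height);
    for (int i = 0; i < width * height; ++i)
        host[i] = static_cast<unsigned char>(i % 256);  // arbitrary test pattern

    // Pitched allocation, so the pitch is chosen by the runtime.
    unsigned char* buffer = nullptr;
    size_t pitch = 0;
    CUDA_CHECK(cudaMallocPitch(reinterpret_cast<void**>(&buffer), &pitch,
                               width, height));
    CUDA_CHECK(cudaMemcpy2D(buffer, pitch, host.data(), width, width, height,
                            cudaMemcpyHostToDevice));

    // Resource: 8-bit unsigned, single channel, pitched 2D memory.
    cudaResourceDesc resDesc;
    std::memset(&resDesc, 0, sizeof(resDesc));
    resDesc.resType = cudaResourceTypePitch2D;
    resDesc.res.pitch2D.devPtr = buffer;
    resDesc.res.pitch2D.desc = cudaCreateChannelDesc<unsigned char>();
    resDesc.res.pitch2D.width = width;
    resDesc.res.pitch2D.height = height;
    resDesc.res.pitch2D.pitchInBytes = pitch;

    // Texture: clamp addressing, linear filtering, normalized float reads.
    cudaTextureDesc texDesc;
    std::memset(&texDesc, 0, sizeof(texDesc));
    texDesc.addressMode[0] = cudaAddressModeClamp;
    texDesc.addressMode[1] = cudaAddressModeClamp;
    texDesc.filterMode = cudaFilterModeLinear;
    texDesc.readMode = cudaReadModeNormalizedFloat;
    texDesc.normalizedCoords = false;

    cudaTextureObject_t texObj = 0;
    CUDA_CHECK(cudaCreateTextureObject(&texObj, &resDesc, &texDesc, nullptr));

    unsigned char* out = nullptr;
    CUDA_CHECK(cudaMalloc(reinterpret_cast<void**>(&out), host.size()));
    dim3 block(16, 16);
    dim3 grid((width + block.x - 1) / block.x, (height + block.y - 1) / block.y);
    sampleKernel<<<grid, block>>>(texObj, out, width, height);
    CUDA_CHECK(cudaGetLastError());        // launch errors
    CUDA_CHECK(cudaDeviceSynchronize());   // errors raised while the kernel ran

    std::vector<unsigned char> result(host.size());
    CUDA_CHECK(cudaMemcpy(result.data(), out, result.size(),
                          cudaMemcpyDeviceToHost));

    int mismatches = 0;
    for (size_t i = 0; i < host.size(); ++i)
        if (result[i] != host[i]) ++mismatches;

    CUDA_CHECK(cudaDestroyTextureObject(texObj));
    CUDA_CHECK(cudaFree(out));
    CUDA_CHECK(cudaFree(buffer));
    return mismatches;
}

int main()
{
    std::printf("mismatches: %d\n", roundTripMismatches(64, 32));
    return 0;
}
```

The sketch stores the result in an `unsigned char` so that values above 127 stay representable. That is a choice made for the sketch only and differs from the `char` in my line above.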

The sampling coordinates lie well inside the texture (although that should not matter, given the clamp address mode).

The debugger stops at the following function in “texture_indirect_functions.h”:

    template <class T>
    __TEXTURE_INDIRECT_FUNCTIONS_DECL__ T tex2D(cudaTextureObject_t texObject, float x, float y)
    {
      T ret;
      tex2D(&ret, texObject, x, y);
      return ret;
    }

Again, the values of “x” and “y” look normal. Moreover, the kernel as a whole produces correct sampling results (compared with a pure CPU version).

My questions are:

  1. Is this type of exception normal during texture sampling?
  2. What does this type of exception mean during texture sampling (for example, that part of the texture is not in the L1 cache)?

P.S. My GPU is a GTX 1070.

Thanks in advance,


No, it’s not normal.

Is this on Windows?

Yes, the development machine is Windows 7 64-bit, driver version 384.94.

One important thing I should mention is that I have two cards installed: a Quadro K620 (for the monitors) and a GTX 1070 for testing CUDA applications.

What can cause such a problem?

I wouldn’t be able to say much without a full test case. It could be a bug in the code, in the CUDA runtime, or in the tools.

Is it possible to provide NVIDIA with the source code (one .cu file) that triggers this problem? Will someone at NVIDIA look at it?

You can file a bug at

Generally speaking, NVIDIA looks at all developer-filed bugs. But I make no guarantees about schedules, progress, outcomes, or results.

Thanks for the answers.

Unfortunately, I could not locate any link there for filing a bug.

I would be very thankful if you can provide the exact page link for bug submissions.

Thanks again,


You need to be a registered developer. Once you are a registered developer, go to the my account area (click on the dropdown by your name in the upper right hand corner, select my account). Then click on my bugs. Then you will see a button in the upper right hand corner to submit a new bug.