Negative texture cache hit rate!?


I’m testing a memory-reading program. Multiple threads concurrently read multiple buffers sequentially.

I used texture memory to try to improve read performance.

texture<uint8_t, cudaTextureType1D, cudaReadModeElementType> texRef;

cudaBindTexture (NULL, texRef, data, copy_size);

I access the texture memory like this:

ch = tex1Dfetch (texRef, base + i);
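For readers who want to reproduce this kind of test, here is a minimal sketch of the same idea. It is not the original program: it assumes the newer texture object API (cudaCreateTextureObject) in place of the texture reference shown above, and it assumes each thread reads its own contiguous chunk. The names sumChunks and sumViaTexture are hypothetical, and error checking is left out for brevity.

```cuda
#include <cstdint>
#include <vector>
#include <cuda_runtime.h>

// Each thread walks its own contiguous chunk of `chunk` bytes (an assumed
// access pattern) and fetches every byte through the texture path.
__global__ void sumChunks(cudaTextureObject_t tex, int n, int chunk,
                          unsigned int *partial)
{
    int tid  = blockIdx.x * blockDim.x + threadIdx.x;
    int base = tid * chunk;
    if (base >= n) return;

    unsigned int sum = 0;
    for (int i = 0; i < chunk && base + i < n; ++i)
        sum += tex1Dfetch<unsigned char>(tex, base + i);
    partial[tid] = sum;
}

// Hypothetical helper: uploads `host`, reads it back through a texture
// object, and returns the sum of all bytes. Error checking is omitted.
unsigned long long sumViaTexture(const std::vector<uint8_t> &host, int chunk)
{
    int n = static_cast<int>(host.size());
    if (n == 0 || chunk <= 0) return 0;
    int numThreads = (n + chunk - 1) / chunk;

    uint8_t *dData = nullptr;
    unsigned int *dPartial = nullptr;
    cudaMalloc(&dData, n);
    cudaMalloc(&dPartial, numThreads * sizeof(unsigned int));
    cudaMemcpy(dData, host.data(), n, cudaMemcpyHostToDevice);

    // Describe the linear buffer as a 1D texture of 8-bit elements.
    cudaResourceDesc resDesc = {};
    resDesc.resType = cudaResourceTypeLinear;
    resDesc.res.linear.devPtr = dData;
    resDesc.res.linear.desc = cudaCreateChannelDesc<unsigned char>();
    resDesc.res.linear.sizeInBytes = n;

    cudaTextureDesc texDesc = {};
    texDesc.readMode = cudaReadModeElementType;  // return raw element values

    cudaTextureObject_t tex = 0;
    cudaCreateTextureObject(&tex, &resDesc, &texDesc, nullptr);

    int block = 128;
    int grid  = (numThreads + block - 1) / block;
    sumChunks<<<grid, block>>>(tex, n, chunk, dPartial);

    // cudaMemcpy waits for the kernel to finish before copying.
    std::vector<unsigned int> partial(numThreads);
    cudaMemcpy(partial.data(), dPartial, numThreads * sizeof(unsigned int),
               cudaMemcpyDeviceToHost);

    unsigned long long total = 0;
    for (unsigned int p : partial) total += p;

    cudaDestroyTextureObject(tex);
    cudaFree(dData);
    cudaFree(dPartial);
    return total;
}
```

Calling sumViaTexture(buffer, 64) from main and timing it against a kernel that reads the same pointer directly would give a rough comparison of the two paths. Texture objects need a device of compute capability 3.0 or higher.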

However, performance was worse than with plain global memory reads.

I looked at the memory occupancy analysis in the Compute Profiler, and the texture cache hit rate shows a negative value (-5%)!

I cannot figure out what is happening by myself. Can anybody help me?



I’ve seen the negative hit rate percentage too, and I also have no idea what is going on here. Hopefully someone from NV will chime in on this.

I have run into this situation too.

The negative number comes from the way the hit rate is calculated: (cache_requests - cache_misses) / cache_requests.

So the hit rate goes negative whenever the cache_misses value is larger than cache_requests. Strange ;)
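To make the arithmetic concrete, here is the formula above with made-up counter values. The 100 and 105 are assumptions picked only because they reproduce the -5% reported in the first post; they are not measured numbers.

```latex
\[
\text{hit rate}
  = \frac{\text{cache\_requests} - \text{cache\_misses}}{\text{cache\_requests}}
\]
\[
\text{cache\_requests} = 100,\quad \text{cache\_misses} = 105
\;\Longrightarrow\;
\frac{100 - 105}{100} = -0.05 = -5\%
\]
```

Any case where more misses are counted than requests gives a negative result, so the sign by itself only says that the two counters are not counting the same kind of event.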

I can only guess that in my case (I fetch uint2), and possibly in yours, the data fetched through the texture cache is wider than 32 bits. As a result, one request (e.g. a tex1Dfetch) is split internally into several low-level requests.

How can this be solved? That is the question.

Any comments from an NVIDIA employee would be much appreciated.


It seems I can answer my own question.

I was using a texture bound to 2D pitch-linear memory (cudaMallocPitch). When I switched to a cudaArray, the situation improved.

Best regards,
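For anyone who wants to try the same change, below is a rough sketch of reading uint2 elements from a texture backed by a cudaArray. It is not the poster's code: it assumes the texture object API, point sampling with unnormalized coordinates, and a tightly packed host image. The names copyFromTexture and roundTripThroughArray are hypothetical, and error checking is omitted.

```cuda
#include <vector>
#include <cuda_runtime.h>

// One thread per texel: fetch the uint2 at (x, y) and store it linearly.
__global__ void copyFromTexture(cudaTextureObject_t tex, int width, int height,
                                uint2 *out)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < width && y < height)
        out[y * width + x] = tex2D<uint2>(tex, x + 0.5f, y + 0.5f);
}

// Hypothetical helper: copies a width x height image of uint2 into a
// cudaArray, reads it back through a 2D texture, and returns the result.
std::vector<uint2> roundTripThroughArray(const std::vector<uint2> &host,
                                         int width, int height)
{
    // Allocate the array with a two-channel, 32-bit unsigned format.
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<uint2>();
    cudaArray_t arr = nullptr;
    cudaMallocArray(&arr, &desc, width, height);

    // Source rows are tightly packed, so the pitch equals the row width.
    size_t rowBytes = width * sizeof(uint2);
    cudaMemcpy2DToArray(arr, 0, 0, host.data(), rowBytes, rowBytes, height,
                        cudaMemcpyHostToDevice);

    cudaResourceDesc resDesc = {};
    resDesc.resType = cudaResourceTypeArray;
    resDesc.res.array.array = arr;

    cudaTextureDesc texDesc = {};
    texDesc.addressMode[0]   = cudaAddressModeClamp;
    texDesc.addressMode[1]   = cudaAddressModeClamp;
    texDesc.filterMode       = cudaFilterModePoint;       // no interpolation
    texDesc.readMode         = cudaReadModeElementType;   // raw integers
    texDesc.normalizedCoords = 0;                         // texel coordinates

    cudaTextureObject_t tex = 0;
    cudaCreateTextureObject(&tex, &resDesc, &texDesc, nullptr);

    uint2 *dOut = nullptr;
    cudaMalloc(&dOut, host.size() * sizeof(uint2));

    dim3 block(16, 16);
    dim3 grid((width + block.x - 1) / block.x,
              (height + block.y - 1) / block.y);
    copyFromTexture<<<grid, block>>>(tex, width, height, dOut);

    // cudaMemcpy waits for the kernel to finish before copying.
    std::vector<uint2> result(host.size());
    cudaMemcpy(result.data(), dOut, host.size() * sizeof(uint2),
               cudaMemcpyDeviceToHost);

    cudaDestroyTextureObject(tex);
    cudaFree(dOut);
    cudaFreeArray(arr);
    return result;
}
```

A cudaArray has an opaque layout intended for texture fetching, which may be why the hit rate looked better after the switch. Whether it helps in a given kernel still depends on the access pattern, so it is worth checking with the profiler.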