I am writing an application for the Jetson TK1. From Mark Harris's blog post I got the impression that the memory of the K1 is physically unified, and my own results indicate that global memory allocated with cudaMallocManaged is significantly faster to work with than ordinary cudaMemcpy transfers, probably because of the unified memory. However, what do I do when I want to use texture memory for parts of my application? I have not found any texture support for cudaMallocManaged, so I have assumed that I have to use the normal cudaMemcpyToArray and cudaBindTextureToArray. If I do that, I still run into problems: the "unified" variables I use sometimes cause weird segmentation faults. This happens after some kernels, even when I call cudaDeviceSynchronize(). Is this the right way to do it, or is there another way of using texture memory along with unified memory?
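To make the setup concrete, here is a minimal sketch of the kind of pattern described above. It is not the actual application code: the kernel `scale`, the helper `run_demo`, the `CHECK` macro and the buffer size are all made up for illustration. It also swaps in a texture object (cudaCreateTextureObject) and cudaMemcpy2DToArray in place of cudaBindTextureToArray and cudaMemcpyToArray, since those older calls are deprecated in newer toolkits. The texture is backed by a cudaArray, and the kernel output goes into a cudaMallocManaged buffer.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort on any CUDA runtime error.
#define CHECK(call)                                                    \
    do {                                                               \
        cudaError_t err_ = (call);                                     \
        if (err_ != cudaSuccess) {                                     \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,         \
                    cudaGetErrorString(err_));                         \
            exit(1);                                                   \
        }                                                              \
    } while (0)

// Illustrative kernel: reads element i through the texture and
// writes twice its value into a managed buffer.
__global__ void scale(cudaTextureObject_t tex, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = 2.0f * tex1D<float>(tex, i + 0.5f);  // texel centre
}

// Hypothetical driver: returns out[index] after running the kernel.
float run_demo(int index)
{
    const int n = 256;
    float host[n];
    for (int i = 0; i < n; ++i)
        host[i] = (float)i;

    // Texture data lives in a cudaArray, filled with an explicit copy.
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaArray_t arr;
    CHECK(cudaMallocArray(&arr, &desc, n));
    CHECK(cudaMemcpy2DToArray(arr, 0, 0, host, n * sizeof(float),
                              n * sizeof(float), 1,
                              cudaMemcpyHostToDevice));

    cudaResourceDesc res = {};
    res.resType = cudaResourceTypeArray;
    res.res.array.array = arr;

    cudaTextureDesc td = {};
    td.addressMode[0] = cudaAddressModeClamp;
    td.filterMode = cudaFilterModePoint;       // no interpolation
    td.readMode = cudaReadModeElementType;
    td.normalizedCoords = 0;

    cudaTextureObject_t tex = 0;
    CHECK(cudaCreateTextureObject(&tex, &res, &td, NULL));

    // Output buffer is managed, so the host can read it directly.
    float *out = NULL;
    CHECK(cudaMallocManaged(&out, n * sizeof(float)));

    scale<<<(n + 127) / 128, 128>>>(tex, out, n);
    CHECK(cudaGetLastError());
    // The host does not touch any managed pointer between the launch
    // and this synchronize; on Kepler-class GPUs such as the K1 that
    // kind of access is documented as invalid while a kernel runs.
    CHECK(cudaDeviceSynchronize());

    float result = out[index];

    CHECK(cudaDestroyTextureObject(tex));
    CHECK(cudaFreeArray(arr));
    CHECK(cudaFree(out));
    return result;
}

int main()
{
    printf("out[10] = %f\n", run_demo(10));
    return 0;
}
```

In this sketch only the texture data goes through an explicit copy; everything else stays in managed memory.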
I’d really appreciate some help.