You might want to read my original reply in this thread. To recap:
There is no dedicated texture memory in current architectures. Textures are stored in ordinary global memory, which is bound to a given texture reference. CUDA provides an API for this.
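As a rough illustration of that binding step, here is a minimal sketch using the legacy texture reference API, which is what this post describes. The names texRef and copyThroughTexture and the buffer size are my own placeholders. Note the assumption that you are on an older toolkit: texture references were deprecated and then removed in CUDA 12, where texture objects replace them.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Legacy texture reference; it has to be declared at file scope.
// (Placeholder name, not from the original post.)
texture<float, 1, cudaReadModeElementType> texRef;

__global__ void copyThroughTexture(float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = tex1Dfetch(texRef, i);  // read is serviced by the texture unit
}

int main()
{
    const int n = 1024;
    static float host[n];
    for (int i = 0; i < n; ++i) host[i] = (float)i;

    // The data lives in plain global memory...
    float *dIn = NULL, *dOut = NULL;
    cudaMalloc(&dIn, n * sizeof(float));
    cudaMalloc(&dOut, n * sizeof(float));
    cudaMemcpy(dIn, host, n * sizeof(float), cudaMemcpyHostToDevice);

    // ...and is then bound to the texture reference.
    cudaBindTexture(NULL, texRef, dIn, n * sizeof(float));

    copyThroughTexture<<<(n + 255) / 256, 256>>>(dOut, n);
    cudaMemcpy(host, dOut, n * sizeof(float), cudaMemcpyDeviceToHost);

    cudaUnbindTexture(texRef);
    cudaFree(dIn);
    cudaFree(dOut);

    printf("host[10] = %f\n", host[10]);
    return 0;
}
```

Nothing is copied into a special memory space here: dIn is an ordinary cudaMalloc allocation, and the bind call only tells the texture unit where to read from.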
The texture units have a small read cache, and that cache often provides some speedup over reading directly from global memory. Cache misses can still hurt performance, depending on how the data bound to the texture is laid out spatially and on the access pattern.
Texture references provide a method for CUDA kernels running on the shader cores to transparently fork texture unit threads and access data bound as textures. This data access can include filtering and interpolation, and as of CUDA 2.x can be done in 1D, 2D, or 3D.
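To make the filtering part concrete, here is a hedged sketch of hardware linear interpolation in 1D, again with the legacy texture reference API (removed in CUDA 12, so this assumes an older toolkit). Linear filtering needs the data in a cudaArray rather than linear memory. The names lerpTex and sample, and the four sample values, are made up for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Placeholder texture reference used for filtered reads.
texture<float, 1, cudaReadModeElementType> lerpTex;

__global__ void sample(float *out, int n, float step)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        // With unnormalized coordinates, texel k is centered at k + 0.5,
        // so the offset makes i * step == k land exactly on texel k.
        out[i] = tex1D(lerpTex, i * step + 0.5f);
}

int main()
{
    const int width = 4;
    float data[width] = {0.0f, 10.0f, 20.0f, 30.0f};  // illustrative values

    // Filtered fetches require a cudaArray as backing storage.
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaArray *arr = NULL;
    cudaMallocArray(&arr, &desc, width);
    cudaMemcpy2DToArray(arr, 0, 0, data, width * sizeof(float),
                        width * sizeof(float), 1, cudaMemcpyHostToDevice);

    // Ask the texture unit to interpolate between neighboring texels.
    lerpTex.filterMode = cudaFilterModeLinear;
    lerpTex.addressMode[0] = cudaAddressModeClamp;
    lerpTex.normalized = false;
    cudaBindTextureToArray(lerpTex, arr, desc);

    const int n = 7;  // sample at 0, 0.5, 1.0, ... texels
    float *dOut = NULL;
    cudaMalloc(&dOut, n * sizeof(float));
    sample<<<1, 32>>>(dOut, n, 0.5f);

    float result[n];
    cudaMemcpy(result, dOut, n * sizeof(float), cudaMemcpyDeviceToHost);
    for (int i = 0; i < n; ++i)
        printf("x = %.1f -> %f\n", i * 0.5f, result[i]);

    cudaUnbindTexture(lerpTex);
    cudaFree(dOut);
    cudaFreeArray(arr);
    return 0;
}
```

The kernel never computes the interpolation itself; it only supplies a fractional coordinate. Keep in mind that the hardware does the blend with low-precision fixed-point weights, so results between texels are approximate.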