Why does cudaMallocMipmappedArray not result in device memory?

I am trying to get CUDA mipmapped arrays to work, but am encountering an invalid argument error when performing a device to device cudaMemcpy2D. When changing the memcpy flag from DeviceToDevice to Default it however got past the function.

This made me investigate the types of memory I was feeding it. My source pointer was indeed device memory, and the destination pointer is the mipmapped array. However, this mipmapped array is noted down as unregistered host memory.

I made a little test program to rule out external influence and indeed it reports the same:

#include "cuda_runtime.h"

#include <iostream>

//Macro for checking cuda errors following a cuda launch or api call
#define CudaCheck(ans) { cudaCheck((ans), __FILE__, __LINE__); }
inline void cudaCheck(cudaError_t code, const char* file, int line, bool abort = true)
{
    if (code != cudaSuccess)
    {
        std::cerr << "Cuda Check Failed: " << cudaGetErrorString(code) << " " << file << " " << line << std::endl;
        if (abort) exit(code);
    }
}

int main()
{
    cudaMipmappedArray_t mipArray;
    cudaChannelFormatDesc channelDesc = cudaCreateChannelDesc(32, 32, 32, 32, cudaChannelFormatKindFloat);
    int mipLevels = 5;
    cudaExtent extent = make_cudaExtent(1024, 1024, 0);
    CudaCheck(cudaMallocMipmappedArray(&mipArray, &channelDesc, extent, mipLevels));

    cudaDeviceProp prop;
    CudaCheck(cudaGetDeviceProperties(&prop, 0));
    printf("Unified Virtual Addressing %i\n", prop.unifiedAddressing);

    cudaPointerAttributes attributes;
    CudaCheck(cudaPointerGetAttributes(&attributes, mipArray));
    printf("Memory type for data %i\n", attributes.type);
}

reports:

Unified Virtual Addressing 1
Memory type for data 0

where the memory type: 0 corresponds to:

enum __device_builtin__ cudaMemoryType
{
    cudaMemoryTypeUnregistered = 0, /**< Unregistered memory */
    cudaMemoryTypeHost         = 1, /**< Host memory */
    cudaMemoryTypeDevice       = 2, /**< Device memory */
    cudaMemoryTypeManaged      = 3  /**< Managed memory */
};

How come cudaMallocMipmappedArray does not allocate device memory as it states in the documentation?

the handle returned by cudaArray creation is not a pointer that can be introspected this way. The underlying allocated memory is device memory, not host memory.

If you’d like to see a change in this behavior, I suggest filing a bug.

If you are attempting to perform a device to device dopy from an ordinary allocation to a cudaArray allocation, you should not be using cudaMemcpy2D but instead cudaMemcpy2DToArray.