cudaMallocManaged does not allocate in shared VRAM, but in dedicated VRAM

The article below says that we have to call cudaMallocManaged to allocate in shared VRAM:
Unified Memory for CUDA Beginners | NVIDIA Developer Blog
But in my tests it is cudaMallocHost that does the job.

cudaMallocManaged :

cudaMallocHost :

I have a GTX 1060 3 GB.

I don't think it works that way. By default, managed memory is attached to the GPU (dedicated VRAM) and is copied to the host when required.
But you can also attach it to the host initially if you like. The MatrixMul code below, taken from an application note, does that.

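cudaHostAlloc, on the other hand, allocates pinned (page-locked) host memory that the GPU can access directly (zero-copy). The data stays in system RAM, which is presumably why your cudaMallocHost allocation shows up as shared memory.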
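
To make the zero-copy point concrete, here is a minimal sketch of my own (it is not from the application note). The kernel addOne and the helper zeroCopyDemo are hypothetical names used only for illustration. I am assuming the device can map host memory, and I pass cudaHostAllocMapped explicitly rather than relying on defaults.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: adds 1 to every element.
__global__ void addOne(int *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1;
}

// Hypothetical helper: returns how many elements hold the expected value
// after the kernel ran, or -1 if any CUDA call failed.
int zeroCopyDemo(int n)
{
    int *hostPtr = nullptr;
    int *devPtr = nullptr;

    // Pinned host allocation, mapped into the device address space.
    if (cudaHostAlloc((void **)&hostPtr, n * sizeof(int), cudaHostAllocMapped) != cudaSuccess)
        return -1;
    for (int i = 0; i < n; ++i) hostPtr[i] = i;

    // Device-side pointer that refers to the same host memory.
    if (cudaHostGetDevicePointer((void **)&devPtr, hostPtr, 0) != cudaSuccess) {
        cudaFreeHost(hostPtr);
        return -1;
    }

    // The kernel reads and writes system RAM directly; no cudaMemcpy is involved.
    addOne<<<(n + 255) / 256, 256>>>(devPtr, n);
    if (cudaDeviceSynchronize() != cudaSuccess) {
        cudaFreeHost(hostPtr);
        return -1;
    }

    // The CPU sees the results through the original host pointer.
    int ok = 0;
    for (int i = 0; i < n; ++i) ok += (hostPtr[i] == i + 1);
    cudaFreeHost(hostPtr);
    return ok;
}
```

Calling zeroCopyDemo(1024) from main should report all 1024 elements as updated if the device supports mapped host memory. Nothing in this sketch is ever allocated in dedicated VRAM.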

void MatrixMul(int hp, int hq, int wp, int wq)
{
    // randFill and the matrixMul kernel are not shown here.
    int *p, *q, *r;
    size_t sizeP = hp * wp * sizeof(int);
    size_t sizeQ = hq * wq * sizeof(int);
    size_t sizeR = hp * wq * sizeof(int);

    // Attach buffers 'p' and 'q' to the CPU and buffer 'r' to the GPU
    // (cudaMallocManaged defaults to cudaMemAttachGlobal)
    cudaMallocManaged(&p, sizeP, cudaMemAttachHost);
    cudaMallocManaged(&q, sizeQ, cudaMemAttachHost);
    cudaMallocManaged(&r, sizeR);

    // Initialize with random values on the CPU
    randFill(p, q, hp, wp, hq, wq);

    // Attach 'p' and 'q' globally, as the GPU needs them for the computation
    cudaStreamAttachMemAsync(NULL, p, 0, cudaMemAttachGlobal);
    cudaStreamAttachMemAsync(NULL, q, 0, cudaMemAttachGlobal);

    matrixMul<<<....>>>(p, q, r, hp, hq, wp, wq);

    // Attach 'r' to the CPU, as only 'r' is needed there
    cudaStreamAttachMemAsync(NULL, r, 0, cudaMemAttachHost);
    cudaStreamSynchronize(NULL);
}
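
If you want to check how managed memory behaves on your own card, the runtime exposes two device attributes for it. The sketch below is my own addition, and the struct ManagedSupport and the function queryManagedSupport are hypothetical names. As far as I know, devices that report 0 for concurrent managed access (I believe that includes GPUs running under Windows) keep managed allocations in GPU memory and migrate them in bulk rather than page by page, but please treat that as an assumption and check the programming guide.

```cuda
#include <cuda_runtime.h>

// Hypothetical result type: each field is 0 or 1, or -1 if the query failed.
struct ManagedSupport {
    int managed;     // device can allocate managed memory at all
    int concurrent;  // CPU and GPU can access managed memory concurrently
};

// Hypothetical helper that queries the two attributes for one device.
ManagedSupport queryManagedSupport(int device)
{
    ManagedSupport s = { -1, -1 };
    int value = 0;

    if (cudaDeviceGetAttribute(&value, cudaDevAttrManagedMemory, device) == cudaSuccess)
        s.managed = value;
    if (cudaDeviceGetAttribute(&value, cudaDevAttrConcurrentManagedAccess, device) == cudaSuccess)
        s.concurrent = value;

    return s;
}
```

Calling queryManagedSupport(0) and printing both fields shows what device 0 reports. If concurrent is 0, I would not expect the on-demand page migration described in the blog post.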