Page migration engine in UM

I was wondering which page fault type (minor or major) triggers the page-migration engine in Unified Memory (UM).
When the host accesses a unified memory page that currently resides on the device, a page fault is triggered, and likewise when the device accesses a page that resides on the host.
Now, there's no real page swapping from disk when this happens, just page migration between host and device memory, so which is it? A major page fault or a minor one?


For that question to be answerable, one would probably at least need a definition of what major and minor page faults even mean in the context of a GPU.

Generalizing a bit from the Wikipedia article:

Minor page fault:

"If the page is loaded in memory at the time the fault is generated, but is not marked in the memory management unit as being loaded in memory, then it is called a minor or soft page fault. The page fault handler in the operating system merely needs to make the entry for that page in the memory management unit point to the page in memory and indicate that the page is loaded in memory; it does not need to read the page into memory. "

We could distinguish between the two types (minor and major) based on whether they involve reading the page into memory from a backing store. In GPU UM usage, the backing store is the memory of another processor in the heterogeneous system, rather than secondary storage (disk) in the traditional sense.

This sort of (“minor”) page-faulting may occur in a UM scenario after initial allocation of the page(s).

After calling cudaMallocManaged, if CPU code is the first to touch the page that was just allocated, a CPU page fault will occur (this can be confirmed using a profiler). This sort of CPU page fault does not result in any UM migration (copying) of data. It is more like the lazy allocation process that the host operating system may use:

Lazy allocation results in allocated (i.e. reserved) but unmapped memory. The process of the CPU writing to this memory creates a mapping, whereafter it is ordinarily usable by the host code.
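To make the lazy allocation point concrete on the host side only (no CUDA involved), here is a minimal sketch, assuming Linux and an anonymous mmap as a stand-in for freshly reserved memory. The function names minor_faults and faults_on_first_touch are made up for illustration. It reads the process's minor fault counter before and after the first write to each page:

```cpp
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>
#include <cassert>
#include <cstddef>

// Minor (soft) page faults incurred so far by the calling process.
long minor_faults() {
    struct rusage ru{};
    getrusage(RUSAGE_SELF, &ru);
    return ru.ru_minflt;
}

// Reserve `pages` pages of anonymous memory, write one byte to each page,
// and return how many minor faults those first touches caused.
// Returns -1 if the mapping could not be created.
// (Hypothetical helper, for illustration only.)
long faults_on_first_touch(std::size_t pages) {
    assert(pages > 0);
    const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    const std::size_t bytes = pages * page;

    // Reserved but not yet backed by physical pages.
    void* mem = mmap(nullptr, bytes, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return -1;

    volatile char* p = static_cast<volatile char*>(mem);
    const long before = minor_faults();
    for (std::size_t i = 0; i < bytes; i += page) {
        p[i] = 1;  // first write to the page creates the mapping
    }
    const long after = minor_faults();

    munmap(mem, bytes);
    return after - before;
}
```

Calling faults_on_first_touch(1024) on a typical Linux box should report a nonzero number of minor faults and no disk I/O at all. The exact count depends on the kernel configuration (transparent huge pages, for example, can reduce it), so I would not rely on a specific value.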

In this sense, I think you could refer to this sort of page fault in a UM system as a “minor” page fault. It would be distinguished from a “major” page fault, wherein the page is not resident in the faulting processor's memory and is instead resident in (and mapped to) another processor’s memory space. In that case, the page fault results in a copying of data.

A similar page-fault-with-no-copy scenario can be identified in the profiler if the memory is allocated, e.g. with cudaMallocManaged, and the first processor to touch it (i.e. write to it) is the GPU.

This is just my opinion/$0.02. I’m not suggesting that this is in any way standard terminology. I merely offer this as a view point or discussion point.

Thanks for the reply.

I understand your answer, though you might be able to help me with a concrete example:
As I understood the migration handler, a page fault is counted whenever data is copied (host->device or vice versa). I have code in which this kind of back-and-forth migration occurs, but when I run the program under the /usr/bin/time tool to see page faults, many minor page faults are recorded and 0 major page faults.

That’s why I think these are considered minor page faults, at least by the time tool (I don’t know how it classifies faults).

Do you think it is even correct to use the time tool to check that?

Thanks again!

There are 0 major page faults because, with respect to that tool, a major page fault is one that involves reading the unmapped page in from disk. That never happens with a UM page fault.
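For reference, the counts a tool like /usr/bin/time prints are the host kernel's per-process fault counters. As far as I know, GNU time obtains them from the resource usage the kernel returns when the child exits, but treat that as my assumption. A minimal sketch of the same idea, assuming Linux (FaultCounts and run_and_count are made-up names):

```cpp
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cassert>

// Page-fault counters reported by the kernel for a finished child process.
struct FaultCounts {
    long minor;  // ru_minflt: faults serviced without any disk I/O
    long major;  // ru_majflt: faults that required I/O
    bool ok;     // child ran and exited with status 0
};

// Run `argv` as a child process and collect its fault counters via wait4().
// (Hypothetical helper, for illustration only.)
FaultCounts run_and_count(char* const argv[]) {
    assert(argv != nullptr && argv[0] != nullptr);
    FaultCounts out{0, 0, false};

    pid_t pid = fork();
    if (pid < 0) return out;       // fork failed
    if (pid == 0) {
        execvp(argv[0], argv);     // only returns on failure
        _exit(127);
    }

    int status = 0;
    struct rusage ru{};
    if (wait4(pid, &status, 0, &ru) != pid) return out;

    out.minor = ru.ru_minflt;
    out.major = ru.ru_majflt;
    out.ok = WIFEXITED(status) && WEXITSTATUS(status) == 0;
    return out;
}
```

The relevant point is that the kernel only distinguishes "needed I/O" from "did not need I/O". A host-side fault that the UM driver resolves by migrating data from device memory involves no disk read, so it would land in the minor counter. GPU-side faults are not host process faults and would not be expected to show up here at all; a CUDA profiler is the better tool for those.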

That’s what I thought.
So it seems that the page faults recorded in connection with the page-migration engine are minor ones, as far as the time tool is concerned.

Thanks, Robert!