Could `ldmatrix` be rewritten in C++ code?

For example, I know that

    asm volatile("ldmatrix.sync.aligned.m8n8.x4.shared.b16 {%0, %1, %2, %3}, [%4];\n"
        : "=r"(dst.x), "=r"(dst.y), "=r"(dst.z), "=r"(dst.w) : "r"(ptr));

makes the warp collectively load four 8x8 matrices of 16-bit elements from shared memory. Could I rewrite it in C++ so that it behaves the same as this inline PTX?
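To make "behaves the same" concrete, here is a rough host-side sketch of the data movement in question, assuming the fragment layout described in the PTX documentation for `ldmatrix`. It is plain C++ that models the 32 lanes with a loop, not device code, and `ldmatrix_x4_model`, `Frag`, and `row_ptr` are hypothetical names used only for illustration.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical host-side model of ldmatrix.sync.aligned.m8n8.x4.shared.b16.
// Assumption: row_ptr[t] is the row address supplied by lane t. Lanes 0-7
// supply the eight rows of matrix 0, lanes 8-15 those of matrix 1, and so on.
// Each row holds eight 16-bit elements (16 bytes).
struct Frag {
    uint32_t r[4];  // stands in for dst.x, dst.y, dst.z, dst.w
};

std::array<Frag, 32> ldmatrix_x4_model(
    const std::array<const uint16_t*, 32>& row_ptr) {
    std::array<Frag, 32> out{};
    for (int lane = 0; lane < 32; ++lane) {
        for (int m = 0; m < 4; ++m) {
            // Lane `lane` receives two adjacent elements of row lane/4 of
            // matrix m, starting at column (lane % 4) * 2.
            const uint16_t* row = row_ptr[m * 8 + lane / 4];
            std::memcpy(&out[lane].r[m], row + (lane % 4) * 2,
                        sizeof(uint32_t));
        }
    }
    return out;
}
```

The point of the sketch is that each lane ends up holding data from rows whose addresses were supplied by other lanes, so a device-side C++ version would need per-lane shared-memory loads plus some cross-lane exchange, rather than a single load per thread.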

Any suggestions would be appreciated.