A question about CUstreamBatchMemOpParams in libgdsync

Hi all,
I have been studying the libgdsync code recently and found that the structure CUstreamBatchMemOpParams it uses is not the same as the one in CUDA 8.0.61.
In my CUDA version, 8.0.61, it is defined in cuda.h as follows:

typedef union CUstreamBatchMemOpParams_union {
    CUstreamBatchMemOpType operation;
    struct CUstreamMemOpWaitValueParams_st {
        CUstreamBatchMemOpType operation;
        CUdeviceptr address;
        union {
            cuuint32_t value;
            cuuint64_t pad;
        };
        unsigned int flags;
        CUdeviceptr alias; /**< For driver internal use. Initial value is unimportant. */
    } waitValue;
    struct CUstreamMemOpWriteValueParams_st {
        CUstreamBatchMemOpType operation;
        CUdeviceptr address;
        union {
            cuuint32_t value;
            cuuint64_t pad;
        };
        unsigned int flags;
        CUdeviceptr alias; /**< For driver internal use. Initial value is unimportant. */
    } writeValue;
    struct CUstreamMemOpFlushRemoteWritesParams_st {
        CUstreamBatchMemOpType operation;
        unsigned int flags;
    } flushRemoteWrites;
    cuuint64_t pad[6];
} CUstreamBatchMemOpParams;

But the libgdsync code refers to a member named “inlineCopy”, which does not exist in my CUstreamBatchMemOpParams:

static int gds_fill_inlcpy(CUstreamBatchMemOpParams *param, CUdeviceptr addr, void *data, size_t n_bytes, int flags)
{
        int retcode = 0;
#if GDS_HAS_INLINE_COPY
        CUdeviceptr dev_ptr = addr;

        assert(addr);
        assert(n_bytes > 0);
        // TODO:
        //  verify address requirements of inline_copy
        //assert((((unsigned long)addr) & 0x3) == 0); 

        bool need_barrier       = (flags  & GDS_IMMCOPY_POST_TAIL_FLUSH  ) ? true : false;

        param->operation = CU_STREAM_MEM_OP_INLINE_COPY;
        param->inlineCopy.byteCount = n_bytes;
        param->inlineCopy.srcData = data;
        param->inlineCopy.address = dev_ptr;
        param->inlineCopy.flags = CU_STREAM_INLINE_COPY_NO_MEMORY_BARRIER;
        if (need_barrier)
                param->inlineCopy.flags = 0;
        gds_dbg("op=%d addr=%p src=%p size=%zd flags=%08x\n",
                param->operation,
                (void*)param->inlineCopy.address,
                param->inlineCopy.srcData,
                param->inlineCopy.byteCount,
                param->inlineCopy.flags);
#else
        gds_err("error, inline copy is unsupported\n");
        retcode = EINVAL;
#endif
        return retcode;
}

So I wonder: is libgdsync built against a CUDA version that has not been published? Or have I missed something?

Thanks for your help.

Does anyone know?

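I think libgdsync is using a published CUDA release. As long as GDS_HAS_INLINE_COPY is not defined, it should be OK: every reference to inlineCopy in gds_fill_inlcpy sits inside the #if GDS_HAS_INLINE_COPY branch, so it is compiled out and the function just returns EINVAL.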
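
To make the guard concrete, here is a minimal, self-contained sketch of the same pattern. It does not use cuda.h or libgdsync at all: DEMO_HAS_INLINE_COPY, demo_params and demo_fill_inlcpy are hypothetical stand-ins that I made up for illustration, and I am not claiming this is how libgdsync's build actually sets its macro. The only thing it relies on is standard C preprocessor behaviour: an undefined (or zero) macro in an #if evaluates to 0, so the guarded branch is never compiled.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical feature switch standing in for GDS_HAS_INLINE_COPY.
 * Defaults to 0 when the build does not define it. */
#ifndef DEMO_HAS_INLINE_COPY
#define DEMO_HAS_INLINE_COPY 0
#endif

/* Hypothetical parameter type. The inline_copy member exists only when
 * the feature is enabled, mimicking a header that lacks the member. */
typedef struct demo_params {
    int operation;
#if DEMO_HAS_INLINE_COPY
    struct {
        size_t byte_count;
        const void *src_data;
    } inline_copy;
#endif
} demo_params;

/* Same shape as the guarded function quoted above: the member is only
 * touched inside the #if branch, so the #else branch compiles even when
 * the member is absent from the type. */
static int demo_fill_inlcpy(demo_params *param, const void *data, size_t n_bytes)
{
    int retcode = 0;

    assert(param != NULL);
    assert(n_bytes > 0);
#if DEMO_HAS_INLINE_COPY
    param->operation = 1; /* arbitrary demo opcode */
    param->inline_copy.byte_count = n_bytes;
    param->inline_copy.src_data = data;
#else
    (void)data; /* unused when the feature is compiled out */
    fprintf(stderr, "error, inline copy is unsupported\n");
    retcode = EINVAL;
#endif
    return retcode;
}
```

Built without -DDEMO_HAS_INLINE_COPY=1, calling demo_fill_inlcpy on a zero-initialized demo_params returns EINVAL and leaves the struct untouched, which is the same outcome gds_fill_inlcpy has when GDS_HAS_INLINE_COPY is off.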

Is there a performance loss if GDS_HAS_INLINE_COPY is not defined? And what libraries or patches would I need to install in order to use GDS_HAS_INLINE_COPY?