Driver issue (indirect buffer sync)

Hi, I found an issue in the current driver when calling DrawIndexedInstancedIndirect with a DYNAMIC (CPU-filled) indirect buffer.
I am moving some CPU code to the GPU. As a first step, I fill the buffers on the CPU in exactly the layout they will have once all the setup code has moved to the GPU (compute shader).

Using a buffer flagged as IndirectArgs, Dynamic, CPUWrite:
Map(buffer) → fill(CPU) → Unmap(buffer)
Binding this buffer as IndirectArgs and issuing a DrawIndexedInstancedIndirect (with valid data) results in total garbage on NVIDIA (it runs fine on AMD).

Doing the same with a basic Dynamic buffer, then using CopyResource to copy it into a buffer flagged as Indirect, then using that copy as IndirectArgs runs fine (on both NVIDIA and AMD).

I can’t send you a repro (large confidential engine with a huge dataset).
I can give you more details by email (ronan.bel@ubisoft.com).
(I’m using another email here because I can’t recover my old password: in my browser, the captcha box hides the second line of the email confirmation in the lost-password form …)

Hi, this issue occurs with driver 355.98 and is still present with 358.50
(NVIDIA 770, Win7 and Win8.1)

sample (pseudo-code)

// map
DrawIndexIndirectArgs * pIndirect = (DrawIndexIndirectArgs *) context->Map(m_HiTilesBuffer_indirect, ResourceMap::Discard).m_lockedData;

// fill the buffer (16*8 instances)
pIndirect->m_indexCountPerInstance = tMgr->getNumIndex(iLod, iEdge);
pIndirect->m_instanceCount = hiTilesViewData.m_instancesPerLodCount[iCubeFace][iEdge][iLod];
pIndirect->m_startIndexLocation = tMgr->getStartIndex(iLod, iEdge);
pIndirect->m_baseVertexLocation = tMgr->getStartVertex(iLod);
pIndirect->m_startInstanceLocation = hiTilesViewData.m_instancesPerLodOffsetOriginal[iCubeFace][iEdge][iLod];

// unmap
context->Unmap(m_HiTilesBuffer_indirect);

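#if defined(D_NVIDIA_IndirectDynamic)
// NVIDIA workaround: copy the dynamic buffer into a second buffer flagged as Indirect, then bind the copy
context->CopyResource(m_HiTilesBuffer_indirect2, m_HiTilesBuffer_indirect);
context->SetIndirectArgsBuffer(m_HiTilesBuffer_indirect2);
#else // !defined(D_NVIDIA_IndirectDynamic)
// direct path: bind the dynamic buffer itself (garbage on NVIDIA, fine on AMD)
context->SetIndirectArgsBuffer(m_HiTilesBuffer_indirect);
#endif // defined(D_NVIDIA_IndirectDynamic)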

// fake MultiDrawIndirect on D3D11 (real on PS4)
context->MultiDrawIndexedInstancedIndirect( 0, 16*8 );
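
To rule out a layout problem on the CPU side, here is a minimal, engine-independent sketch of how the 16*8 argument entries could be laid out in the mapped memory. It assumes that DrawIndexIndirectArgs mirrors the five 32-bit values D3D11 reads for DrawIndexedInstancedIndirect (the same order as D3D11_DRAW_INDEXED_INSTANCED_INDIRECT_ARGS), and that the emulated multi-draw issues one indirect draw per entry at a byte offset of drawIndex * 20. The names kDrawCount, argsByteOffset and buildArgs are hypothetical, and the filled values are placeholders rather than real engine data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed layout: five 32-bit values in the order D3D11 reads them for
// DrawIndexedInstancedIndirect. Field names follow the pseudo-code above.
struct DrawIndexIndirectArgs {
    std::uint32_t m_indexCountPerInstance;
    std::uint32_t m_instanceCount;
    std::uint32_t m_startIndexLocation;
    std::int32_t  m_baseVertexLocation;    // signed in D3D11
    std::uint32_t m_startInstanceLocation;
};

// The GPU reads raw bytes, so the struct must be tightly packed.
static_assert(sizeof(DrawIndexIndirectArgs) == 20,
              "indirect args must be exactly 5 * 4 bytes");
static_assert(offsetof(DrawIndexIndirectArgs, m_startInstanceLocation) == 16,
              "unexpected padding inside the indirect args");

// Hypothetical: number of draws in the sample (16*8).
constexpr std::size_t kDrawCount = 16 * 8;

// Hypothetical helper: byte offset of one entry, as an emulated multi-draw
// would pass it to each DrawIndexedInstancedIndirect call.
inline std::size_t argsByteOffset(std::size_t drawIndex) {
    assert(drawIndex < kDrawCount);
    return drawIndex * sizeof(DrawIndexIndirectArgs);
}

// Hypothetical stand-in for the fill between Map and Unmap. The values are
// placeholders; the real code takes them from tMgr and hiTilesViewData.
inline std::vector<DrawIndexIndirectArgs> buildArgs() {
    std::vector<DrawIndexIndirectArgs> args(kDrawCount);
    for (std::size_t i = 0; i < kDrawCount; ++i) {
        args[i] = DrawIndexIndirectArgs{
            6u,                              // index count per instance
            1u,                              // instance count
            0u,                              // start index location
            0,                               // base vertex location
            static_cast<std::uint32_t>(i)};  // start instance location
    }
    return args;
}
```

If the static_asserts hold and the offsets are multiples of 4 (20 * i always is), the CPU-side layout is unlikely to be the cause, which is consistent with the same data working after CopyResource.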

Hi, still present with 361.43 driver …