Using CUDA together with OpenCL causes hangs in 32-bit applications (Windows 10 x64)

Hello All,

I am trying to use two libraries (CUDA and OpenCL 1.2) in one application, and the 32-bit build of the application hangs.
The call sequence looks like this, for example:

//…
// OPENCL: create context
// CUDA: using function cudaMalloc / cudaMemcpy
// OPENCL: create command queue
// OPENCL: using clCreateBuffer / clEnqueueWriteBuffer / clEnqueueFillBuffer / …
// CUDA: using function cudaMalloc / cudaMemcpy
//…

Depending on the order of the CUDA and OpenCL API calls, the hang occurs in a function of either library (for example, cudaMemcpy or clFinish), that is, in the functions that synchronize.
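
To pin down which synchronizing call is the one that blocks, a small timeout wrapper may help. This is only a sketch in standard C++14: the name returns_within is made up for illustration, it is not part of CUDA or OpenCL, and it assumes the wrapped call is safe to issue from a helper thread (which can itself change the behavior being investigated).

```cpp
#include <chrono>
#include <functional>
#include <future>
#include <memory>
#include <thread>
#include <utility>

// Hypothetical helper: runs `call` on a detached thread and reports whether
// it returned within `timeout`. If the call never returns, the helper thread
// is simply left behind, so this is meant for diagnostics only.
// Note: `call` must not throw, otherwise std::terminate is invoked.
bool returns_within(std::function<void()> call,
                    std::chrono::milliseconds timeout) {
    // The promise lives on the heap so the detached thread can still
    // signal completion after this function has returned.
    auto done = std::make_shared<std::promise<void>>();
    std::future<void> finished = done->get_future();

    std::thread([done, call = std::move(call)]() {
        call();             // e.g. a wrapper around clFinish or cudaMemcpy
        done->set_value();  // reached only if the call came back
    }).detach();

    return finished.wait_for(timeout) == std::future_status::ready;
}
```

For example, wrapping the suspect call as returns_within([&] { clFinish(queue); }, std::chrono::seconds(10)), where queue stands for your own command queue, and logging the result after each step would show at which point in the sequence the first call stops returning.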

I should add that this problem appears with driver versions 350 and higher, when the NVIDIA driver moved from OpenCL 1.1 to OpenCL 1.2 support (with version 347.88 nothing hangs and everything works fine, but with version 350 and above it does not work).
Important: the x64 build of the application works properly, without any lag. I think the problem is in the NVIDIA driver, because with other OpenCL devices (Intel i7-4790K or Intel HD Graphics 4600) there are no problems and everything works.

My system:
OS: Windows 10 Pro x64
GPU: NVIDIA GeForce GTX 780Ti
CPU: Intel i7-4790K (with Intel HD Graphics 4600)

Also, I found a similar problem with MPC-HC - http://www.svp-team.com/forum/viewtopic.php?id=2671

What can I do? Where should I report this?

Support for 32-bit application development on CUDA is gradually disappearing.

You can file a bug at developer.nvidia.com