GeForce GTX 1080 V411.63 driver crash ---> CL_INVALID_COMMAND_QUEUE on finishQueues & flushQu...

Good morning developers,
I hope you are well and already having a good day. As a newbie to Nvidia and OpenGL/OpenCL development, I would like to request your expertise & insight.

I have inherited a prediction & analysis multithreaded application using:

  • Java 1.8.0_162 (latest)
  • Intel® Core™ i7-6700K CPU @ 4.00GHz #1
  • 2 x GeForce GTX 1080 (driver V411.63, latest) or AMD Radeon HD 6990 graphics cards

Testing


AMD cards and drivers

  • Core i7-6700K CPU *** ONLY *** --> app succeeds.
  • Using 1 or 2 Radeon HD 6990s as additional computational resources --> app succeeds!

Nvidia cards and drivers

  • Core i7-6700K CPU *** ONLY *** --> app succeeds.
  • Using 1 or 2 GeForce GTX 1080s as additional computational resources --> driver fails!
  • The GTX driver crashes when finishQueues & flushQueues are called, reporting CL_INVALID_COMMAND_QUEUE.
  • Driver crash causes JRE to crash.
  • GTX driver recovers.
  • Application requires a restart (expected).

I had others try in other environments:

  • PCs WITHOUT Nvidia drivers --> app succeeds.
  • PCs WITH Nvidia cards & drivers --> driver fails!

Nvidia support insists it is an application bug! I suspect it is an Nvidia driver issue, as I CANNOT reproduce the crash when Nvidia is out of the picture!

Questions:


  • Is anyone having a similar issue?
  • What do the Gods of Graphics Computations recommend I try?

More info (snippet only, it will NOT execute on its own):

try {
    // eh: 09-21-18 debug
    // com.jogamp.opencl.CLException$CLInvalidCommandQueueException: can not finish command queue [error: CL_INVALID_COMMAND_QUEUE]

    Object future = commandQueuePool.invokeAll(taskList);
    commandQueuePool.finishQueues();
    commandQueuePool.flushQueues();

    physicsEngine.writeMessage("Found a total of " + numDirect + " direct contacts and " + numGap + " interactions");

    if (progressListener != null) progressListener.analysisComplete();

} catch (Exception e) {
    e.printStackTrace();
}

The crash happens on commandQueuePool.finishQueues():

com.jogamp.opencl.CLException$CLInvalidCommandQueueException: can not finish command queue [error: CL_INVALID_COMMAND_QUEUE]
at com.jogamp.opencl.CLException.checkForError(CLException.java:73)
at com.jogamp.opencl.CLCommandQueue.finish(CLCommandQueue.java:1795)
at com.jogamp.opencl.util.concurrent.CLCommandQueuePool.finishQueues(CLCommandQueuePool.java:155)
at com.gcas.aces.analysis.GPUMeshAnalyzer.performNarrowPhaseProximityAnalysis(GPUMeshAnalyzer.java:163)
at com.gcas.aces.analysis.PhysicsEngine.computePartInteractions(PhysicsEngine.java:131)
at com.gcas.aces.gui.ACESMain$75.run(ACESMain.java:2218)


I am not a driver developer & this is the first time I am looking at CLCommandQueue.
Is this the correct/proper way to make these calls?
I could not verify it by searching online!

Object future = commandQueuePool.invokeAll(taskList);
commandQueuePool.finishQueues();
commandQueuePool.flushQueues();
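
One thing I notice is that the snippet stores the result of invokeAll in `future` and never inspects it. If the pool behaves like java.util.concurrent.ExecutorService.invokeAll (an assumption on my part, I have not confirmed it against the JOCL sources), a task that failed on the GPU might only report its exception through its Future, and the queue could already be invalid by the time finishQueues runs. Below is a minimal sketch of the check I have in mind. It uses a plain ExecutorService as a stand-in for the command queue pool, and the class, method, and message names are hypothetical, not from my application or from JOCL:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FutureCheckSketch {

    // Waits for every future and collects the cause of each failed task,
    // so failures inside tasks are not silently dropped.
    public static List<Throwable> collectFailures(List<? extends Future<?>> futures)
            throws InterruptedException {
        List<Throwable> failures = new ArrayList<>();
        for (Future<?> f : futures) {
            try {
                f.get(); // blocks until done; a task exception arrives wrapped
            } catch (ExecutionException e) {
                failures.add(e.getCause());
            }
        }
        return failures;
    }

    // Stand-in for the real pool (assumption): one task succeeds and one
    // throws, simulating a kernel that failed on the device.
    public static List<Throwable> runDemo() throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            tasks.add(() -> 42);
            tasks.add(() -> {
                throw new IllegalStateException("simulated kernel failure");
            });
            List<Future<Integer>> futures = pool.invokeAll(tasks);
            return collectFailures(futures);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        for (Throwable t : runDemo()) {
            System.out.println("task failed: " + t);
        }
    }
}
```

If a check like this reported a failure in my real application before finishQueues is reached, that would at least tell me whether the invalid queue is a symptom of an earlier error in one of the tasks rather than of the finish call itself.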

Happy today,
Eamen