Illegal memory access error from cudaMemcpy after optixLaunch

Hello, I am confused by an error in my OptiX code.

My code launches rays using the optixLaunch call below.

OPTIX_ERROR_CHECK(optixLaunch(pipeline, cudaStream, (CUdeviceptr)launchParamsBuffer, (size_t)(sizeof(launchParams)), &sbt, launchWidth, 1, 1));

The launch determines whether the rays hit any mesh and stores the results in device memory.

After optixLaunch finishes, these results are copied from device memory to host memory with the cudaMemcpy call below.

CUDA_ERROR_CHECK(cudaMemcpy(reinterpret_cast<void*>(hostPointer), reinterpret_cast<void*>(launchParams.devicePointer), deviceMemorySize * sizeof(bool), cudaMemcpyDeviceToHost));

This works well for small values of launchWidth.

However, when launchWidth exceeds about 500 million, the cudaMemcpy call reports an error: FATAL ERROR: An CUDA error has occurredcudaErrorIllegalAddress: an illegal memory access was encountered

Furthermore, once optixLaunch has finished, the same error is reported not only by the cudaMemcpy call that transfers the memory written by optixLaunch, but also by every other cudaMemcpy call, including ones that transfer memory unrelated to optixLaunch.

It seems as if optixLaunch corrupts the whole device memory.

I understand that the maximum launch size is 2^30, so I expected a launchWidth of 500 million to work.
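
For reference, the arithmetic behind that expectation can be checked on the host. This is a minimal sketch, assuming the 2^30 limit applies to the product width * height * depth as described in the OptiX programming guide. The names kMaxLaunchSize and launchFitsLimit are illustrative and not part of the OptiX API.

```cpp
#include <cassert>
#include <cstdint>

// Assumed limit: the total launch size (width * height * depth) may not exceed 2^30.
constexpr std::uint64_t kMaxLaunchSize = std::uint64_t{1} << 30;  // 1,073,741,824

// Hypothetical helper: returns true if the launch dimensions stay within the limit.
// The check uses 64-bit arithmetic and a division so the product cannot overflow.
constexpr bool launchFitsLimit(std::uint32_t width, std::uint32_t height, std::uint32_t depth) {
    if (width == 0 || height == 0 || depth == 0) {
        return false;  // treat empty launches as invalid in this sketch
    }
    const std::uint64_t wh = std::uint64_t{width} * height;  // fits in 64 bits
    return wh <= kMaxLaunchSize && wh <= kMaxLaunchSize / depth;
}
```

By this check, launchFitsLimit(500000000, 1, 1) is true, so the launch size limit on its own would not explain the failure.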

Could you give me any advice on this problem?

Thank you.

P.S. I am using an RTX A6000 GPU, and the device memory is sufficient.

Hi @yongwankim,

Can you explain a little more about what’s going on? Your OptiX launch is not showing any error, but the cudaMemcpy has an error? If your launch was successful and there were no errors, then this would tend to indicate that the parameters to cudaMemcpy are incorrect, either a bad pointer, or a misaligned pointer, or a bad size. Have you checked if there is a CUDA error immediately after the launch but before calling cudaMemcpy? Have you checked if your host & device pointers were corrupted before calling cudaMemcpy? What is the actual size of the memcpy?

I see you are using a buffer of boolean values. How did you allocate and initialize this buffer? Is there a std::vector involved?
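
The std::vector question is presumably aimed at the std::vector<bool> specialization, which packs its elements into bits and offers no data() pointer to a contiguous bool array, so it cannot be used as a cudaMemcpy destination. Below is a minimal host-only sketch of a safer destination, assuming one byte per result. The helper copyResults is hypothetical, and std::memcpy stands in for the device-to-host copy so the example builds without the CUDA toolkit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// std::vector<bool> is a bit-packed specialization: operator[] returns a proxy
// object instead of bool&, so its elements are not addressable as a bool array.
static_assert(!std::is_same<std::vector<bool>::reference, bool&>::value,
              "std::vector<bool> elements are not plain bool references");

// Hypothetical helper: copies `count` one-byte hit flags from `src` into a byte
// vector. std::memcpy stands in for cudaMemcpy(..., cudaMemcpyDeviceToHost).
std::vector<std::uint8_t> copyResults(const std::uint8_t* src, std::size_t count) {
    std::vector<std::uint8_t> host(count);  // contiguous storage, one byte per flag
    if (count != 0) {
        std::memcpy(host.data(), src, count * sizeof(std::uint8_t));
    }
    return host;
}
```

A std::vector<std::uint8_t> (or a plain bool array) gives cudaMemcpy a contiguous destination whose byte size matches the size argument of the copy.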

When you say other cudaMemcpy calls have an error, these calls are in the same CUDA context I assume? Does your CUDA_ERROR_CHECK macro call cudaGetLastError()? Are there any asynchronous activities happening in your CUDA context / stream?


Hi David,

Thanks to your questions, I double-checked which function causes the error.

As a result, I found that a precision problem with a sqrt operation in the code run by optixLaunch causes the error.

Now this error has been solved.
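
The thread does not show the offending code, so the following is only a hypothetical illustration of how single-precision sqrt can turn a large launch index into an out-of-range value. It assumes the index math takes an integer square root of the launch index. Both function names are made up for this sketch.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical index math: integer square root through single-precision sqrt.
// A float has a 24-bit significand, so values of n above 2^24 may be rounded
// when converted, before the square root is even taken.
std::uint32_t isqrtFloat(std::uint32_t n) {
    return static_cast<std::uint32_t>(std::sqrt(static_cast<float>(n)));
}

// Double precision represents every 32-bit integer exactly. The two loops
// correct any remaining rounding so that r*r <= n < (r+1)*(r+1).
std::uint32_t isqrtExact(std::uint32_t n) {
    std::uint64_t r = static_cast<std::uint64_t>(std::sqrt(static_cast<double>(n)));
    while (r * r > n) {
        --r;
    }
    while ((r + 1) * (r + 1) <= n) {
        ++r;
    }
    return static_cast<std::uint32_t>(r);
}
```

For n = 399,999,999 the conversion to float rounds n up to 400,000,000, so isqrtFloat returns 20000 while isqrtExact returns 19999. If such a value indexed a buffer with 20000 entries, the write would land one element past the end, which is the kind of access that can surface later as cudaErrorIllegalAddress.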

Thank you!
