OptiX Bug? crash with CUDA error: Kernel ret (700) when not rtPrinting anything (small demo code)

Hi,
I have a crash with the following exception:

This is the source code of the CUDA file (edit: for the cpp file, see below):

#include <optix_world.h>
struct BiDirSubPathVertex {bool existing;};
using namespace optix;

rtCallableProgram(void, sampleLightPath, ());
rtCallableProgram(void, sampleEye, ());
rtDeclareVariable(uint2,         launch_index, rtLaunchIndex, );
rtBuffer<float4, 2>              output_buffer;
RT_CALLABLE_PROGRAM void sampleLightPath_f() {}
RT_CALLABLE_PROGRAM void sampleEye_f() {}

RT_PROGRAM void pathtrace_camera() {
    BiDirSubPathVertex lightVertices[2];
    lightVertices[0].existing = false;
    lightVertices[1].existing = false;

    for(unsigned int i=0; i<2; i++) {
        sampleLightPath();
        if(!(lightVertices[i].existing)) break;
    }
//    rtPrintf("cztery\n");
    sampleEye();

    output_buffer[launch_index] = make_float4(1.f, 1.f, 1.f, 1.f);
}

RT_PROGRAM void exception()
{
    rtPrintExceptionDetails();
    output_buffer[launch_index] = make_float4(1.f, 1.f, 0.f, 0.0f);
}

Do you see the rtPrintf? If the comment markers are removed, the program doesn’t crash. So while the crash is simple to work around in this specific place, it would be hard to find if the print were not already there.

The original code file was about 700 lines long; the functions had parameters and traced rays. As I removed more and more code, whether it crashed or not always depended only on this one rtPrintf. In the original code I had exceptions and printing enabled, but it didn’t make any difference.

I verified the crash on Win7 64 (vs12 compiler, NVIDIA driver around 336 WHQL) and openSUSE Linux 13.1 64 (gcc 4.8, NVIDIA driver 331.49); both systems had CUDA 5.5 and OptiX 3.5 installed. The workstation has an AMD quad-core and a GeForce GTX 550 Ti.

edit:

Additionally verified on Win7 64bit, vs12 compiler, NVIDIA driver 332.76, CUDA 5.0 and OptiX 3.0. The workstation has an Intel Xeon quad-core and Quadro 2000 graphics.

Minimal example (new file with less code): http://xibo.at/meine/optixCrashBugPrintMinimalExample.zip
It’s based on sutil, the same build steps as in the optix examples are necessary.

Is anybody able to reproduce?

thanks,
adam

Not sure if this will help in your particular case, but I have seen rtPrint cover up unrelated memory corruption problems.

Memory corruption on the host or the device side?

Yes, I was thinking of memory corruption all the time. That’s also why I shortened the program to these 31 lines; the host side, apart from sutil, is also just 65 lines long.

If anything, it could be memory corruption on the host side inside sutil. Maybe I should stop using this behemoth.

edit:
I got rid of the SampleScene class. sutilSamplesPtxDir() is now the only sutil function I’m calling and still the same behaviour.
here is the updated full code: http://xibo.at/meine/optixCrashBugPrintMinimalExample2.zip

edit2:
now the minimal project contains 2 source files (apart from sutil, cmake etc). the cuda file is posted already above, the other is here (I also updated the zip):

#include <optixu/optixpp_namespace.h>
#include <sutil.h>
#include <cstdlib>   // exit
#include <string>    // std::string (not provided by <string.h>)

const char* const ptxpath( const std::string& target, const std::string& base ) {
  static std::string path;
  path = std::string(sutilSamplesPtxDir()) + "/" + target + "_generated_" + base + ".ptx";
  return path.c_str();
}

int main( int argc, char** argv ) {
  try {
    optix::Context context = optix::Context::create();
    context->setEntryPointCount( 1 );

    optix::Buffer buffer = context->createBuffer( RT_BUFFER_OUTPUT, RT_FORMAT_FLOAT4, 512, 512);
    context["output_buffer"]->set(buffer);

    optix::Program exceptionProgram = context->createProgramFromPTXFile(ptxpath("helsinki", "BiDirCamera.cu"), "exception");
    optix::Program ray_gen_program = context->createProgramFromPTXFile(ptxpath( "helsinki", "BiDirCamera.cu" ), "pathtrace_camera");
    ray_gen_program["sampleLightPath"]          ->set(context->createProgramFromPTXFile(ptxpath( "helsinki", "BiDirCamera.cu" ), "sampleLightPath_f"));
    ray_gen_program["sampleEye"]                ->set(context->createProgramFromPTXFile(ptxpath( "helsinki", "BiDirCamera.cu" ), "sampleEye_f"));

    context->setRayGenerationProgram(0, ray_gen_program);
    context->setExceptionProgram(0, exceptionProgram);
    context->validate();
    context->compile();
    context->launch(0, 512, 512);
  } catch( optix::Exception& e ){
    sutilReportError( e.getErrorString().c_str() );
    exit(1);
  }

  return 0;
}

still the same behavior.

edit3: made the code even shorter for the post. tested but zip not updated.

I had the chance to test the program on one of my university’s workstations. Again, it’s the same behaviour.

these are the specs:
Win7 64bit, vs10 compiler, nvidia driver 332.76 whql, cuda 5.0 and Optix 3.0. the workstation has an intel xeon quad core and quadro 2000 graphics.

Are there actually any OptiX developers on this forum? Any chance of this being investigated?

Or does anybody see a way the stack could be getting corrupted?

Not that it’s of much help but I can confirm the issue. Specs: Win 8.1 x64, VS2012, driver 337.88, Cuda 5.5, Optix 3.5.1, Intel i7 4770K, GTX770.

I also get these same exceptions:

OptiX Error: 'Unknown error (Details: Function “_rtContextLaunch2D” caught exception: Encountered a CUDA error: cudaDriver().CuMemcpyDtoHAsync( dstHost, srcDevice, byteCount, hStream.get() ) returned (700): Illegal address)

and:
OptiX Error: 'Unknown error (Details: Function “_rtContextLaunch2D” caught exception:
Encountered a CUDA error: cudaDriver().CuMemcpyDtoHAsync( dstHost, srcDevice, byteCount, hStream.get() ) returned (716): Misaligned address)

rtContextLaunch2D does the kernel launch, but why does a device-to-host writeback (CuMemcpyDtoHAsync) occur then? Normally you run the kernel and the results remain on the GPU; a “download” from GPU to CPU happens only when it is requested. What is written to the host there?

What is written to host is a status byte (or couple of bytes) indicating whether the launch succeeded. When the launch crashes you get the error you pasted above – it is very generic and does not necessarily have anything to do with rtPrintf.
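To make the point above concrete, here is a toy host-only model of a deferred error report. It is not OptiX or CUDA code: `ToyDevice`, `LaunchStatus`, and `describe` are hypothetical names invented for illustration. Only the numeric values 700 and 716 are taken from the error messages quoted in this thread. The sketch shows that the call which reads the status back is merely where the failure becomes visible, not where it happened.

```cpp
#include <cassert>
#include <string>

// Hypothetical status word; the numbers mirror the codes quoted in this thread.
enum class LaunchStatus : int { Ok = 0, IllegalAddress = 700, MisalignedAddress = 716 };

// Toy stand-in for a device: the "kernel" outcome is stored on the device
// and the launch call itself reports nothing back to the caller.
struct ToyDevice {
    LaunchStatus status = LaunchStatus::Ok;
    bool launched = false;

    void launch(LaunchStatus kernelOutcome) {
        status = kernelOutcome;   // the fault happens here, silently
        launched = true;
    }

    // The host learns about the fault only when it reads the status back.
    LaunchStatus copyStatusToHost() const {
        assert(launched && "read-back before any launch");
        return status;
    }
};

// Human-readable text for a status value.
inline std::string describe(LaunchStatus s) {
    switch (s) {
        case LaunchStatus::Ok:                return "ok";
        case LaunchStatus::IllegalAddress:    return "illegal address";
        case LaunchStatus::MisalignedAddress: return "misaligned address";
    }
    return "unknown";
}
```

In this model, calling `dev.launch(LaunchStatus::MisalignedAddress)` and then `dev.copyStatusToHost()` yields the misaligned-address status from the read-back call, even though the read-back itself did nothing wrong.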

@dlacewell
Thank you very much for this clarification.

So actually “CuMemcpyDtoHAsync” itself did not crash with code 716; it only reports that a misalignment occurred during kernel execution.
Is there documentation somewhere about all the cases in which misalignment can happen?
I found http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#vector-types
Are there more guidelines?

A misaligned address is another very generic error. It occurs when you’re reading from an unexpected memory location that doesn’t satisfy certain conditions for the read instruction, and it usually indicates an error in user code. It is very roughly analogous to a segfault in host code.
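As a rough host-side illustration of what "doesn’t satisfy certain conditions for the read instruction" can mean, the sketch below checks whether a byte offset into a raw buffer is suitably aligned for a 16-byte vector load. `Float4` is a hypothetical stand-in for CUDA's `float4` (which the CUDA vector-types table lists with 16-byte alignment), and `isAligned` and `canReadFloat4At` are illustrative helper names, not part of any library. This is plain C++ and does not reproduce the device-side fault itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for CUDA's float4, which requires 16-byte alignment.
struct alignas(16) Float4 { float x, y, z, w; };

// True if `p` is aligned to `alignment` bytes (alignment must be a power of two).
inline bool isAligned(const void* p, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

// Reinterpreting `base + byteOffset` as a Float4 is only valid if that
// address is a multiple of alignof(Float4).
inline bool canReadFloat4At(const unsigned char* base, std::size_t byteOffset) {
    return isAligned(base + byteOffset, alignof(Float4));
}
```

With a 16-byte-aligned buffer, offsets 0 and 16 pass this check while offset 4 does not; a pointer computed from a corrupted index or a badly packed struct tends to fail it in the same way.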

I would follow Detlef’s advice on the other thread to try and narrow it down, rather than continuing to cross-post here on this thread.

I’m getting a crash when I try to render in Maya Arnold with my GTX 1660 Ti GPU:

file -f -new;
// untitled // 
// Warning: Panel size cannot accommodate all requested Heads Up Display elements. // 
doOpenFile ("C:/Users/adkga/Desktop/Elyssa Scene.ma");
file -f -options "v=0;p=17;f=0"  -ignoreVersion  -typ "mayaAscii" -o "C:/Users/adkga/Desktop/Elyssa Scene.ma";addRecentFile("C:/Users/adkga/Desktop/Elyssa Scene.ma", "mayaAscii");
evalDeferred "shaderBallRendererMenuUpdate";
// Warning: file: C:\Users\adkga\Documents\maya\2022\prefs\filePathEditorRegistryPrefs.mel line 4: filePathEditor: Attribute 'aiImage.filename' is invalid or is not designated 'usedAsFilename'. // 
// Warning: file: C:\Users\adkga\Documents\maya\2022\prefs\filePathEditorRegistryPrefs.mel line 5: filePathEditor: Attribute 'aiPhotometricLight.aiFilename' is invalid or is not designated 'usedAsFilename'. // 
// Warning: file: C:\Users\adkga\Documents\maya\2022\prefs\filePathEditorRegistryPrefs.mel line 7: filePathEditor: Attribute 'aiVolume.filename' is invalid or is not designated 'usedAsFilename'. // 
// Warning: line 1: filePathEditor: Attribute 'aiVolume.filename' and label 'VDB' have been saved already. // 
// Warning: line 1: filePathEditor: Attribute 'aiImage.filename' and label 'Image' have been saved already. // 
// Warning: line 1: filePathEditor: Attribute 'aiPhotometricLight.aiFilename' and label 'IES' have been saved already. // 
import arnold
# Successfully imported python module 'arnold'
import mtoa
# Successfully imported python module 'mtoa'
import mtoa.cmds.registerArnoldRenderer;mtoa.cmds.registerArnoldRenderer.registerArnoldRenderer()
# Successfully registered renderer 'arnold'
updateRenderOverride;
// File read in  14.4 seconds.
commandPort -securityWarning -name commandportDefault;
onSetCurrentLayout "Maya Classic";
updateRendererUI;
// Error: file: G:/Program Files/Autodesk/Maya2022/scripts/others/hikCharacterControlsUI.mel line 323: Object 'hikCharacterList|OptionMenu' not found. // 
// Error: file: G:/Program Files/Autodesk/Maya2022/scripts/others/hikCharacterControlsUI.mel line 323: Object 'hikCharacterList|OptionMenu' not found. // 
// Loading Bifrost version 2.2.1.0-202102081428-b33b49f
// Bifrost: Loading library: Amino, from: Autodesk.
// Bifrost: Loading library: AminoMayaTranslation, from: Autodesk.
// Bifrost: Loading library: bif, from: Autodesk.
// Bifrost: Loading library: bifrostObjectMayaTranslations, from: Autodesk.
// Bifrost: Loading library: geometries, from: Autodesk.
// Bifrost: Loading library: fluids, from: Autodesk.
// Bifrost: Loading library: particles, from: Autodesk.
// Bifrost: Loading library: file, from: Autodesk.
// Bifrost: Loading library: midgard, from: Autodesk.
// Bifrost: Loading library: modeling, from: Autodesk.
// Bifrost: Loading library: nucleus, from: Autodesk.
// Bifrost: Loading library: simulation, from: Autodesk.
// Bifrost: Loading library: riv_types, from: Autodesk.
// Bifrost: Loading library: riv, from: Autodesk.
// Bifrost: Loading library: scatter_pack, from: Autodesk.
// AbcExport v1.0 using Alembic 1.7.5 (built Nov 30 2020 18:40:46)
// AbcImport v1.0 using Alembic 1.7.5 (built Nov 30 2020 18:40:46)
// Error: line 0: Cannot find procedure "shelf_Bifrost". // 
// Warning: The shelf "Bifrost" has items that cannot be read. // 
// Error: [gpu] an error happened during rendering. OptiX error is: Unknown error (Details: Function "_rtContextLaunch2D" caught exception: Encountered a CUDA error: cudaDriver().CuEventSynchronize( m_event ) returned (716): Misaligned address, file: <internal>, line: 0)
	GPU 0 had 2828MB free before rendering started and 1219MB free when crash occurred
	GPU errors are sometimes due to a GPU not having enough remaining free memory. To see if this is what happened here, try simplifying your scene or running on a GPU with more free RAM to see if it solves the crash. Otherwise, upgrading to the latest nvidia gpu driver and Arnold core (available from www.arnoldrenderer.com) might fix the crash // 

Please raise this end-customer issue with the Autodesk Arnold support instead.

Did you follow the advice in the error message you posted, especially the part about updating graphics drivers and Arnold core?

(This is an OptiX developer support forum which is the wrong place and especially the wrong thread to report such end-customer issues.)