MDL Init function crashes for a subset of materials

Hello, we are using the MDL SDK to generate material PTX for our OptiX application.

The issue is that a limited subset of materials from the official vMaterials library crash inside the Init function, while most of the rest work just fine.

Here are more details about our configuration:

Execution Context - Default settings;

Api Backend (MB_CUDA_PTX):

  • num_texture_spaces - 1;
  • num_texture_results - 16 (tried 0 too and in that case PDF, Eval or Sample functions crash instead of Init, whichever is invoked first);
  • texture_runtime_with_derivs - “off”;
  • enable_ro_segment - “on” (tried with “off” too and it did not affect the behavior);
  • sm_version - 50;
  • tex_lookup_call_mode - “direct_call”;
  • lambda_return_mode - “value”;
  • inline_aggressively - “on” (tried with “off” too and it did not affect the behavior);
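
For reference, the backend options listed above could be applied in one place roughly as follows. This is a minimal sketch, assuming a backend object that exposes `set_option(const char*, const char*)` and returns 0 on success, as `mi::neuraylib::IMdl_backend` does. `Mock_backend`, `Backend_option`, `kOptions`, and `apply_backend_options` are hypothetical names, used here only so the snippet compiles without the SDK.

```cpp
#include <cstdio>
#include <map>
#include <string>

// Hypothetical stand-in for mi::neuraylib::IMdl_backend so the sketch builds without the SDK.
struct Mock_backend {
    std::map<std::string, std::string> options;
    int set_option(const char* name, const char* value) {
        options[name] = value;
        return 0;  // 0 signals success
    }
};

struct Backend_option {
    const char* name;
    const char* value;
};

// Option values as listed in the post.
static const Backend_option kOptions[] = {
    {"num_texture_spaces", "1"},
    {"num_texture_results", "16"},
    {"texture_runtime_with_derivs", "off"},
    {"enable_ro_segment", "on"},
    {"sm_version", "50"},
    {"tex_lookup_call_mode", "direct_call"},
    {"lambda_return_mode", "value"},
    {"inline_aggressively", "on"},
};

// Applies every option and returns how many of them the backend rejected.
template <typename Backend>
int apply_backend_options(Backend& backend) {
    int failures = 0;
    for (const Backend_option& opt : kOptions) {
        if (backend.set_option(opt.name, opt.value) != 0) {
            std::fprintf(stderr, "backend rejected option %s=%s\n", opt.name, opt.value);
            ++failures;
        }
    }
    return failures;
}
```

Checking the return value of each call makes a misspelled option name show up at setup time instead of as unexpected generated code later.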

Functions that we are using are limited to:

  • surface_scattering_init (This is the one crashing; tried global init as an alternative, but it did not make a difference);
  • surface_scattering_sample;
  • surface_scattering_evaluate;
  • surface_scattering_pdf;
  • cutout_opacity;

We tried compiling the material using both mi::neuraylib::IMaterial_instance::DEFAULT_OPTIONS and mi::neuraylib::IMaterial_instance::CLASS_COMPILATION, with no difference in behavior (in both the crashing and non-crashing cases).

Our integrator invokes the functions as OptiX direct-calls using something like optixDirectCall(InitId, mi::neuraylib::Shading_state_material*, mi::neuraylib::Resource_data*, char*)

Texture support functions are compiled alongside the integrator and are linked automatically during OptiX pipeline creation. Materials that do not crash display their textures correctly in our output, so the functions work correctly. Even so, we temporarily mocked the bodies of the texturing functions just in case, with no change in behavior.

The parameters passed to the Init function are as follows:
mi::neuraylib::Shading_state_material

  • normal, geom_normal - float3 normal value; normalized;
  • position - world-space position;
  • animation_time - 0;
  • text_coords - pointer of a single float3, holding {u, v, 0} value;
  • tangent_u - pointer of a single float3 tangent value;
  • tangent_v - pointer of a single float3 bitangent value;
  • text_results - array of 16 float4 values (raw pointer to data, directly);
  • ro_data_segment - targetCode->get_ro_data_segment_data(0u) if present; nullptr otherwise; We have an explicit check that makes sure get_ro_data_segment_count is 0 or 1;
  • object_to_world, object_to_world - Tried providing nullptr as well as a float4[4u] holding an identity matrix; there is no difference in behavior (I know the last row is not needed, but decided to “play safe” here);
  • object_id - Tried 0 and 1;
  • meters_per_scene_unit - 1;
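
The field list above can be sketched as a small fill routine. This is only an illustration under stated assumptions: `Float3`, `Float4`, `State_sketch`, `State_storage`, `set_identity`, and `make_state` are hypothetical stand-ins rather than SDK types, the field order is not guaranteed to match `mi::neuraylib::Shading_state_material`, each matrix is assumed to be three float4 rows, and the tangent values are placeholders. The point of interest is that world_to_object and object_to_world are two separate pointers that each need valid storage.

```cpp
// Hypothetical stand-ins for the SDK vector types.
struct Float3 { float x, y, z; };
struct Float4 { float x, y, z, w; };

// Hypothetical mirror of the shading state fields listed in the post.
struct State_sketch {
    Float3 normal, geom_normal, position;
    float animation_time;
    const Float3* text_coords;
    const Float3* tangent_u;
    const Float3* tangent_v;
    Float4* text_results;
    const char* ro_data_segment;
    const Float4* world_to_object;
    const Float4* object_to_world;
    int object_id;
    float meters_per_scene_unit;
};

// Backing storage for everything the state points at; it must outlive the state.
struct State_storage {
    Float3 uvw;
    Float3 tangent_u;
    Float3 tangent_v;
    Float4 text_results[16];
    Float4 world_to_object[3];
    Float4 object_to_world[3];
};

// Writes a 3x4 identity transform as three float4 rows.
inline void set_identity(Float4 (&m)[3]) {
    m[0] = {1.0f, 0.0f, 0.0f, 0.0f};
    m[1] = {0.0f, 1.0f, 0.0f, 0.0f};
    m[2] = {0.0f, 0.0f, 1.0f, 0.0f};
}

inline State_sketch make_state(State_storage& s, Float3 normal, Float3 position,
                               float u, float v, const char* ro_data_segment) {
    s.uvw = {u, v, 0.0f};
    s.tangent_u = {1.0f, 0.0f, 0.0f};  // placeholder tangent
    s.tangent_v = {0.0f, 1.0f, 0.0f};  // placeholder bitangent
    set_identity(s.world_to_object);
    set_identity(s.object_to_world);

    State_sketch st{};
    st.normal = normal;
    st.geom_normal = normal;
    st.position = position;
    st.animation_time = 0.0f;
    st.text_coords = &s.uvw;
    st.tangent_u = &s.tangent_u;
    st.tangent_v = &s.tangent_v;
    st.text_results = s.text_results;
    st.ro_data_segment = ro_data_segment;
    st.world_to_object = s.world_to_object;  // both matrices get their own storage
    st.object_to_world = s.object_to_world;
    st.object_id = 0;
    st.meters_per_scene_unit = 1.0f;
    return st;
}
```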

mi::neuraylib::Resource_data: {null, Texture_handler}, where Texture_handler is our struct, derived from Texture_handler_base; vtable is empty, we are using direct_call mode and OptiX would not allow function pointers anyway;

mdlArgBlock: targetCode->get_argument_block(0)->get_data() for CLASS_COMPILATION, nullptr otherwise; We validate on CPU-side that (targetCode->get_argument_block_count() <= 1);

There are no light profiles or BSDF measurements present, and we check for those;

One example of a material that crashes is

vMaterials_2/Metal/Aluminum_Anodized.mdl (Aluminum_Anodized)

Example of a material that DOES NOT crash and renders correctly is

vMaterials_2/Metal/Brass_Brushed.mdl (Brass_Brushed)

If requested, we can provide a larger list of materials that crash and the ones that work;

We used Nsight for debugging, and the breakpoint is hit inside the Init function with an invalid memory access error. As stated above, if I set num_texture_results to 0, the same error occurs in the eval/pdf/sample functions instead. Setting the object_to_world and object_to_world matrices to nullptr or to an identity reference only affects which assembly line is hit, nothing else.

Any suggestions about what we might have overlooked will be appreciated :D

Hi TheDonsky!

Thanks a lot for the very detailed report and all the experiments you already tried to find the problem!

I tried to reproduce the problem with the “optix7” MDL SDK example and Aluminum_Anodized using the latest MDL SDK 2024.0.2, but the material renders fine for me.
Which MDL SDK are you using? Which OptiX version are you using?

You listed a number of “used” functions, but for which expressions are you compiling the code? Just “surface.scattering” and “geometry.cutout_opacity” or anything else?

To have a sufficiently compatible setting, I applied these settings and changes to the optix7 example:

  • Use the --use-direct-call command line option to use OptiX direct callables
  • Add geometry.cutout_opacity
  • (“enable_ro_segment” is off, but you said, you also tried that. ro_data_segment is nullptr)
  • I set “sm_version” to 50

The generated code uses the object_to_world and world_to_object matrices. The optix7 example provides pointers to local float4[3] variables. Note: Brass_Brushed does not access these fields.

The example also uses class compilation.

Can you please provide the PTX code for the Init function and the line where the crash happens?

Can you reproduce the crash with the optix7 example (“optix7.exe --use-direct-call -p ::nvidia::vMaterials_2::Metal::Aluminum_Anodized::Aluminum_Anodized”)?

Best regards
Moritz

Hello @mkroll . Thanks a lot for your feedback!

Just from reading the documentation, I mistakenly assumed the transformation matrices were optional, and when I ran the tests with them set to identity and still got a crash, I did not pay close enough attention to the fact that I had a typo and was setting the object_to_world matrix twice, leaving world_to_object empty.

Your comment made the importance of that input clear; I fixed the matrices and the scene rendered just fine :D
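
A cheap host-side check could have caught this kind of slip before launch. The following is a minimal sketch, not part of the MDL SDK: `Row4`, `Transform_fields`, `Transform_check`, and `check_transforms` are hypothetical names, and the struct only mirrors the two matrix pointers of the shading state.

```cpp
// Hypothetical stand-in for one float4 row of a 3x4 transform.
struct Row4 { float x, y, z, w; };

// Hypothetical view of just the two matrix fields of the shading state.
struct Transform_fields {
    const Row4* world_to_object;
    const Row4* object_to_world;
};

enum class Transform_check {
    ok,
    missing_world_to_object,
    missing_object_to_world
};

// Reports the first transform pointer that was left unset.
inline Transform_check check_transforms(const Transform_fields& f) {
    if (f.world_to_object == nullptr) return Transform_check::missing_world_to_object;
    if (f.object_to_world == nullptr) return Transform_check::missing_object_to_world;
    return Transform_check::ok;
}
```

Assigning object_to_world twice and never touching world_to_object makes the check return `missing_world_to_object`, which points straight at the field that was skipped.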

Thanks again for nudging us in the right direction!

You’re very welcome!
I’m happy it’s working now! :D
