Large Language Models Trained on NVIDIA's Omniverse Kit API?

LMTraina99 · June 28, 2023, 4:11pm

Hi!

I’ve been using Omniverse a lot over the past 18 months. The release of LLMs has assisted quite a bit with parsing large and sometimes confusing APIs like Pixar USD.

My question is: is there an LLM that has been trained on the Omniverse APIs? As a developer, having a tool like that would greatly increase my productivity.

I am specifically concerned with things like omni.kit and omni.ui for extension development. I find myself spending quite a bit of time parsing documentation (and often even finding the correct documentation as Kit versions update and mutate - see my other posts on the forum for reference). The developer community and Mati have been incredibly helpful in resolving many of the roadblocks I’ve encountered, but a conversational AI would be invaluable for me. Obviously the infrastructure already exists, but things like OpenAI’s ChatGPT had their public-facing model’s training data cut off back in 2021, and there have been significant updates/changes to the Omniverse APIs since then.

Thank you for your time!

-Matthew

paulcutsinger · June 28, 2023, 4:26pm

Wonderful idea - makes sense. Could you share some example questions that you might want to ask an LLM that’s trained in this way?

LMTraina99 · June 28, 2023, 4:51pm

Hi Paul,

Sure! Here are some questions I’ve brought to the forums or asked other experts before that an LLM would probably make quick work of:

‘Can you show me how to generate extra viewports in an Omniverse Kit extension using the kit viewport API? I would also like these viewports to be capable of popping out into their own windows outside of the currently running base kit application.’
‘In the context of an NVIDIA Omniverse Kit extension, can you show me how to add text labels on world-space objects that scale consistently to screen space?’ (this one may be outside LLM capability)
‘I would like to generate an Omniverse Kit extension GUI that has the following elements and layout: (arbitrary description). Can you help me generate this using the omni.ui library?’

In addition to some of these more complex things, it would also be very helpful to have it reply with the relevant library for a given task, i.e.

‘What is the proper viewport API for working in Omniverse Kit version (arbitrary version) and is there anything special I need to do to use it as a dependency in a kit extension I’m developing?’

Omniverse has so many powerful capabilities and use cases, but for things like scene manipulation I find myself relying very heavily on coding directly with the USD API, even if it is more verbose or low-level than omni.usd. Pixar has very elaborate (albeit dense) documentation, and though many of the Omniverse libraries do have decent docs, they seem more subject to version changes/mismatches between the most modern codebase and the current web-based documentation. Pixar also has some very helpful wiki-like descriptions of computer graphics theory built into some of the top-level classes. That kind of description goes a long way toward keeping experimenting developers engaged and not lost. An LLM can (to a certain degree) circumvent the need for such details and, if trained on every Kit version separately, could even resolve the version confusion issue.

If Pixar USD knowledge, Omniverse Kit knowledge, and generative AI capabilities were combined into one tool, I think there would be some remarkable results. It would also make the platform more attractive to independent developers and beginners.

Cheers!

alan.james.kent · June 28, 2023, 5:09pm

Mati (from Discord) said the Python bindings to the API functions currently do not include the argument lists for all functions. That won’t be ready for 105, but will be released some time after that. I suspect that will help a lot as well. Teach it the Python APIs (not the C++ ones).
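One rough way to see how far along the bindings are is to ask Python which callables in a module expose an argument list. The sketch below uses only the standard library; `collect_signatures` is a hypothetical helper name, and the `math` module stands in for whichever binding module you actually care about.

```python
import inspect
import math


def collect_signatures(module):
    """Split a module's public callables by whether Python can report their arguments."""
    described = {}    # name -> argument list as text, e.g. "(x, /)"
    undescribed = []  # names whose arguments cannot be introspected
    for name, obj in inspect.getmembers(module, callable):
        if name.startswith("_"):
            continue
        try:
            described[name] = str(inspect.signature(obj))
        except (ValueError, TypeError):
            # Compiled functions without signature metadata end up here.
            undescribed.append(name)
    return described, undescribed


# Stand-in module; swap in a binding module when running inside Kit.
described, undescribed = collect_signatures(math)
print(len(described), "callables with argument lists,", len(undescribed), "without")
```

The `described` dictionary is the kind of material that could be fed to a model as function prototypes, while `undescribed` shows what is still missing.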

It’s a great tool for learning. E.g. “Using the Python API, how do you bind a material to a geometry given a prim path to the material”. Sometimes you need to use UsdSkel instead of a prim node (a wrapper class) to get type-safe APIs, and for Python you often just use a string instead of a token.
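As an illustration of the material-binding question, here is a minimal sketch using the Pixar USD Python bindings (`pxr`). It assumes both prims already exist on the stage; `bind_material` is a hypothetical helper name, and any prim paths passed to it are placeholders. It also shows the two points above: the prim is wrapped in a schema class (`UsdShade.Material`) to get a typed API, and paths are passed as plain strings.

```python
def bind_material(stage, geom_path, material_path):
    """Bind the material at material_path to the prim at geom_path.

    Both paths are plain Python strings; the bindings convert them to Sdf.Path.
    Returns True if the binding was authored.
    """
    # Imported here so the helper can still be defined where pxr is not installed.
    from pxr import UsdShade

    geom_prim = stage.GetPrimAtPath(geom_path)
    material_prim = stage.GetPrimAtPath(material_path)
    if not geom_prim.IsValid() or not material_prim.IsValid():
        raise ValueError(f"Missing prim: {geom_path!r} or {material_path!r}")

    # Wrap the generic prim in its schema class to get the typed material API.
    material = UsdShade.Material(material_prim)

    # Apply the binding schema to the geometry, then author the binding.
    binding_api = UsdShade.MaterialBindingAPI.Apply(geom_prim)
    return binding_api.Bind(material)
```

Inside a Kit extension the stage would typically come from `omni.usd.get_context().get_stage()`, for example `bind_material(stage, "/World/Cube", "/World/Looks/Red")` with paths that match your own scene.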

I would train it up on Mati’s git repo of NVIDIA code samples from his live streams as well. But I shelved trying to “fine tune” a ChatGPT model until all the function prototypes were fully available. After that, I think this would be a fantastic resource for learning the APIs.

“How do I load a model and make sure that it is rotated upwards and scaled to the same scale factor as the rest of the project?”
“What is the ‘st’ property?”
“Write Python code to retarget an animation clip from one character to another.”
“How can I create a sequencer from Python, add the UsdSkelRoot under /World/Characters/Sam to an animation track, then add the animation clip for walking from the Hank character to Sam?”

paulcutsinger · June 28, 2023, 5:10pm

Thanks, this is a great list. I’ll take this back to the team and explore it.