Performance optimization for push graph

Hello,

I’m currently exploring the use of Omnigraph for scene creation, where each node represents a different scene element, such as a robot or a robot gripper. My approach is similar to what’s discussed in this post: How to create a new omnigraph environment - Platform / Kit - NVIDIA Developer Forums.

In my setup, I employ push graph nodes and handle the positioning of each object within the compute function. While this method is effective and I am pleased with the functionality, it’s also quite demanding on performance. I’ve noticed that the fps are almost halved with the addition of each new scene object.

For my purposes, high-speed node execution isn’t necessary. An update to the object position every second would be entirely sufficient.

I also want to avoid the use of ActionGraph if possible, as it would require additional connections for each node, which I prefer to streamline.

Does anyone have suggestions for performance enhancement? Is there a way to dedicate a specific thread to the push graph, so that it exclusively utilizes those resources?

Kind regards,

Axel

Hi,

The first thing that can help us determine the next step is to figure out where the time is spent. Is it the node’s overhead or the actual work of moving the object? Does the slowdown happen while you are moving an object or the moment you add additional nodes? Are you writing the nodes yourself or using provided nodes? If it’s a Python or C++ node you wrote, you should be able to add some timestamps.

Also, can you provide a screenshot of your graph, so we can get a more concrete idea of the nodes and process you are using? Thanks.
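One way to add such timestamps without printing on every tick (printing has a cost of its own) is to accumulate the timings and report a summary now and then. This is a minimal sketch in plain Python; `TickTimer` and `report_every` are hypothetical names, not part of OmniGraph.

```python
import time


class TickTimer:
    """Accumulates wall-clock durations and prints a summary every N samples."""

    def __init__(self, label, report_every=100):
        self.label = label
        self.report_every = report_every
        self.count = 0
        self.total = 0.0
        self.worst = 0.0
        self._start = 0.0

    def __enter__(self):
        self._start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc, tb):
        elapsed = time.perf_counter() - self._start
        self.count += 1
        self.total += elapsed
        self.worst = max(self.worst, elapsed)
        if self.count % self.report_every == 0:
            avg_ms = 1000 * self.total / self.count
            print(f"{self.label}: avg {avg_ms:.3f} ms, "
                  f"worst {1000 * self.worst:.3f} ms over {self.count} calls")
        return False  # never swallow exceptions raised inside the timed block
```

A module-level `timer = TickTimer("positioning")` could then wrap the expensive call inside the node's compute function with `with timer: ...`, so the average and worst-case cost per tick show up in the console.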

Hi @qwan,

I’ve written the nodes myself in Python. Each node represents a scene object and loads the corresponding USD file on creation. Two nodes can be connected to position the objects relative to each other.
[Screenshot: nodes before connection]

[Screenshot: nodes after connection (objects are positioned relative to each other)]

The relative positioning consumes the most processing power. Here is a table of the effect those nodes have on the fps:

| Number of nodes | Nodes connected | Nodes not connected |
|-----------------|-----------------|---------------------|
| 0               | 90 fps          | 90 fps              |
| 1               | n/a             | 85 fps              |
| 2               | 24 fps          | 72 fps              |
| 3               | 14 fps          | 67 fps              |
| 4               | 12 fps          | 62 fps              |
| 5               | 9 fps           | 60 fps              |

This is the compute function of the nodes:

    @staticmethod
    def compute(db) -> bool:
        # Compute the outputs from the current inputs.
        # Requires `import time` at module level.
        try:
            node = db.node

            # Time the positioning step and print the duration on every tick
            start = time.time()
            compute_positioning(node, scene_path_attribute_name="scene_path")
            end = time.time()
            print("exec_time_KIRoPro_AdhocMRK_Node", end - start)

        except Exception as error:
            # If anything causes the compute to fail, report the error and return False
            db.log_error(str(error))
            return False

        # Report success so the declared bool return type is honored
        return True

This is the compute_positioning function:

def compute_positioning(node_obj, scene_path_attribute_name):
    connected_flange_dic = find_connected_flange(node_obj)
    absolute_pos_attribute_name = "start_position"
    absolute_ori_attribute_name = "start_orientation"
    positional_shift_attribute_name = "positional_shift"
    orientational_shift_attribute_name = "orientational_shift"

    if connected_flange_dic is not None:
        input_flange_value = connected_flange_dic['input_flange_value']
        input_flange_name = connected_flange_dic['input_flange_name']
        prev_node_obj = connected_flange_dic['parent_node']
        child_object_scene_path = get_attribute_value(node_obj=node_obj, attribute_name=scene_path_attribute_name)
        parent_object_scene_path = get_attribute_value(node_obj=prev_node_obj, attribute_name=scene_path_attribute_name)
        parent_flange_name = input_flange_value
        child_flange_name = input_flange_name
        positional_shift = get_attribute_value(node_obj=node_obj, attribute_name=positional_shift_attribute_name)
        orientational_shift = get_attribute_value(node_obj=node_obj, attribute_name=orientational_shift_attribute_name)
        align_object_flange(child_object_scene_path, parent_object_scene_path, parent_flange_name, child_flange_name, positional_shift, orientational_shift)
    else:
        absolute_position = get_attribute_value(node_obj=node_obj, attribute_name=absolute_pos_attribute_name)
        absolute_orientation = get_attribute_value(node_obj=node_obj, attribute_name=absolute_ori_attribute_name)
        object_scene_path = get_attribute_value(node_obj=node_obj, attribute_name=scene_path_attribute_name)
        position_prim(object_scene_path, absolute_position, absolute_orientation)
                
    return

Without a connection to another node (if statement is false), this function takes 0.0005 s to 0.0015 s to compute.

With a connection to another node (if statement is true), it takes about 0.0015 s to 0.05 s to compute.
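As a rough cross-check of these timings against the fps table, the frame rates can be converted to frame times. The sketch below uses only the connected-node numbers from the table; splitting the added frame time evenly across nodes is a simplifying assumption.

```python
def frame_time_ms(fps):
    """Convert frames per second to milliseconds per frame."""
    return 1000.0 / fps


# Connected-node measurements from the table above: node count -> fps.
connected_fps = {0: 90, 2: 24, 3: 14, 4: 12, 5: 9}

baseline_ms = frame_time_ms(connected_fps[0])
added_ms_per_node = {}
for nodes, fps in connected_fps.items():
    if nodes == 0:
        continue
    # Extra frame time relative to the empty scene, split evenly per node.
    added_ms_per_node[nodes] = (frame_time_ms(fps) - baseline_ms) / nodes
    print(f"{nodes} nodes: {frame_time_ms(fps):.1f} ms/frame, "
          f"about {added_ms_per_node[nodes]:.1f} ms added per node")
```

With these numbers, each connected node adds roughly 15 to 20 ms per frame, which falls inside the measured range of 1.5 ms to 50 ms for the connected case.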

I am not sure how computation is handled in Omnigraph, but is it possible to decouple the node execution somehow, perhaps in a separate thread? It’s completely fine if the positioning lags a bit behind (for example, an update every second), but I don’t want the rest of the system to be affected so strongly.
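If an update once per second is enough, one option that avoids threading altogether is to throttle the expensive call inside compute. The following is a minimal sketch in plain Python; `Throttle` and `should_update` are hypothetical helpers, not OmniGraph APIs, and the node key is assumed to be any value that uniquely identifies a node (for example its scene path).

```python
import time


class Throttle:
    """Lets a piece of work run at most once per `interval_s` seconds."""

    def __init__(self, interval_s=1.0, clock=time.monotonic):
        self.interval_s = interval_s
        self._clock = clock
        self._last_run = None  # None means "never ran", so the first call passes

    def ready(self):
        now = self._clock()
        if self._last_run is None or now - self._last_run >= self.interval_s:
            self._last_run = now
            return True
        return False


# Hypothetical per-node registry: node key -> Throttle.
_throttles = {}


def should_update(node_key, interval_s=1.0):
    """Return True at most once per interval for the given node key."""
    throttle = _throttles.setdefault(node_key, Throttle(interval_s))
    return throttle.ready()
```

Inside compute, the positioning call would then be guarded with `if should_update(key): ...`. The graph still ticks every frame, but on most ticks only the cheap time check runs. This does not move any work to another thread; it only skips the expensive call.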

A push graph ticks constantly, so once you connect two objects, you are recalculating their relative position on every single tick, even when they are stationary. Is that what you want? Or do you want it to tick once when the connection is made, or when the position of one of the objects has changed?

You can test the theory by setting a variable that tracks whether you’ve already run the computation once (resetting it if the previous object’s position changes), and see if that helps with the frame rate. There might be cleaner ways to do this kind of update, and we are looking into it. Meanwhile, using some tracking variables can help us determine where your slowdown is.

Hi @qwan,

Thank you for the feedback.

My first implementation was with action graph nodes. I subscribed to the connection events (as outlined in this post: How to create a new omnigraph environment - Platform / Kit - NVIDIA Developer Forums) and although performance was good, there were some drawbacks:

  • The on_connection event triggers only when a node’s input connection changes.
  • I couldn’t find an event that activates upon input value modification.
  • The positions of downstream objects depend on the positions of upstream objects, so when an upstream object moves, the downstream objects should adjust accordingly. The connection events do not handle this.

Thus, when adjusting the positional coordinates, defined as an input variable of the node, no action is triggered since there’s no event for that. Updates only occur upon new connections or disconnections of the inputs.

An alternative is to bypass the connection event in ActionGraph nodes, manage everything within the compute function, and use a tick node to trigger the compute function once per second. This approach resolves most issues, but it adds an extra connection to each node for the event input and requires a tick node, which complicates the setup.

Have you tried saving the previous node’s position to an internal variable and checking whether it has changed before calculating the relative position?
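A minimal sketch of that check in plain Python follows. `inputs_changed` and the module-level cache are hypothetical, not part of OmniGraph, and how the parent's current position is read is left to the existing helper functions, since the thread does not show it.

```python
import math

# Hypothetical cache: node key -> flat tuple of the input values used last time.
_last_inputs = {}


def _close(a, b, tol=1e-6):
    """Compare two flat tuples of floats within an absolute tolerance."""
    return len(a) == len(b) and all(
        math.isclose(x, y, abs_tol=tol) for x, y in zip(a, b)
    )


def inputs_changed(node_key, *values):
    """Return True if any value differs from those stored for this node.

    Each value is expected to be a sequence of numbers (position,
    orientation, shift, ...). New values are stored only when a change
    is detected, so slow drift is still noticed once it exceeds the tolerance.
    """
    current = tuple(float(x) for value in values for x in value)
    previous = _last_inputs.get(node_key)
    if previous is not None and _close(previous, current):
        return False
    _last_inputs[node_key] = current
    return True
```

In the connected branch, the alignment call would run only when `inputs_changed(key, parent_position, parent_orientation, positional_shift, orientational_shift)` returns True, so stationary objects cost one cheap comparison per tick.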