Some questions about CGF JSON Descriptor

Please provide the following info (tick the boxes after creating this topic):
Software Version
[*] DRIVE OS 6.0.6
DRIVE OS 6.0.5
DRIVE OS 6.0.4 (rev. 1)
DRIVE OS 6.0.4 SDK
other

Target Operating System
[*] Linux
QNX
other

Hardware Platform
DRIVE AGX Orin Developer Kit (940-63710-0010-D00)
DRIVE AGX Orin Developer Kit (940-63710-0010-C00)
[*] DRIVE AGX Orin Developer Kit (not sure of its number)
other

SDK Manager Version
1.9.2.10884
other

Host Machine Version
[*] native Ubuntu Linux 20.04 Host installed with SDK Manager
native Ubuntu Linux 20.04 Host installed with DRIVE OS Docker Containers
native Ubuntu Linux 18.04 Host installed with DRIVE OS Docker Containers
other
1. If the “processorTypes” for a pass is set to “DLA” in node.json, the corresponding mapped resource in the automatically generated STM YAML file is “CUDLA_STREAM”. However, according to Nvidia-STM-Userguide.pdf, the mapped resource for DLA should be “DLA_HANDLE”.
2. The descriptionScheduleYamlGenerator.py tool may not have been updated to support the “VPA” processor type yet; it currently supports “VPI”. Are the two equivalent?
3. How are MUTEX software resources declared in app.json assigned to the corresponding passes? Are there any default allocation rules?
4. We did not find any examples of using “passDependencies” in app.json. Could you please provide some examples?

Dear @jiangxiaoke,

Do you mean the PVA accelerator?
I am checking on the other queries internally and will update you.

Yes, it should be PVA.

Can you share the example file for CUDLA_STREAM? Can you point out where you saw “DLA_HANDLE” in the Compute Graph Framework SDK Reference?

Yes, the Vision Programming Interface (VPI) is used to perform computer vision operations on the PVA (Programmable Vision Accelerator) hardware engine. Can you please share the file or document where you encountered the term “VPI”? This will help us provide more accurate information.

Can you elaborate on this question?

My example node.json:

{
    "comment": "Generated by the nodedescriptor tool based on data provided by the C++ API of the node class",
    "generated": true,
    "library": "libdwcgf_helloworld.so",
    "name": "dw::framework::HelloWorldNode",
    "inputPorts": {
    },
    "outputPorts": {
        "VALUE_0": {
            "type": "int",
            "bindingRequired": true
        },
        "VALUE_1": {
            "type": "int",
            "bindingRequired": true
        }
    },
    "parameters": {
        "name": {
            "type": "std::string"
        }
    },
    "passes": [
        {
            "name": "SETUP",
            "processorTypes": [
                "CPU"
            ]
        },
        {
            "name": "PROCESS",
            "processorTypes": [
                "GPU"
            ]
        },
        {
            "name": "PROCESS2",
            "processorTypes": [
                "DLA"
            ]
        },
        {
            "name": "PROCESS3",
            "processorTypes": [
                "PVA"
            ]
        },
        {
            "name": "TEARDOWN",
            "processorTypes": [
                "CPU"
            ]
        }
    ]
}

1. I did not see any information about DLA_HANDLE in the CGF documentation. The description of DLA_HANDLE in Nvidia-STM-Userguide.pdf is as follows:

Resource Type: CUDA Stream, DLA Handle, PVA Stream
CUDA streams, DLA handles, and PVA streams are client-specific software resources that are mapped to corresponding hardware engines (GPU, DLA, and VPU respectively). To specify these resources, the resource types should be set to CUDA_STREAM, DLA_HANDLE, or PVA_STREAM respectively. The hardware engine mapping is conveyed to the compiler when specifying the resource instances as shown in the example below. The specified hardware resource instances should be specified under the corresponding hardware resource type in the Global Resources section. Note that the compiler will throw an error if the limits on the mapped resource (as specified in section 3.1.1.6) are violated.

Clients:
  - Client0:
      Resources:
        CUDA_STREAM:
          - CUDA_STREAM0: GPU0 # CUDA_STREAM0 mapped to GPU0
          - CUDA_STREAM1: GPU0 # CUDA_STREAM1 mapped to GPU0
        DLA_HANDLE:
          - DLA_HANDLE0: DLA1 # DLA_HANDLE0 mapped to DLA1
        PVA_STREAM: # A client can have one unique stream per VPU
          - PVA_STREAM0: VPU0 # PVA_STREAM0 mapped to VPU0

Resource Type: Local Scheduling Mutex
Resource types other than those specified in section 3.1.1.3 above are treated as local scheduling mutexes. These cannot be mapped to a hardware resource.

Clients:
  - Client0:
      Resources:
        LOCAL_SCHED_MUTEX:
          - LOCAL_SCHED_MUTEX0
        LOCAL_RESOURCE_MUTEX:
          - RESOURCE_MUTEX0

According to the description above, CUDLA_STREAM will be treated as a local scheduling mutex resource rather than as a DLA handle. This is inconsistent with my expectations, so I am confused.

2. The description of the Pass processor type in driveworks-5.10/tools/schema/node.schema.json:

"processorTypes": {
                    "description": "The processor types used by the pass (support is limited to a single processor type atm)",
                    "type": "array",
                    "minItems": 1,
                    "maxItems": 1,
                    "items": {
                        "enum": [
                            "CPU",
                            "GPU",
                            "DLA",
                            "PVA"
                        ]
                    }

and in driveworks-5.10/tools/descriptionScheduleYamlGenerator/descriptionScheduleYamlGenerator.py:

    def __getGlobalResource(hyperepochs, machineName):
        ret = {}
        for hyperepoch, hyperepochDesc in hyperepochs.items():
            for res in hyperepochDesc.get("resources", {}).keys():
                ids = res.split(".")
                # ids[0] could be machine name, client name or resource name
                if ids[0] == machineName:
                    resourceName = ids[1]
                    if ":" in resourceName:
                        raise RuntimeError("Hardware resource name cannot be mapped")
                    resType = ScheduleDescription.__determineResourceType(resourceName)
                    if not resType in ("CPU", "GPU", "VPI", "DLA"):
                        raise RuntimeError("Only hardware resources can be specified with machine name")

I suggest updating the descriptionScheduleYamlGenerator.py tool and keeping it in sync with STM regarding the two issues above.

3. Mutex resources in STM are used to prevent concurrent execution of runnables (passes in CGF) that own the same mutex resource:

Runnables:
  ......
  - dwcgfHelloworld_helloWorldNode_pass_1:
      ......
      Resources:
        - CPU
        - CUDA_MUTEX_LOCK
  - dwcgfHelloworld_helloWorldNode_pass_2:
      Resources:
        - CPU
        - CUDA_MUTEX_LOCK

I did not find any configuration method in the CGF *.app.json for associating mutex resources with passes (see the sketch below for the only resource declaration point I am aware of). I need to know how to precisely assign mutex resources to the relevant passes.
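For reference, a minimal sketch of where I would expect such a declaration to live, based on the hyperepoch "resources" section that the generator code above reads. All names here are my own placeholders, and the exact syntax is an assumption on my part:

{
    "comment": "Illustrative sketch only; schedule, hyperepoch, client, and resource names are placeholders",
    "stmSchedules": {
        "standardSchedule": {
            "hyperepochs": {
                "helloworldHyperepoch": {
                    "period": 100000000,
                    "resources": {
                        "machine0.iGPU": [],
                        "client0.CUDA_STREAM0:CUDA_MUTEX_LOCK": []
                    }
                }
            }
        }
    }
}

Even with a fragment like this, it is unclear how a specific CUDA_MUTEX_LOCK ends up attached to a specific pass in the generated YAML.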

Dear @jiangxiaoke,

  1. Earlier, when DLA was used, we had to choose one of the two engines explicitly (DLA_0 vs. DLA_1). We have since switched to using only CUDLA in all the nodes, so the former should no longer be used.
  2. We have updated descriptionScheduleYamlGenerator.py to support the PVA processor type in the next release.
  3. For each CUDA stream there is a corresponding CUDA mutex lock, which is listed as a resource in a hyperepoch.
  4. I don’t have an example at hand or in the documentation; I will check and update you. It basically allows adding extra scheduling dependencies to influence the graph STM uses to determine the execution order (see the sketch after this list).
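As a rough, unverified illustration only (the structure below is my guess at the shape, and the node/pass names reuse the HelloWorldNode example from this thread):

{
    "comment": "Unverified sketch; the exact schema for passDependencies may differ",
    "passDependencies": [
        {
            "name": "helloWorldNode.PROCESS2",
            "dependencies": [
                "helloWorldNode.PROCESS"
            ]
        }
    ]
}

The intent would be that PROCESS2 is not scheduled before PROCESS has completed, even when no data connection exists between the two passes.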

Thank you for bringing the issue to our attention. We will work on updating the documents to clarify this.
