Inputs of the Transducer model from NeMo

Hi, I’m trying to implement ASR inference with ONNX Runtime and TensorRT using a Conformer-Transducer model. I read the model description but don’t understand some points. Could you please explain the purpose of the transducer’s inputs and outputs?
What is the TARGETS input? Is it the Joint part of the network?
What is ENCODER_OUTPUTS? Is it a tensor of token probabilities from the encoder?
Why do I need INPUT_STATES_1, INPUT_STATES_2 and OUTPUT_STATES_1, OUTPUT_STATES_2?