HDMI output video/audio synchronization

Hello,

We are measuring the HDMI output of the Jetson Nano and have noticed that audio and video are not in sync: the video clock drifts relative to the audio clock. We have checked that the audio clock and the CPU clock are synchronized, but the video clock is not. PLLD2 is configured to produce a pixel clock of 27 MHz, 74.25 MHz or 148.5 MHz (depending on the output resolution), and the division leaves no remainder that could result in an incorrect pixel clock.
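As a sanity check on the nominal rates, here is a minimal sketch that derives the frame rate from the pixel clock. It assumes the standard CEA-861 total frame sizes (active plus blanking), which are not stated in this post; the function name frame_rate is only illustrative.

```shell
#!/bin/bash
# frame_rate PIXEL_CLOCK_HZ H_TOTAL V_TOTAL
# Nominal frame rate = pixel clock / (total pixels per line * total lines).
# H_TOTAL and V_TOTAL include blanking (assumed CEA-861 values below).
frame_rate() {
  awk -v pclk="$1" -v ht="$2" -v vt="$3" 'BEGIN {
    printf "%.6f\n", pclk / (ht * vt)
  }'
}

frame_rate 74250000 1980 750     # 720p50
frame_rate 74250000 1650 750     # 720p60
frame_rate 148500000 2640 1125   # 1080p50
```

With these totals the three formats come out at exactly 50, 60 and 50 frames per second, so any drift has to come from the real clock deviating from its nominal value rather than from the timing arithmetic.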

Even more surprising to us, we get different drifts for different output formats that share a pixel clock: 720p50 drifts more than 720p60, yet the same as 1080p50, which uses a different pixel clock.

Does anyone know why this happens? Is there any other configuration that we can check?

Best regards, Dani.

Hi,

Is this issue reproducible on devkit?

Hi,

Yes, it is. We have tested with JP4.6.1.

We use GStreamer to test the HDMI output and measure the clocks using two methods, which give the same results:

  • debug messages
  • a gray/tone detector connected to an oscilloscope.

We run two pipelines. The first one is for video. We want the sink to consume buffers at the rate of the HDMI video port, so we use sync=0. With sync=1, GStreamer feeds the sink only once the pipeline clock reaches the timestamp of each buffer. With sync=0, GStreamer ignores timestamps and always tries to feed the sink; the driver is then responsible for making it wait until the frame can be written to the port. We have tested nvdrmvideosink and nvoverlaysink, and both behave the same. For the source we use a modified videotestsrc in which the blink pattern inserts the blink frame every N (configurable) frames instead of every second frame.

The second one is for audio and uses alsasink connected to tegrahda (device=hw:tegrahda,3). We use sync=0 here too, for the same reasons, although it could be omitted because GStreamer uses the alsasink buffer consumption rate as the pipeline clock. In any case, when we measure this rate against the CPU clock we see that they are synchronized: when the port has consumed the samples corresponding to N seconds of audio, the CPU clock has advanced N seconds too. For the source we use audiotestsrc with ‘wave=ticks’.

We use the GStreamer debug messages to measure the time between the instants at which consecutive buffers are sent to the video/audio device. This method has a lot of jitter, so we average many measurements to get a better result.
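The averaging idea can be sketched on its own as follows. This is a simplified illustration, not the script we actually ran: it assumes one timestamp in seconds per input line, and the function name mean_period_ppm is hypothetical. Because only the first and last timestamps enter the average, jitter of at most j seconds on each of them changes the mean period by at most 2j divided by the number of intervals, so the error shrinks as the run gets longer.

```shell
#!/bin/bash
# mean_period_ppm IDEAL_PERIOD
# Reads one timestamp (in seconds) per line on stdin and prints:
#   <mean period between first and last timestamp> <deviation in ppm>
# Fails if fewer than two timestamps are supplied.
mean_period_ppm() {
  awk -v ideal="$1" '
    NR == 1 { t0 = $1 }   # remember the first timestamp
    { tn = $1 }           # keep track of the latest timestamp
    END {
      if (NR < 2) exit 1
      avg = (tn - t0) / (NR - 1)
      printf "%.9f %.3f\n", avg, 1e6 * (avg - ideal) / ideal
    }'
}

# Jittered intermediate samples do not affect the result:
printf '%s\n' 0 0.3 0.45 0.75 | mean_period_ppm 0.25
```

For instance, with an assumed 100 µs of timestamp jitter and a 1200 second run, the bound is 2 × 100 µs / 1200 s, or roughly 0.17 ppm.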

Also, connected to the HDMI output, we have a device that sets one GPIO to 1 when the blink frame is detected and another GPIO when the tone is detected. We connect the GPIOs to an oscilloscope to observe the audio/video synchronization.

In each case we see a drift of between -2.5 ppm and -3 ppm between audio and video.
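To put those figures in perspective, here is a rough calculation of how quickly the offset accumulates. It is only a sketch: it assumes the drift stays constant over time, and the function name drift_report is illustrative.

```shell
#!/bin/bash
# drift_report PPM FRAME_RATE
# Prints the A/V offset accumulated per hour (in ms) and the time
# needed to slip one full video frame (in s), for a constant drift.
drift_report() {
  awk -v ppm="$1" -v fps="$2" 'BEGIN {
    abs_ppm = (ppm < 0) ? -ppm : ppm
    ms_per_hour = abs_ppm * 1e-6 * 3600 * 1000
    secs_per_frame = (1.0 / fps) / (abs_ppm * 1e-6)
    printf "%.2f ms/hour, one frame every %.0f s\n", ms_per_hour, secs_per_frame
  }'
}

drift_report -2.8 50
```

At -2.8 ppm and 50 frames per second this is about 10 ms per hour, or one whole frame roughly every two hours.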

NOTE: In the original post I said that the drift was different for 720p60 and 720p50, but that was due to a mistake in the way we were measuring.
NOTE 2: You may need to customize the EDID of the HDMI output to be able to configure those video formats.

This is an example of the pipelines, with an awk script that calculates the periods at which video/audio buffers are rendered (I have removed the blink-n-frames property because it only exists in our modified version of videotestsrc, used for the oscilloscope measurement):

#!/bin/bash
# First change hdmi output format to 720p50:
echo 1 > /sys/class/graphics/fb0/blank
echo 2 > /sys/class/graphics/fb0/blank
echo U:1280x720p-50 > /sys/class/graphics/fb0/mode
echo 1 > /sys/class/graphics/fb0/blank
echo 0 > /sys/class/graphics/fb0/blank

# timeout: stop after 1200 seconds (20 minutes)
# chrt: switch to realtime priority to try to minimize jitter
# videotestsrc: generate 128x72 to reduce its CPU load; nvvidconv then scales it to 1280x720.
timeout 1200 chrt -r 50 gst-launch-1.0 audiotestsrc wave=ticks samplesperbuffer=2400 ! audio/x-raw,channels=2,rate=48000,format=S16LE ! alsasink sync=0 buffer-time=100000 latency-time=50000 device=hw:0,3 \
   videotestsrc pattern=blink ! video/x-raw,format=I420,width=128,height=72,framerate=50/1 ! nvvidconv ! "video/x-raw(memory:NVMM),format=I420,width=1280,height=720,framerate=50/1" ! queue ! nvoverlaysink sync=0 \
 --gst-debug=*basesink:5 2>&1 | grep "rendering obj" |  awk -v CONVFMT=%.17g '

function convert_time(t) {
  split(t,tt,":")
  return tt[3]+60*tt[2]+3600*tt[1];
}
 
BEGIN {
  v_t0 = 0;
  a_t0 = 0;
  v_total = 0;
  a_total = 0;
  v_buffers_sec = 50;
  a_buffers_sec = 20;
  v_drop = v_buffers_sec;
  a_drop = a_buffers_sec;
  v_ideal = 1.0/v_buffers_sec;
  a_ideal = 1.0/a_buffers_sec;
};

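/nvoverlaysink/ {
  # Skip the first second of buffers, then average over the rest.
  if(v_drop==0)
  {
    t1=convert_time($1);
    if(v_t0!=0)
    {
      v_avg=(t1-v_t0)/v_total
      v_deviation=(1000000*(v_avg-v_ideal))/v_ideal
      # Print only once an average exists (the first sample has none).
      print "V: avg: " v_avg " deviation(ppm): " v_deviation
    }
    else
    {
      v_t0=t1
    }
    v_total++;
  }
  else
  {
    v_drop=v_drop - 1;
  }
};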

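/alsasink0/ {
  # Skip the first second of buffers, then average over the rest.
  if(a_drop==0)
  {
    t1=convert_time($1);
    if(a_t0!=0)
    {
      a_avg=(t1-a_t0)/a_total
      a_deviation=(1000000*(a_avg-a_ideal))/a_ideal
      # Print only once an average exists (the first sample has none).
      print "A: avg: " a_avg " deviation(ppm): " a_deviation
    }
    else
    {
      a_t0=t1
    }
    a_total++;
  }
  else
  {
    a_drop=a_drop - 1;
  }
};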
'

Example output:

A: avg: 0.049999999679964872 deviation(ppm): -0.0064007026068235717
V: avg: 0.01999994374159221 deviation(ppm): -2.8129203894985966

Best regards, Dani.

Hi,

We have run more tests and found that if we configure a pixel clock that does not require the fractional part of PLLD2 (i.e. if we can disable the fractional divider with PLLD2_EN_SDM=0), the drift between audio and video does not happen. It seems that the fractional part of the PLL is not precise enough.

If the pixel clock does not allow the fractional part to be disabled, there is less drift when the input divider and the multiplier (MDIV and NDIV) are bigger (i.e. CF is smaller and FVCO is bigger).
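One possible way to see why a bigger NDIV helps is to look at the frequency step of one fractional LSB. The sketch below uses a generic fractional-N model, f_vco = f_ref / M × (N + frac / 2^bits), which is an assumption and not the documented PLLD2 register layout; the NDIV values and the 13 fractional bits are purely hypothetical, and the function name frac_step_ppm is illustrative.

```shell
#!/bin/bash
# frac_step_ppm NDIV FRAC_BITS
# Relative frequency change (in ppm) caused by one LSB of the fractional
# word in a generic fractional-N PLL (assumed model, see text):
#   f_vco = f_ref / M * (N + frac / 2^FRAC_BITS)
# The fractional part is neglected against N in the denominator.
frac_step_ppm() {
  awk -v n="$1" -v bits="$2" 'BEGIN {
    printf "%.3f\n", 1e6 / (n * 2 ^ bits)
  }'
}

frac_step_ppm 32 13   # hypothetical NDIV=32
frac_step_ppm 64 13   # hypothetical NDIV=64: half the step
```

In this model doubling NDIV halves the step, so the closest reachable frequency can sit nearer to the target. Whether this is the actual cause of the drift we measured is not something this sketch can confirm.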

Finally, we have decided to modify the kernel to continuously measure the time between calls to tegra_dc_continuous_irq, detect the deviation of PLLD2 from the desired frequency, and adjust the PLLD2 SDM value to tune it.
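The arithmetic behind such a correction could look roughly like the sketch below. It is not the kernel change itself: the function name sdm_correction and both parameters are hypothetical, and it assumes the frequency responds linearly to the fractional word with a known step per LSB. Note that the input is the frequency error, which to first order has the opposite sign of the period deviation printed by the awk script.

```shell
#!/bin/bash
# sdm_correction FREQ_ERROR_PPM STEP_PPM
# Number of fractional-word LSBs to add (negative means subtract) to
# cancel a measured frequency error, rounded to the nearest integer.
# STEP_PPM is the assumed frequency change per LSB, in ppm.
sdm_correction() {
  awk -v err="$1" -v step="$2" 'BEGIN {
    x = -err / step
    r = (x >= 0) ? int(x + 0.5) : -int(-x + 0.5)   # round half away from zero
    printf "%d\n", r + 0
  }'
}

# Period deviation of -2.8 ppm means the clock runs about 2.8 ppm fast:
sdm_correction 2.8 1.907
```

Since a whole number of LSBs rarely cancels the error exactly, a real implementation would presumably keep measuring and alternate between neighbouring values so that the long-term average matches the target.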

Best regards, Dani.