Fast access to GPIO

Dear All,

In my project I have to connect dedicated hardware, a PAL video input/output chip, via GPIOs on the J3A1 and J3A2 pin headers. To do this, I decided to write a Linux kernel module. I started by registering the GPIOs with:

int gpio_request(unsigned gpio, const char *label)

Then I set their directions as input or output with:

int gpio_direction_input(unsigned gpio)
int gpio_direction_output(unsigned gpio, int value)

Then I perform a test read of 8 pins with my custom test function:

int read_vdox(void){
    int val = 0;

    val = val | (gpio_get_value(VDOX0) << 0);
    val = val | (gpio_get_value(VDOX1) << 1);
    val = val | (gpio_get_value(VDOX2) << 2);
    val = val | (gpio_get_value(VDOX3) << 3);
    val = val | (gpio_get_value(VDOX4) << 4);
    val = val | (gpio_get_value(VDOX5) << 5);
    val = val | (gpio_get_value(VDOX6) << 6);
    val = val | (gpio_get_value(VDOX7) << 7);

    return val;
}

(I know that if gpio_get_value returns a nonzero value other than one, the result may not be correct, but this is not the case in this test.)
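One way to make the packing independent of what the accessor returns is to normalize each read to 0 or 1 with `!!` before shifting. The following is a minimal user-space sketch, not kernel code: `fake_gpio_get_value` and the `pin_levels` table are hypothetical stand-ins for `gpio_get_value` and the real VDOX pins.

```c
#include <assert.h>

#define VDOX_WIDTH 8

/* Hypothetical stand-in for the hardware: one raw level per VDOX pin.
 * Nonzero values other than 1 mimic an accessor that returns a masked bit. */
static int pin_levels[VDOX_WIDTH];

/* Hypothetical stand-in for gpio_get_value(); returns the raw level. */
static int fake_gpio_get_value(unsigned pin)
{
    assert(pin < VDOX_WIDTH);
    return pin_levels[pin];
}

/* Pack the eight pins into one byte, normalizing each read to 0 or 1
 * so that a raw value such as 4 or 0x80 still contributes a single bit. */
static int read_vdox_normalized(void)
{
    int val = 0;
    unsigned i;

    for (i = 0; i < VDOX_WIDTH; i++)
        val |= (!!fake_gpio_get_value(i)) << i;

    return val;
}
```

The normalization adds no extra GPIO accesses, so it does not change the read time being measured.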

To test the read time, I call this function in a loop 1440*625 times (total number of YCrCb pixels obtained from the PAL signal) and measure the execution time with jiffies.

start = jiffies;
for(i=0; i<1440*625; i++)
    test_vdox = read_vdox();
stop = jiffies;
msec = (stop - start)*1000 / HZ;
printk(KERN_ALERT "pins-test: VDOX value is %d, read in %d\n", test_vdox, (int)msec);

The resulting time is about 3 seconds, which is definitely too long. To achieve the task, I need to read the whole image in about 20 ms. (The code above tests only the GPIO read time and does not deal with the details of the video transfer protocol.)
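To put these numbers in perspective, the per-call cost and the per-sample budget can be worked out from the figures above (1440*625 samples, 8 reads per sample, about 3 s measured, 20 ms target). The helper name `ns_per_item` below is made up for illustration, and the 3 s figure is only approximate.

```c
#include <assert.h>

/* Integer nanoseconds spent per item, given a total time and an item count. */
static long long ns_per_item(long long total_ns, long long items)
{
    assert(items > 0);
    return total_ns / items;
}

/* Figures from the post: 1440*625 samples per frame, 8 pin reads per sample. */
static const long long SAMPLES_PER_FRAME = 1440LL * 625LL;
static const long long READS_PER_SAMPLE  = 8LL;

/* Measured: about 3 s for the whole loop, spread over every pin read. */
static long long measured_ns_per_read(void)
{
    return ns_per_item(3000000000LL, SAMPLES_PER_FRAME * READS_PER_SAMPLE);
}

/* Target: 20 ms for the whole frame, i.e. the budget for one 8-bit sample. */
static long long budget_ns_per_sample(void)
{
    return ns_per_item(20000000LL, SAMPLES_PER_FRAME);
}
```

With these inputs each gpio_get_value call costs roughly 416 ns, while the budget for a complete 8-bit sample is about 22 ns, so the loop would have to run about 150 times faster.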

Is there any way to access the GPIOs faster? Or am I doing something wrong?

I guess that the internals of gpio_get_value may issue memory barriers, so I pay for 8 of them instead of 1, but my dive into the kernel source code and documentation didn't give me any details.

Looking forward to any answers.


  1. I suppose the GPIO access takes so long because of OS scheduling. You can break down the call chain from gpio_get_value() down to tegra_gpio_get() in "gpio-tegra.c".

  2. Could you give more detail about your purpose, so we can see whether there is another way to do it?


Thank you for suggestions.

  1. Could you give me a tip on how to use tegra_gpio_get() from my loadable module? It is defined as static:
static int tegra_gpio_get(struct gpio_chip *chip, unsigned offset)

So as far as I understand, it can be used only within the same compilation unit, not from a loadable kernel module.

  2. My task is to communicate with a PAL signal converter connected to the TK1 via GPIOs on the J3A1 and J3A2 headers. The details are defined by the ITU-R BT.60 specification, but without going into too much detail, it can be described in a few points:
  • The device is configured over I2C
  • There are two data input ports and one output, but let's speak now only about one input - VDOX
  • The VDOX pins (there are 8 of them) are connected to the TK1 GPIO ports Y, K, C and R
  • I have to read the VDOX values when a high level appears on the CLKVDOX pin (TEGRA GPIO W5)
  • CLKVDOX toggles at a frequency of 54 MHz, so the pins have to be read fast enough

In my first post I described a read-speed test, without interrupts on CLKVDOX.

  1. I have no idea about your first question, but you can try exposing the function with EXPORT_SYMBOL.
  2. Another way to access the GPIOs is through memory-mapped I/O. You can find the GPIO base address in the device tree and refer to gpio-tegra.c to see how to access the registers.
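The appeal of the memory-mapped approach is that one 32-bit load of a port's input register returns all eight pins of that port at once, so pins spread over four ports need at most four loads instead of eight function calls. Below is a user-space model of the address arithmetic only. The shifts, the `0x30` input-register offset and the `0x100` bank stride are assumptions based on the macros in gpio-tegra.c as I recall them, so check them against your kernel source; the register window is a plain array here, whereas a real driver would obtain it with `ioremap()` and read it with `readl()`.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMED register layout; verify against gpio-tegra.c in your kernel tree. */
#define BANK_STRIDE    0x100u  /* assumption: distance between banks      */
#define GPIO_IN_OFFSET 0x30u   /* assumption: input register within bank  */

static unsigned gpio_bank(unsigned gpio) { return gpio >> 5; }
static unsigned gpio_port(unsigned gpio) { return (gpio >> 3) & 0x3u; }
static unsigned gpio_bit(unsigned gpio)  { return gpio & 0x7u; }

/* Byte offset of the input register for the port that contains `gpio`. */
static unsigned gpio_in_offset(unsigned gpio)
{
    return gpio_bank(gpio) * BANK_STRIDE + gpio_port(gpio) * 4u + GPIO_IN_OFFSET;
}

/* Read all eight pins of the port containing `gpio` with a single load.
 * `regs` models the mapped GPIO register window as an array of words. */
static uint8_t read_port(const volatile uint32_t *regs, unsigned gpio)
{
    assert(regs);
    return (uint8_t)(regs[gpio_in_offset(gpio) / 4u] & 0xffu);
}

/* Extract one pin from a port value that was already read. */
static int pin_from_port(uint8_t port_value, unsigned gpio)
{
    return (port_value >> gpio_bit(gpio)) & 1;
}
```

Under these assumptions, GPIO number 181 (port W, bit 5, if ports are numbered A = 0 upward) maps to byte offset 0x538. Bypassing gpiolib this way also bypasses its bookkeeping, so the pins should still be requested and configured through the normal API first.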

Hi chechli,

How is this issue going on your project now?
Have you clarified the cause and resolved the problem?



I am now working on a DMA implementation and hope this will solve the problem.

So far I have performed experiments with an external pulse generator, showing that the maximum safe frequency of GPIO interrupts in a kernel module (with a standard IRQ) is 1 kHz. At 10 kHz about 1% of the signal's rising edges were lost. This result led me to try DMA.

I will let you know if it succeeds; however, I am not an expert in this topic, and it takes me some time to learn such advanced techniques.

Best regards

Regarding GPIO access with DMA: some information about implementing DMA with GPIO can be found for the Raspberry Pi, but not for the Jetson.

The documentation describes DMA buffer allocation and working with PCI and ISA devices well. As can be found here:
there is no DMA engine for PCIe, so all the driver has to do is allocate a buffer, send its physical address to the PCIe device, and wait for the completion interrupt.

But what about GPIOs? Is there a DMA controller for GPIO, or some other way to perform the following task:

  • An external device sets data on 8 GPIOs and signals this on another GPIO (called CLKVDOX)
  • CLKVDOX toggles at a frequency of 54 MHz, let's say 100 000 times
  • On each CLKVDOX tick, 8 bits of data are moved to the next position in the DMA buffer and a counter is incremented
  • After 100 000 CLKVDOX ticks have been received, an interrupt is triggered and the driver can read the buffered data, which is 100 000 bytes in size
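The intended data flow in the list above can be modelled in plain C to make the requirement concrete. This is only a software model of the behavior wanted from the hardware, not a DMA driver; `struct capture`, `capture_init` and `capture_tick` are hypothetical names, and the `done` flag stands in for the completion interrupt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct capture {
    uint8_t *buf;      /* destination buffer, one byte per clock tick   */
    size_t capacity;   /* number of samples to collect, e.g. 100000     */
    size_t count;      /* samples stored so far (the "counter")         */
    int done;          /* set when the buffer is full ("the interrupt") */
};

static void capture_init(struct capture *c, uint8_t *buf, size_t capacity)
{
    assert(c && buf && capacity > 0);
    c->buf = buf;
    c->capacity = capacity;
    c->count = 0;
    c->done = 0;
}

/* Called once per CLKVDOX tick with the byte currently on the 8 data pins.
 * Returns 1 on the tick that fills the buffer, 0 otherwise. */
static int capture_tick(struct capture *c, uint8_t data)
{
    if (c->done)
        return 0;               /* buffer already full; ignore extra ticks */

    c->buf[c->count++] = data;
    if (c->count == c->capacity) {
        c->done = 1;            /* the point where the driver is notified */
        return 1;
    }
    return 0;
}
```

In this model the CPU does the copying on every tick, which is exactly the work that would have to be offloaded to hardware at a 54 MHz tick rate.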

Is this possible, or should it rather be done with an external DMA engine that sends the data somewhere other than the GPIOs?

Hi chechli,
Sorry to let you know, but the GPIO controllers are not connected to the APBDMA controller, so you cannot use APBDMA for this test.