Strange return value from ioctl VIDIOC_STREAMON

We’re experiencing an issue where the VIDIOC_STREAMON ioctl call occasionally returns a large negative value while errno is 0. This is on JetPack 4.6. For example:

[ 8091.299611] video1: VIDIOC_STREAMON: error -970440680: type=vid-cap
[ 8091.299686] video1: VIDIOC_STREAMOFF: type=vid-cap

Looking through capture_vi_channel.c, it seems to be failing in vi_channel_open_ex() at this point (it gets past the vi_channel_power_on_vi_device call just before this):

	mutex_lock(&chan_drv->lock);
	if (rcu_access_pointer(chan_drv->channels[channel]) != NULL) {
		mutex_unlock(&chan_drv->lock);
		err = -EBUSY;
		goto rcu_err;
	}

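And the return value would come from PTR_ERR(chan) here. Note that this converts chan, which is a valid pointer, rather than the failed chan->tegra_vi_channel[vi_port], which would explain why we see a large negative value instead of -EBUSY: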

	chan->tegra_vi_channel[vi_port] = vi_channel_open_ex(chan->id + vi_port, false);
	if (IS_ERR(chan->tegra_vi_channel[vi_port])) {
		ret = PTR_ERR(chan);
		goto err_open_ex;
	}
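
To illustrate why PTR_ERR() on the wrong pointer could produce a value like the one in the log, here is a minimal user-space sketch. It is not the driver code: ERR_PTR, PTR_ERR and IS_ERR are simplified stand-ins for the kernel's err.h helpers, and struct fake_chan, open_port_buggy and open_port_fixed are hypothetical names made up for this example. It assumes the error path shown above is the one being hit.

```c
#include <errno.h>
#include <stdint.h>

/* Simplified user-space stand-ins for the kernel's err.h helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	/* Error pointers occupy the top MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for the channel struct; only the pointers are modeled. */
struct fake_chan {
	void *tegra_vi_channel[2];
};

/* Mirrors the quoted error path: the error is read from the wrong pointer. */
static int open_port_buggy(struct fake_chan *chan, int vi_port, void *opened)
{
	chan->tegra_vi_channel[vi_port] = opened;
	if (IS_ERR(chan->tegra_vi_channel[vi_port]))
		return (int)PTR_ERR(chan); /* valid address truncated to int */
	return 0;
}

/* Assumed fix: read the error from the pointer that actually failed. */
static int open_port_fixed(struct fake_chan *chan, int vi_port, void *opened)
{
	chan->tegra_vi_channel[vi_port] = opened;
	if (IS_ERR(chan->tegra_vi_channel[vi_port]))
		return (int)PTR_ERR(chan->tegra_vi_channel[vi_port]);
	return 0;
}
```

Passing ERR_PTR(-EBUSY) as the opened value, open_port_fixed returns -EBUSY, while open_port_buggy returns the address of the struct truncated to an int, which is an arbitrary number that can easily come out as a large negative value. This only covers the odd return code; it does not explain why the channel slot is still occupied.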

I’m adding dev_err calls in vi_channel_open_ex to confirm, but was wondering why this might be happening? It looks like chan_drv->channels[channel] is cleared in vi_channel_close_ex. Is it possible that’s not happening for some reason?

Does the v4l2-ctl utility have the same problem?

I was able to confirm that it’s failing at the rcu_access_pointer(chan_drv->channels[channel]) != NULL check. Interestingly, Argus is still able to stream all the cameras even though v4l2 fails for one of them. From the logging, it looks like Argus doesn’t use a specific channel index for a given camera, but searches up from 0 until it finds the first one that isn’t busy. Is that correct? So Argus just skips over the channel for which v4l2 VIDIOC_STREAMON fails for one of the cameras. I’m still investigating how it happens that chan_drv->channels[channel] isn’t NULL.