TX1 JPEG decoder error

Hi,
In our product we use a USB camera set to JPEG format, then use the NV JPEG decoder to decode the MJPEG frames and convert them to RGB.
However, on one module the output image shows repeated copies of the picture.
At the same time, the kernel prints an error log like this:

Jul 15 07:54:45 tegra-ubuntu kernel: [ 70.711180] ---- mlocks ----
Jul 15 07:54:45 tegra-ubuntu kernel: [ 70.714121]
Jul 15 07:54:45 tegra-ubuntu kernel: [ 70.715624] ---- syncpts ----
Jul 15 07:54:45 tegra-ubuntu kernel: [ 70.718628] id 14 (57000000.gpu_507) min 64068 max 64068 refs 1 (previous client : )
Jul 15 07:54:45 tegra-ubuntu rsyslogd-2007: action ‘action 9’ suspended, next retry is Mon Jul 15 07:55:15 2019 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
Jul 15 07:54:45 tegra-ubuntu kernel: [ 70.726427] id 15 (57000000.gpu_506) min 166 max 166 refs 1 (previous client : )
Jul 15 07:54:45 tegra-ubuntu kernel: [ 70.733886] id 16 (57000000.gpu_505) min 18 max 18 refs 1 (previous client : )
Jul 15 07:54:46 tegra-ubuntu kernel: [ 70.741162] id 17 (57000000.gpu_504) min 34 max 34 refs 1 (previous client : )
Jul 15 07:54:46 tegra-ubuntu kernel: [ 70.748445] id 18 (57000000.gpu_503) min 272 max 272 refs 1 (previous client : )
Jul 15 07:54:46 tegra-ubuntu kernel: [ 70.755904] id 20 (57000000.gpu_502) min 45746 max 45746 refs 1 (previous client : )
Jul 15 07:54:46 tegra-ubuntu kernel: [ 70.763712] id 21 (57000000.gpu_501) min 21016 max 21018 refs 1 (previous client : )
Jul 15 07:54:46 tegra-ubuntu kernel: [ 70.773213] id 22 (57000000.gpu_500) min 23192 max 23192 refs 1 (previous client : )
Jul 15 07:54:46 tegra-ubuntu kernel: [ 70.782782] id 23 (57000000.gpu_499) min 26332 max 26332 refs 1 (previous client : )
Jul 15 07:54:46 tegra-ubuntu kernel: [ 70.792372] id 24 (57000000.gpu_498) min 8644 max 8646 refs 1 (previous client : )
Jul 15 07:54:46 tegra-ubuntu kernel: [ 70.801892] id 25 (54340000.vic_CameraDecodeNod_0) min 3141 max 3141 refs 1 (previous client : )
Jul 15 07:54:46 tegra-ubuntu kernel: [ 70.812789] id 29 (54380000.nvjpg_CameraDecodeNod_0) min 1048 max 1048 refs 1 (previous client : )
Jul 15 07:54:46 tegra-ubuntu kernel: [ 70.824016] id 30 (57000000.gpu_497) min 2646 max 2646 refs 1 (previous client : )
Jul 15 07:54:46 tegra-ubuntu kernel: [ 70.833969] id 31 (57000000.gpu_496) min 7406 max 7408 refs 1 (previous client : )
Jul 15 07:54:46 tegra-ubuntu kernel: [ 70.844027] id 32 (57000000.gpu_495) min 2 max 2 refs 1 (previous client : )
Jul 15 07:54:46 tegra-ubuntu kernel: [ 70.852389] id 33 (57000000.gpu_494) min 2 max 2 refs 1 (previous client : )
Jul 15 07:54:46 tegra-ubuntu kernel: [ 70.860717] id 34 (57000000.gpu_493) min 27216 max 27220 refs 1 (previous client : )

The problem may occur in this function:

void gpuConvertYUYVtoRGB(unsigned char *Y, unsigned char *U, unsigned char *V, unsigned char *dst,
int y_stride, int u_stride, int v_stride, unsigned int width, unsigned int height);

The pipeline we use is USB camera (JPEG) -> decodeToFd -> gpuConvertYUYVtoRGB.
The YUV data after decodeToFd is OK, but after this function the image is repeated as several copies of the picture.

We have 500 TX1 modules running the same code, and so far only one has this problem.
We don't know whether this is a software problem or a hardware problem.

thanks!
image.bad.rar (1.07 MB)
syslog&&src&&imagebad.tar.7z (843 KB)

The syslog, the source we used, and the bad image are attached.

Hi,
Please narrow down which line triggers the segmentation fault in

void gpuConvertYUYVtoRGB(unsigned char *Y, unsigned char *U, unsigned char *V, unsigned char *dst,
		int y_stride, int u_stride, int v_stride, unsigned int width, unsigned int height);

Another suggestion is to get an EGLImage from the fd and then refer to Handle_EGLImage() in

tegra_multimedia_api\samples\common\algorithm\cuda\NvCudaProc.cpp

eglFrame.frame.pPitch[0] can be used as d_Y, so that you do not need

cudaMalloc(&d_Y, planeSize_Y);
cudaMemcpy(d_Y, Y, planeSize_Y, cudaMemcpyHostToDevice);

The same goes for d_U and d_V.

Besides, do you suspect a hardware issue because it happens on only one specific device?

Hi,
It doesn't trigger a segmentation fault; the output is repeated as 3 images.
I added some logging: cudaMemcpy is being executed, but the output is still repeated as 3 images.

Besides, neither YIsMapped nor dstIsMapped is set, so the buffers are not mapped!

if (!YIsMapped) {

	cudaMemcpy(dst, d_dst, planeSize * 3, cudaMemcpyDeviceToHost);
	cudaFree(d_Y);
	cudaFree(d_U);
	cudaFree(d_V);
}

Yes, so far it happens on only one TX1.

Hi,
Please dump the YUYV data and check whether it is good when the issue happens. Maybe the camera source is not capturing good images?

unsigned char* Y = (unsigned char*)buffers->planes[0].data;
unsigned char* U = (unsigned char*)buffers->planes[1].data;
unsigned char* V = (unsigned char*)buffers->planes[2].data;

write_video_frame(ctx->out_file, *buffers);

gpuConvertYUYVtoRGB(Y, U, V, rgbBuffer_, buffers->planes[0].fmt.stride, buffers->planes[1].fmt.stride, buffers->planes[2].fmt.stride, width_, height_);
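When dumping planes for inspection, note that each row may carry padding up to the stride, so a raw copy of stride * height bytes can look shifted or repeated in a viewer. Below is a minimal sketch of a stride-aware dump. PlaneView, dump_plane and packed_bytes are hypothetical names for illustration only; they are not tegra_multimedia_api types, and the sketch is not the sample's write_video_frame().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical description of one image plane (not a tegra_multimedia_api type).
struct PlaneView {
    const std::uint8_t *data;  // first byte of the plane
    std::size_t width_bytes;   // visible bytes per row
    std::size_t height;        // number of rows
    std::size_t stride;        // bytes between row starts (>= width_bytes)
};

// Write only the visible bytes of each row, skipping the stride padding.
inline void dump_plane(std::ostream &out, const PlaneView &p) {
    assert(p.stride >= p.width_bytes);
    for (std::size_t row = 0; row < p.height; ++row) {
        out.write(reinterpret_cast<const char *>(p.data + row * p.stride),
                  static_cast<std::streamsize>(p.width_bytes));
    }
}

// Convenience wrapper: return the packed bytes, handy for comparing two dumps.
inline std::string packed_bytes(const PlaneView &p) {
    std::ostringstream os;
    dump_plane(os, p);
    return os.str();
}
```

For a real dump, pass an std::ofstream opened in binary mode and call dump_plane once per plane (Y, then U, then V).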

Hi, the YUV data I dumped before this function is good, while the RGB data after this function is bad.
That is why I am reporting this bug against gpuConvertYUYVtoRGB.
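One way to narrow this down might be to convert the same dumped planes on the CPU and compare the result with the GPU output. The sketch below is a reference conversion only, not the actual gpuConvertYUYVtoRGB kernel. It assumes planar 4:2:2 input (U and V at half horizontal resolution), full-range BT.601 (JFIF) coefficients, and RGB byte order; the function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Round and clamp a float to the 0..255 range.
static std::uint8_t clamp_u8(float v) {
    return static_cast<std::uint8_t>(std::min(255.0f, std::max(0.0f, std::round(v))));
}

// CPU reference: planar YUV 4:2:2 to packed RGB.
// Assumption: full-range BT.601 (JFIF); the real kernel may use other coefficients.
std::vector<std::uint8_t> yuv422p_to_rgb_ref(
    const std::uint8_t *Y, const std::uint8_t *U, const std::uint8_t *V,
    int y_stride, int u_stride, int v_stride,
    unsigned width, unsigned height) {
    assert(y_stride >= static_cast<int>(width));
    std::vector<std::uint8_t> rgb(static_cast<std::size_t>(width) * height * 3);
    for (unsigned row = 0; row < height; ++row) {
        for (unsigned col = 0; col < width; ++col) {
            const float y = Y[static_cast<std::size_t>(row) * y_stride + col];
            // Two horizontally adjacent pixels share one chroma sample.
            const float cb = U[static_cast<std::size_t>(row) * u_stride + col / 2] - 128.0f;
            const float cr = V[static_cast<std::size_t>(row) * v_stride + col / 2] - 128.0f;
            std::uint8_t *px = &rgb[(static_cast<std::size_t>(row) * width + col) * 3];
            px[0] = clamp_u8(y + 1.402f * cr);
            px[1] = clamp_u8(y - 0.344136f * cb - 0.714136f * cr);
            px[2] = clamp_u8(y + 1.772f * cb);
        }
    }
    return rgb;
}

// Count bytes where the two buffers differ by more than `tolerance`.
std::size_t count_mismatches(const std::vector<std::uint8_t> &a,
                             const std::vector<std::uint8_t> &b, int tolerance) {
    assert(a.size() == b.size());
    std::size_t bad = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (std::abs(static_cast<int>(a[i]) - static_cast<int>(b[i])) > tolerance) ++bad;
    }
    return bad;
}
```

A small tolerance absorbs rounding differences between the two implementations. If the GPU output really is tiled into repeated copies, the mismatch count should be a large share of the buffer rather than a few bytes.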

Hi,
For more information, is YUV data V4L2_PIX_FMT_YUYV or V4L2_PIX_FMT_YUV420M? If it is V4L2_PIX_FMT_YUYV, buffer.n_planes should be 1, not 3.

Hi,

int
NvJPEGDecoder::decodeToFd(int &fd, unsigned char * in_buf,
unsigned long in_buf_size, uint32_t &pixfmt, uint32_t &width,
uint32_t &height)

(void) jpeg_read_header(&cinfo, TRUE);

The format we get is: pixel_format = V4L2_PIX_FMT_YUV422M;
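For reference, the plane layouts of the three formats mentioned in this thread can be sketched as follows. This is only an illustration: PixFmt, PlaneDims and plane_dims are hypothetical stand-ins rather than V4L2 or tegra_multimedia_api definitions, and the sizes are the visible bytes per row, without stride padding.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the V4L2 pixel formats named in this thread.
enum class PixFmt { YUYV, YUV420M, YUV422M };

// Visible size of one plane: bytes per row and number of rows (8-bit samples).
struct PlaneDims {
    unsigned width_bytes;
    unsigned height;
};

// Return one entry per plane for an image of the given size.
std::vector<PlaneDims> plane_dims(PixFmt fmt, unsigned width, unsigned height) {
    assert(width % 2 == 0 && height % 2 == 0);
    switch (fmt) {
    case PixFmt::YUYV:
        // Packed: Y0 U Y1 V in a single plane, 2 bytes per pixel.
        return std::vector<PlaneDims>{PlaneDims{width * 2, height}};
    case PixFmt::YUV420M:
        // Planar: chroma halved in both directions.
        return std::vector<PlaneDims>{PlaneDims{width, height},
                                      PlaneDims{width / 2, height / 2},
                                      PlaneDims{width / 2, height / 2}};
    case PixFmt::YUV422M:
        // Planar: chroma halved horizontally only.
        return std::vector<PlaneDims>{PlaneDims{width, height},
                                      PlaneDims{width / 2, height},
                                      PlaneDims{width / 2, height}};
    }
    return {};
}
```

With YUV422M, the U and V planes have the full image height, so a conversion kernel that indexes chroma rows as if the data were 4:2:0 would read the wrong rows.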

Hi,
Do you have the code we use? We use a pipeline like this:

  ctx_.jpegdec = NvJPEGDecoder::createJPEGDecoder("jpegdec");
  ctx_.conv = NvVideoConverter::createVideoConverter("conv");

and get the pixfmt in
SetImageInfo(jpgBuffer, jpgSize);

then dequeue buffers from conv.

Thanks!

Hi,

The code is partial. Could you share the full code so that we can build and run it on r28.1/TX1 to reproduce the issue? For now we can only read the code and guess where potential issues reside.

Hi, we have about 500 TX1 modules, and the problem happens on only this one.
We can't reproduce the problem on any other TX1.
Could it be a hardware problem? We have no way to check it.

In gpuConvertYUYVtoRGB, I can confirm that this call is executed every time the issue happens:

cudaMemcpy(dst, d_dst, planeSize * 3, cudaMemcpyDeviceToHost);

So after this kernel launch
gpuConvertYUYVtoRGB_kernel<<<numBlocks, blockSize>>>(d_Y, d_U, d_V, d_dst, y_stride, u_stride, v_stride, width, height);
d_dst is bad when the issue happens.

We use the ImageDecode class to decode frames from the camera, and you have that code.
First image_decode_->Init(frame.buf, frame.len);
then image_decode_->DecodeYUV2BGR(jpgBuffer, jpgSize, pImage); // pImage is the buffer that stores the RGB output
Because this node depends on other nodes, you can't run it on its own.

Hi,
We have tegra_multimedia_api samples. Do you see any issue if you run the default samples on the specific TX1?

Hi,
When I run the default samples, such as samples/12_camera_v4l2_cuda/camera_v4l2_cuda.cpp, the output is normal after every reboot.

Another clue is that the problem occurs only the first time the app runs after boot-up. If I kill the app and restart it, the problem never happens again.

At the moment the issue happens, the kernel prints the same mlocks/syncpts dump that I posted at the top of this thread (syncpt ids 14 to 34, including 54340000.vic_CameraDecodeNod_0 and 54380000.nvjpg_CameraDecodeNod_0).

Is there any clue in these logs?

Hi, could I call you on your cell phone?

Hi,
As a quick solution, please use /etc/rc.local to execute a dummy run after boot-up. Below is an example:

gst-launch-1.0 videotestsrc ! nvvidconv ! nvoverlaysink &
sleep 15
prid=`/bin/ps -fu $USER | grep 'gst-launch' | grep -v "grep" | awk '{print $2}'`
kill $prid
exit 0

Please modify gst-launch-1.0 videotestsrc ! nvvidconv ! nvoverlaysink & and ‘gst-launch’ according to your application.

Hi, at boot-up the sink output is normal, but the kernel log shows some abnormal messages.
The file is attached below.
syslog.txt (1.48 MB)

Hi,
Since we are not able to reproduce the issue, and it happens on only one specific TX1 module, it looks more like a hardware issue.
Please consider an RMA: https://developer.nvidia.com/embedded/faq#rma-process