FFmpeg NVIDIA-Accelerated Video Encoding

Hello everyone, I need to decode a video, analyze the frames, and encode it back. I can do this with the FFmpeg API from C++ on the CPU, but I am having problems doing it on an NVIDIA GPU.

So far, I can decode the video and analyze it using the FFmpeg API from C++ and CUDA. Unfortunately, I am unable to encode the data back to video. The program creates a video file (12.5 MB, which is smaller than expected), but it doesn't open in VLC player.

The decoded AVFrame is already present in GPU memory. Here are the AVCodecContext parameters for encoding:

AVcodec_context->width = dst_width;
AVcodec_context->height = dst_height;
AVcodec_context->time_base = (AVRational) {1, 25};
AVcodec_context->framerate = (AVRational) {25, 1};
AVcodec_context->pix_fmt = AV_PIX_FMT_CUDA;
AVcodec_context->gop_size = 10;
AVcodec_context->max_b_frames = 1;

Here is a code snippet that creates the device context and the frames context:

AVBufferRef *hw_device_ctx_encoding = NULL;
// Create the CUDA device context; a negative return value signals an error.
err = av_hwdevice_ctx_create(&hw_device_ctx_encoding, AV_HWDEVICE_TYPE_CUDA, NULL, NULL, 0);

AVBufferRef *hw_frames_ref_encoding = NULL;
hw_frames_ref_encoding = av_hwframe_ctx_alloc(hw_device_ctx_encoding);

AVHWFramesContext *frames_ctx;
frames_ctx = (AVHWFramesContext *)(hw_frames_ref_encoding->data);
frames_ctx->format = AV_PIX_FMT_CUDA;
frames_ctx->sw_format = AV_PIX_FMT_YUV420P;
frames_ctx->width = dst_width;
frames_ctx->height = dst_height;

// The frames context must be finalized after its fields are set and before it is attached to any frame.
err = av_hwframe_ctx_init(hw_frames_ref_encoding);

AVcodec_context->hw_frames_ctx = av_buffer_ref(hw_frames_ref_encoding);

AVFrame *hw_frame = av_frame_alloc();
decframe = av_frame_alloc();

// Allocate a GPU buffer for hw_frame from the encoder's frames pool.
err = av_hwframe_get_buffer(AVcodec_context->hw_frames_ctx, hw_frame, 0);

hw_frame->pts = count;
av_hwframe_transfer_data(hw_frame, decframe, 0); // decframe is the decoded frame already present in GPU memory

// Afterwards, let's say I am using the encode() function from the FFmpeg example encode_video.c
encode(AVcodec_context, hw_frame, encoding_pkt, fout);
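
To make the timestamp settings above concrete: with time_base set to {1, 25} and pts set to the frame counter, frame n should be presented at n/25 seconds. The sketch below only illustrates that arithmetic. Rational is a hypothetical stand-in for AVRational so it builds without FFmpeg, pts_to_seconds and rescale_pts are names made up for this example, and the 1/90000 time base in the usage note is an assumed muxer time base, not something taken from my program.

```cpp
#include <cstdint>

// Hypothetical stand-in for FFmpeg's AVRational (not the real type).
struct Rational {
    int num;
    int den;
};

// Convert a pts counted in time_base units into seconds.
double pts_to_seconds(std::int64_t pts, Rational time_base) {
    return static_cast<double>(pts) * time_base.num / time_base.den;
}

// Rescale a pts from one time base to another using integer math.
// This truncates toward zero; FFmpeg's av_rescale_q rounds to nearest instead.
std::int64_t rescale_pts(std::int64_t pts, Rational from, Rational to) {
    return pts * from.num * to.den /
           (static_cast<std::int64_t>(from.den) * to.num);
}
```

For example, pts_to_seconds(25, Rational{1, 25}) gives 1.0, and rescale_pts(1, Rational{1, 25}, Rational{1, 90000}) gives 3600, i.e. one frame at 25 fps is 3600 ticks of an assumed 90 kHz clock.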

Can anyone please explain how to encode from YUV data already present in GPU memory? Does one need to create the video on the GPU and then copy it back to system memory, or does the fwrite() function write it back to system memory itself? I am not sure where I am going wrong.

I recommend asking this question here: Video Processing & Optical Flow - NVIDIA Developer Forums