[HOWTO] H.264 + MP4 container

Hi,

After a couple of hours I finally got the CUDA H.264 encoder working with MP4 output! Because this is not well documented, I decided to write a small howto.
The final MP4 file is playable in VLC and in Windows Media Player.
BTW, I recommend using VLC instead of WM, because most of my MP4 “attempts” played correctly in VLC, but not in WM…

For writing the MP4 file I’m using the FFmpeg libraries (API).

Encoder special settings:
NVVE_CONFIGURE_NALU_FRAMING_TYPE = 4
NVVE_DISABLE_SPS_PPS = 0 or 1 (both work)
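
For anyone wondering what the framing setting changes: a value of 4 appears to select NAL units prefixed with a 4-byte big-endian length instead of start codes, which is the layout MP4 expects and matches the NALU length size written into the extradata in the init code. That reading is an assumption, not something confirmed here from the encoder documentation. A minimal sketch for checking an encoded frame buffer under that assumption (listNalTypes is a hypothetical helper name, not part of any library):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Walks a buffer of NAL units that are each preceded by a 4-byte big-endian
// length (assumed layout) and returns the nal_unit_type of every unit, i.e.
// the low 5 bits of the first NAL byte (5 = IDR slice, 7 = SPS, 8 = PPS).
// Returns an empty vector if the buffer does not match that layout.
std::vector<int> listNalTypes(const uint8_t *data, size_t size)
{
    std::vector<int> types;
    size_t pos = 0;
    while (pos < size) {
        if (size - pos < 4) return {};                // truncated length field
        uint32_t len = (uint32_t(data[pos]) << 24) | (uint32_t(data[pos + 1]) << 16) |
                       (uint32_t(data[pos + 2]) << 8) | uint32_t(data[pos + 3]);
        pos += 4;
        if (len == 0 || len > size - pos) return {};  // empty or overrunning unit
        types.push_back(data[pos] & 0x1F);
        pos += len;
    }
    return types;
}
```

If the function returns an empty vector for a buffer the encoder produced, the stream is probably still using start codes and would need converting before it goes into the MP4.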

MP4 Init:

//struct AVFormatContext *m_formatContext;
//struct AVCodecContext *m_videoContext;
//struct AVStream *m_videoStream;

// setup FFMPEG - mp4 container
const char *filename = "file.mp4";
AVOutputFormat* fmt = av_guess_format(0, filename, 0);
if(!fmt) return false;
m_formatContext = avformat_alloc_context();

m_formatContext->oformat = fmt;
strcpy(m_formatContext->filename, filename);

m_videoStream = avformat_new_stream(m_formatContext, 0);
if(!m_videoStream)
{
    avformat_free_context(m_formatContext);
    m_formatContext = NULL;
    return false;
}
 
m_videoContext = m_videoStream->codec;      
m_videoContext->codec_type = AVMEDIA_TYPE_VIDEO;
m_videoContext->codec_id = AV_CODEC_ID_H264;
m_videoContext->bit_rate = bitrate;
m_videoContext->width = width;
m_videoContext->height = height;
m_videoContext->time_base.den = fps;
m_videoContext->time_base.num = 1;    
m_videoContext->gop_size = gop; // = idrLevel?
m_videoContext->pix_fmt = AV_PIX_FMT_NV12;

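// get SPS+PPS
// The indexing below treats the returned buffer as
// [2-byte SPS length][SPS][2-byte PPS length][PPS].
// Only the low byte of each length is read, which is enough
// as long as each parameter set is shorter than 256 bytes.
int spsppsSize;
uint8_t spspps[256];
NVGetSPSPPS(m_encoder, spspps, 256, &spsppsSize);
int spsSize = spspps[1];
int ppsSize = spspps[spsSize + 3];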

// setup extradata
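m_videoContext->extradata_size = spsppsSize + 7;
// FFmpeg requires extradata to be padded with zeroed bytes beyond extradata_size
uint8_t *extra = (uint8_t *)av_mallocz(m_videoContext->extradata_size + FF_INPUT_BUFFER_PADDING_SIZE);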
extra[0] = 1; // version
extra[1] = spspps[3]; // profile
extra[2] = spspps[4]; // compatibility
extra[3] = spspps[5]; // level
extra[4] = 0xFC | 3;  // reserved (6 bits), NALU length size - 1 (2 bits)
extra[5] = 0xE0 | 1;  // reserved (3 bits), num of SPS (5 bits) 
uint8_t *pExtra = extra + 6;
memcpy(pExtra, spspps, spsSize+2);
pExtra += spsSize+2;
*pExtra++ = 1; // num of PPS
memcpy(pExtra, spspps+2+spsSize, ppsSize+2);
m_videoContext->extradata = extra; 

if(m_formatContext->oformat->flags & AVFMT_GLOBALHEADER) m_videoContext->flags |= CODEC_FLAG_GLOBAL_HEADER;

av_dump_format(m_formatContext, 0, filename, 1);

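if(!(m_formatContext->oformat->flags & AVFMT_NOFILE))
{
    if(avio_open(&m_formatContext->pb, filename, AVIO_FLAG_WRITE) < 0)
    {
        avformat_free_context(m_formatContext);
        m_formatContext = NULL;
        return false;
    }
}
// writing the header can fail, so check the result and clean up
if(avformat_write_header(m_formatContext, NULL) < 0)
{
    if(!(m_formatContext->oformat->flags & AVFMT_NOFILE)) avio_close(m_formatContext->pb);
    avformat_free_context(m_formatContext);
    m_formatContext = NULL;
    return false;
}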

Notes: Providing the extradata is a little bit tricky. If you don’t set valid extradata, your file will not be playable in WM.
For more info see:
http://stackoverflow.com/questions/15263458/h-264-muxed-to-mp4-using-libavformat-not-playing-back
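
The extradata layout used above can also be written as a standalone helper with bounds checks. This is only a sketch: it assumes the SPS/PPS buffer is laid out as [2-byte SPS length][SPS][2-byte PPS length][PPS] with big-endian lengths, which is inferred from how the init code indexes spspps, and buildAvcC is a hypothetical name. Unlike the init code it reads both length bytes, so it is not limited to parameter sets under 256 bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Builds an AVCDecoderConfigurationRecord ("avcC") from a buffer assumed to
// hold [2-byte SPS length][SPS][2-byte PPS length][PPS], lengths big-endian.
// Returns an empty vector if the buffer does not match that layout.
std::vector<uint8_t> buildAvcC(const uint8_t *buf, size_t size)
{
    if (size < 2) return {};
    size_t spsLen = (size_t(buf[0]) << 8) | buf[1];
    // The SPS must hold at least NAL header, profile, compatibility and level,
    // and the buffer must still have room for the PPS length field.
    if (spsLen < 4 || size < 2 + spsLen + 2) return {};
    const uint8_t *sps = buf + 2;
    size_t ppsOff = 2 + spsLen;
    size_t ppsLen = (size_t(buf[ppsOff]) << 8) | buf[ppsOff + 1];
    if (ppsLen == 0 || size < ppsOff + 2 + ppsLen) return {};

    std::vector<uint8_t> out;
    out.push_back(1);         // version
    out.push_back(sps[1]);    // profile
    out.push_back(sps[2]);    // compatibility
    out.push_back(sps[3]);    // level
    out.push_back(0xFC | 3);  // reserved (6 bits), NALU length size - 1 (2 bits)
    out.push_back(0xE0 | 1);  // reserved (3 bits), number of SPS (5 bits)
    out.insert(out.end(), buf, buf + 2 + spsLen);                    // SPS length + SPS
    out.push_back(1);         // number of PPS
    out.insert(out.end(), buf + ppsOff, buf + ppsOff + 2 + ppsLen);  // PPS length + PPS
    return out;
}
```

The result is 7 bytes longer than the input, the same as spsppsSize + 7 in the init code. It would still have to be copied into a buffer allocated with av_mallocz before being assigned to extradata, because FFmpeg frees that pointer itself.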

EndFrame event:

AVPacket pkt;
av_init_packet(&pkt);

if(pefi->nPicType == NVVE_PIC_TYPE_IFRAME) pkt.flags |= AV_PKT_FLAG_KEY;
pkt.stream_index  = m_videoStream->index;
pkt.data          = m_frameBuffer;
pkt.size          = m_frameBufferSize;

pkt.pts = av_rescale_q(pefi->nFrameNumber, m_videoContext->time_base, m_videoStream->time_base);
pkt.dts = av_rescale_q(m_dts++, m_videoContext->time_base, m_videoStream->time_base);

av_interleaved_write_frame(m_formatContext, &pkt);

av_free_packet(&pkt);

Notes: If the stream includes B-frames, you must provide valid DTS/PTS values, where PTS >= DTS. For P-interval 3, I initialize m_dts to -3.
m_frameBuffer contains the whole frame data…
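
To see why the DTS counter has to start below zero, here is a small simulation of the timestamp assignment. It is a sketch under assumptions: the encoder is taken to emit frames in the order I0 P3 B1 B2 P6 B4 B5 for P-interval 3, and pefi->nFrameNumber is taken to be the display-order index. Timestamp, assignTimestamps and allPtsNotBeforeDts are hypothetical names.

```cpp
#include <cstdint>
#include <vector>

struct Timestamp { int64_t pts; int64_t dts; };

// Assigns timestamps to frames listed in encode (output) order.
// displayIndex[i] is the display-order number of the i-th encoded frame
// (the role pefi->nFrameNumber plays above); dtsStart is the initial value
// of the running DTS counter (the role of m_dts).
std::vector<Timestamp> assignTimestamps(const std::vector<int64_t> &displayIndex,
                                        int64_t dtsStart)
{
    std::vector<Timestamp> out;
    int64_t dts = dtsStart;
    for (int64_t pts : displayIndex) out.push_back({pts, dts++});
    return out;
}

// Checks the muxer requirement PTS >= DTS for every frame.
bool allPtsNotBeforeDts(const std::vector<Timestamp> &ts)
{
    for (const Timestamp &t : ts)
        if (t.pts < t.dts) return false;
    return true;
}
```

With the assumed order {0, 3, 1, 2, 6, 4, 5}, starting the counter at 0 gives frame B1 a DTS of 2 against a PTS of 1, which breaks the rule, while starting at -3 keeps every PTS at or above its DTS.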

Stop encoding:

if(m_formatContext)
{
	av_write_trailer(m_formatContext);
	if(!(m_formatContext->oformat->flags & AVFMT_NOFILE)) avio_close(m_formatContext->pb); // close the output file
	avformat_free_context(m_formatContext);
	m_formatContext = NULL;
}

The code was cut and pasted from my app and is not directly compilable, but I think it could be helpful for those who are interested :)

Martin
