I am using opencv+ffmpeg with cuvid support to perform some video transcoding tasks. The basic setup is:
- nvidia driver: 430.34
- cuda: 10.1.168
- opencv: 4.1.0 (JavaCV 1.5.1)
- ffmpeg: 4.1.3
- cuvid sdk: 9.0
To be specific, a task decodes an H.264 RTSP live stream into frames, loads the frames into a GpuMat for some further operations, then encodes the sequence back into an H.264 RTMP live stream. On an Nvidia Quadro P2000, the overall GPU memory usage for one task is approximately 260 MB; on an Nvidia Tesla V100 (16 GB), however, the usage rockets to over 1 GB, with decoding/encoding taking up 730 MB and the GpuMat taking up 320 MB.
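For reference, the pipeline is roughly equivalent to the following ffmpeg command line. This is only a sketch: the stream URLs are placeholders, and the `-surfaces` value shown is just the knob I have been experimenting with, not a confirmed fix for the V100 behavior.

```shell
# Rough command-line equivalent of one transcoding task (URLs are placeholders).
# h264_cuvid pre-allocates a fixed pool of decode surfaces; its -surfaces option
# caps the size of that pool, so lowering it shrinks the decoder's GPU footprint
# at the cost of less buffering.
ffmpeg -hwaccel cuvid -c:v h264_cuvid -surfaces 8 \
       -i rtsp://camera.example/stream \
       -c:v h264_nvenc \
       -f flv rtmp://server.example/live/stream
```

In JavaCV the same decoder option can presumably be passed through the codec-open dictionary, but I have only tested the ffmpeg CLI form above.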
It seems weird that such a simple task uses a gigabyte of GPU memory, and it seriously limits how many concurrent transcoding tasks I can run.
Does anyone know why this is happening and how I can limit the memory usage? Any help is appreciated.