Real-time 1080p video processing

Hello,

I am using CUDA to process a full 1080p HD video stream. What is the best strategy for this kind of work? Right now I am doing this:

while( frame ){

  1. use the Windows API (either the old VFW or the newer Media Foundation) to read frame i on the CPU
  2. copy frame i to GPU memory with cudaMemcpy
  3. process frame i within CUDA
  4. copy the output back to the CPU
  5. render the output frame with OpenGL
}
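
In code, steps 2-4 of that loop look roughly like this (a simplified sketch; processFrame here is just a pass-through placeholder for my real kernel):

#include <cuda_runtime.h>

// Placeholder standing in for the real per-frame processing:
// it just copies input to output, one thread per pixel.
__global__ void processFrame(const unsigned char *in, unsigned char *out,
                             int w, int h)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < w && y < h) {
        int i = (y * w + x) * 3;   // 24-bit RGB pixel
        out[i]     = in[i];
        out[i + 1] = in[i + 1];
        out[i + 2] = in[i + 2];
    }
}

// One iteration of the loop body, steps 2-4. d_in and d_out are
// preallocated device buffers of one frame each.
void runOneFrame(const unsigned char *h_frame, unsigned char *h_out,
                 unsigned char *d_in, unsigned char *d_out)
{
    const size_t frameBytes = 1920 * 1080 * 3;                   // ~6 MB
    cudaMemcpy(d_in, h_frame, frameBytes,
               cudaMemcpyHostToDevice);                          // step 2
    dim3 block(16, 16);
    dim3 grid((1920 + 15) / 16, (1080 + 15) / 16);
    processFrame<<<grid, block>>>(d_in, d_out, 1920, 1080);      // step 3
    cudaMemcpy(h_out, d_out, frameBytes,
               cudaMemcpyDeviceToHost);                          // step 4
}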

However, I found that step 1 takes a tremendous amount of time compared to the other steps. For an uncompressed video of 2 GB with 500 frames, loading each frame on the CPU takes around 100 ms. Has anyone had a similar experience?

I’ve never done it with an uncompressed video format; I think if you’re working with video frames that large, it’s going to be pretty slow no matter what. If you used a compressed video format, reading the frame into memory would be faster, and you could use the CUVID decoder for the decoding. An added benefit of the CUVID decoder is that you can do the processing and display without copying the frame data back to the host - beyond reading the compressed stream initially, everything stays on the GPU.
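
Very roughly, the process-in-place step looks something like this (a sketch only, assuming `decoder` and `picIdx` come from the CUVID parser callbacks; the mapped frame is in NV12 layout, and error checking is left out):

#include <nvcuvid.h>

// Sketch: map a decoded picture to a CUDA device pointer and process it
// in place -- no frame data ever crosses back to the host.
void processDecodedPicture(CUvideodecoder decoder, int picIdx)
{
    CUdeviceptr devFrame = 0;
    unsigned int pitch = 0;
    CUVIDPROCPARAMS vpp = {0};
    vpp.progressive_frame = 1;

    cuvidMapVideoFrame(decoder, picIdx, &devFrame, &pitch, &vpp);
    // ... launch a CUDA kernel on (devFrame, pitch); data is NV12 ...
    cuvidUnmapVideoFrame(decoder, devFrame);
}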

Hello, is it possible to see the code you wrote?

I’m working with HD (1920x1080) imagery, and you have to understand the I/O bounds to see the problem (and the solution) here. 1920 x 1080 pixels x 3 bytes/pixel (1 byte each for R, G, B) x 30 frames per second ≈ 180 MB/sec. No single (non-SSD) hard drive can sustain this rate (much less gigabit Ethernet). To get to the realistic rates that a CameraLink (up to 6 Gb/sec) or HD-SDI (1.5 Gb/sec) input can supply, you need to pull the whole video stream into RAM, then move the frames over to the graphics card one at a time to emulate the ‘real’ video stream rate. Solid-state drives have better I/O rates, and you may be able to emulate an HD source directly if you read the data from one of those instead.

That being said, once I pull the data into host (CPU) memory, I can emulate ‘live’ camera data at 200+ frames/sec.
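
In sketch form, that preload-then-replay loop looks something like this (filename and buffer names hypothetical; pinned memory is what keeps the per-frame uploads fast):

#include <stdio.h>
#include <cuda_runtime.h>

#define FRAME_W     1920
#define FRAME_H     1080
#define FRAME_BYTES (FRAME_W * FRAME_H * 3)   /* 24-bit RGB */

int main(void)
{
    const int numFrames = 500;
    unsigned char *h_video;   /* pinned host buffer, ~3 GB -> 64-bit OS */
    unsigned char *d_frame;

    /* Pinned (page-locked) memory gives much faster host->device copies. */
    cudaMallocHost((void **)&h_video, (size_t)numFrames * FRAME_BYTES);
    cudaMalloc((void **)&d_frame, FRAME_BYTES);

    /* 1. Pull the whole uncompressed stream into RAM once (slow disk I/O,
       paid only at startup). "video.raw" is a hypothetical raw RGB file. */
    FILE *f = fopen("video.raw", "rb");
    fread(h_video, 1, (size_t)numFrames * FRAME_BYTES, f);
    fclose(f);

    /* 2. Replay frames to the GPU one at a time at 'camera' rates. */
    for (int i = 0; i < numFrames; ++i) {
        cudaMemcpy(d_frame, h_video + (size_t)i * FRAME_BYTES,
                   FRAME_BYTES, cudaMemcpyHostToDevice);
        /* ... process / render frame i on the GPU ... */
    }

    cudaFree(d_frame);
    cudaFreeHost(h_video);
    return 0;
}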

Hope this answers the question you’re asking. I don’t know whether you’ll have the RAM needed to load all 500 frames into memory (2 MP x 3 bytes/pixel x 500 frames ≈ 3 GB, which requires a 64-bit OS for that much single-process RAM).

BTW, why bring it back out to the CPU? You can render in OpenGL directly from the card with the CUDA interop. I’m doing simple stuff, so I didn’t need to rewrite my code to accommodate the interop. However, if you’re modifying existing OpenGL code to use CUDA for part of it, I can understand it may be difficult to modify.
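
For what it’s worth, the interop path looks roughly like this (a sketch, assuming a GL context with a loader like GLEW, and a pixel buffer object `pbo` already sized for one RGB frame; error checking omitted):

#include <GL/glew.h>           // or another GL loader; needed for PBO calls
#include <cuda_gl_interop.h>

// Sketch: let a CUDA kernel write straight into an OpenGL pixel buffer
// object, so the processed frame never leaves the GPU.
void renderFrameViaInterop(GLuint pbo, int width, int height)
{
    cudaGraphicsResource *res = NULL;
    cudaGraphicsGLRegisterBuffer(&res, pbo, cudaGraphicsMapFlagsWriteDiscard);

    unsigned char *d_pixels = NULL;
    size_t nbytes = 0;
    cudaGraphicsMapResources(1, &res, 0);
    cudaGraphicsResourceGetMappedPointer((void **)&d_pixels, &nbytes, res);

    // ... launch the processing kernel writing RGB output to d_pixels ...

    cudaGraphicsUnmapResources(1, &res, 0);
    cudaGraphicsUnregisterResource(res);

    // Back in GL: draw the PBO contents without any host round trip.
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo);
    glDrawPixels(width, height, GL_RGB, GL_UNSIGNED_BYTE, 0);
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);
}

In real code you’d register the buffer once at startup and only map/unmap each frame, since registering is comparatively expensive.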

I am new to the CUDA stuff, but what about the new GPUDirect? Would that solve some of the problems with the transfer from CameraLink to the GPU?

I would also be very interested in an answer to that question (what is the fastest way to transfer data from a CameraLink card to the GPU). Is there any way to transfer data directly between a CameraLink card and the GPU without going through host memory?