Hardware Encoder/Decoder of Video Streams in CUDA

I have a C++ program that receives a video stream, and I was wondering whether it is possible to encode it in real time as H.264 using CUDA.

So far I’ve written a pass-through that copies the stream to my GPU and then back without modifying it. I needed to test CUDA on my hardware, and it works fine.

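__global__ void streamCopy(const unsigned char *a, unsigned char *b, int vSize) {
  // How do I do H.264 encoding here?
  // One thread per byte: compute the global index across all blocks.
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < vSize) {
    b[i] = a[i];
  }
}

// gpuErrchk: error-checking macro assumed to be defined elsewhere in the project.
extern "C" void encodeX264(unsigned char *streamIn, unsigned char *streamOut, int vectorSize) {

  unsigned char *d_streamIn, *d_streamOut;
  gpuErrchk(cudaMalloc((void **)&d_streamIn, vectorSize));
  gpuErrchk(cudaMalloc((void **)&d_streamOut, vectorSize));

  // Copy input data to array on GPU.
  gpuErrchk(cudaMemcpy(d_streamIn, streamIn, vectorSize, cudaMemcpyHostToDevice));

  // Launch enough 256-thread blocks to cover every byte of the buffer.
  const int threads = 256;
  const int blocks = (vectorSize + threads - 1) / threads;
  streamCopy<<<blocks, threads>>>(d_streamIn, d_streamOut, vectorSize);
  gpuErrchk(cudaGetLastError());

  // Copy output array from GPU back to CPU.
  gpuErrchk(cudaMemcpy(streamOut, d_streamOut, vectorSize, cudaMemcpyDeviceToHost));

  // Free up the arrays on the GPU.
  gpuErrchk(cudaFree(d_streamIn));
  gpuErrchk(cudaFree(d_streamOut));
}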

I just don’t know how to turn my incoming stream into an H.264 stream.
Are there any resources that would help?

I am not looking for a ready-made solution; I’m just new to stream encoding and don’t really know where to start.


One option to consider: NVIDIA GPUs have built-in encode/decode hardware (NVENC and NVDEC), and NVIDIA’s Video Codec SDK provides access to it.


There are separate sub-forums for questions about Video Codec SDK usage.