An efficient parallelization technique for x264 encoder on heterogeneous platforms consisting of CPUs and GPUs

Youngsub Ko, Youngmin Yi, Soonhoi Ha

Research output: Contribution to journal › Article › peer-review

6 Scopus citations

Abstract

H.264/AVC video encoders are widely used for their high coding efficiency. Because the computational demand grows with frame resolution, which keeps increasing, there is great interest in accelerating H.264/AVC encoding through parallel processing. Recently, graphics processing units (GPUs) have emerged as a viable target for accelerating general-purpose applications by exploiting fine-grained data parallelism. Despite extensive research on GPU acceleration of H.264/AVC encoding, no prior effort has achieved a speed-up over x264, which is known as the fastest CPU implementation, mainly because of the significant communication overhead between the host CPU and the GPU and the intra-frame dependencies in the algorithm. In this paper, we propose a novel motion-estimation (ME) algorithm tailored for implementation on NVIDIA GPUs. It is accompanied by a novel pipelining technique, called sub-frame ME processing, that effectively hides the communication overhead between the host CPU and the GPU. Further, we incorporate a frame-level parallelization technique to improve overall throughput. Experimental results show that the proposed H.264 encoder achieves higher performance than the x264 encoder.
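
The intuition behind hiding communication overhead with sub-frame processing can be illustrated with a simple timing model. The sketch below is not the paper's implementation: it assumes a generic two-stage pipeline (host-to-GPU transfer, then ME on the GPU) with uniform per-slice costs, and the function names and the numbers in the usage lines are hypothetical.

```python
def serial_time(n_subframes, transfer, compute):
    """No overlap: all data is transferred before any ME work starts.

    `transfer` and `compute` are hypothetical per-sub-frame costs in
    arbitrary time units.
    """
    return n_subframes * (transfer + compute)


def pipelined_time(n_subframes, transfer, compute):
    """Two-stage pipeline: the transfer of sub-frame i+1 overlaps the
    ME computation of sub-frame i.
    """
    transfer_done = 0.0  # time at which the latest transfer finishes
    compute_done = 0.0   # time at which the latest ME computation finishes
    for _ in range(n_subframes):
        transfer_done += transfer
        # ME on a sub-frame starts once its data has arrived and the
        # previous sub-frame's ME has finished.
        compute_done = max(compute_done, transfer_done) + compute
    return compute_done


# Hypothetical costs: 1 unit of transfer and 2 units of ME per sub-frame.
print(serial_time(4, 1.0, 2.0))
print(pipelined_time(4, 1.0, 2.0))
```

Under this model, only the first transfer is exposed when computation dominates, and a single sub-frame (whole-frame processing) gains nothing from pipelining.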

Original language: English
Pages (from-to): 5-18
Number of pages: 14
Journal: Journal of Real-Time Image Processing
Volume: 9
Issue number: 1
DOIs
State: Published - Mar 2014

Keywords

  • CUDA
  • GPU
  • H.264
  • Motion estimation
