An efficient parallel motion estimation algorithm and X264 parallelization in CUDA

Youngsub Ko, Youngmin Yi, Soonhoi Ha

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

7 Scopus citations

Abstract

H.264/AVC video encoders have been widely used for their high coding efficiency. Since the computational demand, which is proportional to the frame resolution, is constantly increasing, there has been great interest in accelerating H.264/AVC through parallel processing. Recently, graphics processing units (GPUs) have emerged as a viable target for accelerating general-purpose applications by exploiting fine-grained data parallelism. Despite extensive research effort to use GPUs to accelerate H.264/AVC encoding, no speed-up has been achieved over x264, which is known as the fastest CPU implementation, because of the significant communication overhead between the host CPU and the GPU and the intra-frame dependency in the algorithm. In this paper, we propose a novel motion estimation (ME) algorithm tailored for NVIDIA GPU implementation. It is accompanied by a novel pipelining technique, called sub-frame ME processing, that effectively hides the communication overhead between the host CPU and the GPU. The proposed H.264 encoder achieves more than 20% speed-up compared with x264.

Original language: English
Title of host publication: Proceedings of the 2011 Conference on Design and Architectures for Signal and Image Processing, DASIP 2011
Pages: 91-98
Number of pages: 8
State: Published - 2011
Event: 2011 Conference on Design and Architectures for Signal and Image Processing, DASIP 2011 - Tampere, Finland
Duration: 2 Nov 2011 – 4 Nov 2011

Publication series

Name: Conference on Design and Architectures for Signal and Image Processing, DASIP
ISSN (Print): 2164-9766

Conference

Conference: 2011 Conference on Design and Architectures for Signal and Image Processing, DASIP 2011
Country/Territory: Finland
City: Tampere
Period: 2/11/11 – 4/11/11

Keywords

  • CUDA
  • GPU
  • H.264
  • Motion Estimation
