Efficient parallel CKY parsing using GPUs

Youngmin Yi, Chao Yue Lai, Slav Petrov

Research output: Contribution to journal › Article › peer-review

2 Scopus citations


Low-latency solutions for syntactic parsing are needed if parsing is to become an integral part of user-facing natural language applications. Unfortunately, most state-of-the-art constituency parsers employ large probabilistic context-free grammars for disambiguation, which renders them impractical for real-time use. Meanwhile, Graphics Processing Units (GPUs) have become widely available, offering the opportunity to alleviate this bottleneck by exploiting the fine-grained data parallelism found in the Cocke-Kasami-Younger (CKY) algorithm. In this article, we explore the design space of parallelizing the dynamic programming computations carried out by the CKY algorithm. We use the Compute Unified Device Architecture (CUDA) programming model to reimplement a state-of-the-art parser, and compare its performance on three recent GPUs with different architectural features. Our best results show a 33-fold speedup for the CUDA parser compared to a sequential C implementation.
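
As an illustration of where the data parallelism mentioned in the abstract comes from, the following is a minimal sequential sketch of Viterbi CKY. It is not the article's CUDA implementation: the toy grammar, the function name `viterbi_cky`, and the choice of Python are all assumptions made for brevity. The point it tries to show is that chart cells of the same span length do not depend on one another, so the loop over start positions (and, within a cell, the loop over rules) is a natural candidate for parallel execution.

```python
# Hypothetical toy grammar in Chomsky normal form (not taken from the article).
# Binary rules: (parent, left_child, right_child) -> probability
BINARY_RULES = {
    ("S", "NP", "VP"): 1.0,
    ("VP", "V", "NP"): 1.0,
}
# Lexical rules: (tag, word) -> probability
LEXICAL_RULES = {
    ("NP", "she"): 0.5,
    ("NP", "fish"): 0.5,
    ("V", "eats"): 1.0,
}


def viterbi_cky(words, root="S"):
    """Return the probability of the best parse of `words` rooted at `root`."""
    n = len(words)
    if n == 0:
        return 0.0

    # chart[(i, j)][symbol] = best probability of `symbol` spanning words[i:j]
    chart = {(i, j): {} for i in range(n) for j in range(i + 1, n + 1)}

    # Spans of length 1: apply lexical rules.
    for i, word in enumerate(words):
        for (tag, w), p in LEXICAL_RULES.items():
            if w == word:
                chart[(i, i + 1)][tag] = p

    # Longer spans depend only on strictly shorter ones, so this outer loop
    # must run in order.
    for length in range(2, n + 1):
        # Cells of equal length are independent of each other: in a parallel
        # version, each start position i could be handled concurrently.
        for i in range(n - length + 1):
            j = i + length
            cell = chart[(i, j)]
            for k in range(i + 1, j):  # split point
                left, right = chart[(i, k)], chart[(k, j)]
                # The rule loop is a second source of fine-grained parallelism.
                for (parent, lc, rc), p in BINARY_RULES.items():
                    if lc in left and rc in right:
                        score = p * left[lc] * right[rc]
                        if score > cell.get(parent, 0.0):
                            cell[parent] = score

    return chart[(0, n)].get(root, 0.0)
```

Calling `viterbi_cky(["she", "eats", "fish"])` scores the single parse licensed by this toy grammar. With a grammar of realistic size, the innermost rule loop dominates the running time, which is why it is the usual target when mapping the computation onto many threads.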

Original language: English
Pages (from-to): 375-393
Number of pages: 19
Journal: Journal of Logic and Computation
Issue number: 2
State: Published - Apr 2014


  • CKY parsing
  • CUDA
  • GPU
  • Viterbi parsing
  • parallel parsing

