Abstract
Low-latency solutions for syntactic parsing are needed if parsing is to become an integral part of user-facing natural language applications. Unfortunately, most state-of-the-art constituency parsers employ large probabilistic context-free grammars for disambiguation, which renders them impractical for real-time use. Meanwhile, Graphics Processing Units (GPUs) have become widely available, offering the opportunity to alleviate this bottleneck by exploiting the fine-grained data parallelism found in the Cocke-Kasami-Younger (CKY) algorithm. In this article, we explore the design space of parallelizing the dynamic programming computations carried out by the CKY algorithm. We use the Compute Unified Device Architecture (CUDA) programming model to reimplement a state-of-the-art parser, and compare its performance on three recent GPUs with different architectural features. Our best results show a 33-fold speedup for the CUDA parser compared to a sequential C implementation.
| Original language | English |
| --- | --- |
| Pages (from-to) | 375-393 |
| Number of pages | 19 |
| Journal | Journal of Logic and Computation |
| Volume | 24 |
| Issue number | 2 |
| DOIs | |
| State | Published - Apr 2014 |
Keywords
- CKY parsing
- CUDA
- GPU
- Viterbi parsing
- parallel parsing