Fast CUDA Grammar eXtractor for Statistical Machine Translation
This is a GPU-based grammar extractor for statistical machine translation. It extracts hierarchical translation grammars efficiently on the GPU.
For more details, please refer to the following two papers:
Approximate Pattern Matching on GPUs for On-Demand Extraction of Hierarchical Translation Grammars. Hua He, Jimmy Lin, and Adam Lopez. Transactions of the Association for Computational Linguistics (TACL). 2015 January. In press.
Massively Parallel Suffix Array Queries and On-demand Phrase Extraction for Statistical Machine Translation using GPUs. Hua He, Jimmy Lin, and Adam Lopez. North American Chapter of the Association for Computational Linguistics (NAACL). 2013.
We will update our code significantly in the next release to make this tool easier to use. Stay tuned.
You are very welcome to share your usage experiences with us. Thank you.
Install the CUDA library and CUDA driver on your machine, following the instructions on the NVIDIA website: https://developer.nvidia.com/cuda-zone
This program requires a GPU with at least 4GB of memory. The code works with Kepler, Fermi, and pre-Fermi architecture GPUs.
Please download the Thrust GPU library by cloning its repository: https://github.com/thrust/thrust . The latest version should normally work, but because Thrust is updated constantly, you can switch to its 1.8.0 git branch to avoid any future version incompatibility.
In the Makefile, we have:
NVCC = nvcc -arch=compute_35 -code=sm_35
CUDA_INSTALL_PATH = /opt/common/cuda/cuda-5.5.22
OPT = -O3 -I./uthash/ -I/scratch0/huah/thrust/
The above three variables in the Makefile need to be updated according to your running environment. For example, CUDA_INSTALL_PATH needs to be set to the CUDA library install path on your GPU machine; OPT needs the path of the Thrust library directory; and NVCC needs the compute capability of your GPU (for example, the Tesla K20 is 3.5, therefore it is -arch=compute_35 -code=sm_35).
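To illustrate how the compute capability maps to the nvcc flags in the NVCC variable, here is a minimal C++ sketch. The helper name nvccArchFlags is hypothetical and not part of this tool; it only formats the flag string for a given major.minor version, following the Tesla K20 example above.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of this tool): build the nvcc architecture
// flags for a GPU with compute capability major.minor, e.g. 3.5 -> "35".
std::string nvccArchFlags(int major, int minor) {
    // Compute capability versions have a single-digit minor number.
    assert(major >= 1 && minor >= 0 && minor <= 9);
    const std::string cc = std::to_string(major) + std::to_string(minor);
    return "-arch=compute_" + cc + " -code=sm_" + cc;
}
```

For example, nvccArchFlags(3, 5) yields the flags used for the Tesla K20 above, and nvccArchFlags(2, 0) would yield the corresponding flags for a Fermi card with compute capability 2.0.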
In the main directory, please compile the code using the Makefile. You will probably see lots of warnings; please ignore them as long as there are no errors. If there are errors, they are probably related to the CUDA library, so please update your CUDA driver and library to the latest version.
cdec: currently you may still need its SMT decoder to use our generated hierarchical grammars, as well as its intermediate output (the lexical file, as shown below). Therefore, please install cdec following its instructions and tutorials here: http://www.cdec-decoder.org/guide/
./bin/strmatchcuda ./toy/hansards.f ./toy/query.f ./toy/hansards.e ./toy/hansards.a ./toy/lex.bin gpugrammar_temp
Please provide the following as input arguments:
If the output log ends with the line below, everything has been done and the program is in its last printing step (the IO step in the paper):
Start Printing Gappy Phrases...
Once it is done, check the gpugrammar_temp directory; there should be a set of grammar files for the queries (one file per query).
Short description of the running process: first, read in the parallel corpus and queries; then construct the suffix array and auxiliary data structures for indexing; then do GPU-based gappy phrase lookup on the suffix array and hierarchical grammar extraction; finally, calculate features and write the output. Please refer to the papers for more details on each pass.
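To make the indexing and lookup passes more concrete, here is a minimal CPU-only C++ sketch of a suffix array over a tokenized corpus and a binary-search lookup of a contiguous phrase. It is not the GPU implementation used by this tool: the function names (buildSuffixArray, comparePrefix, findPhrase) are hypothetical, the corpus is assumed to be already mapped to integer token IDs, the construction is a simple sort chosen for readability, and gappy phrases are not handled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Build a suffix array: the start positions of all suffixes of `corpus`,
// sorted lexicographically by the token sequence each suffix begins with.
// Plain comparison sort, for illustration only (not an efficient builder).
std::vector<int> buildSuffixArray(const std::vector<int>& corpus) {
    std::vector<int> sa(corpus.size());
    std::iota(sa.begin(), sa.end(), 0);
    std::sort(sa.begin(), sa.end(), [&](int a, int b) {
        return std::lexicographical_compare(corpus.begin() + a, corpus.end(),
                                            corpus.begin() + b, corpus.end());
    });
    return sa;
}

// Compare the suffix starting at `pos` with `pattern`, looking only at the
// first pattern.size() tokens. Returns -1, 0, or 1.
int comparePrefix(const std::vector<int>& corpus, int pos,
                  const std::vector<int>& pattern) {
    for (std::size_t i = 0; i < pattern.size(); ++i) {
        const std::size_t j = static_cast<std::size_t>(pos) + i;
        if (j >= corpus.size()) return -1;  // suffix ends first: sorts before
        if (corpus[j] != pattern[i]) return corpus[j] < pattern[i] ? -1 : 1;
    }
    return 0;  // the suffix starts with the whole pattern
}

// Return the half-open range [first, second) of suffix array indices whose
// suffixes start with `pattern`. The range is empty if the phrase is absent.
std::pair<int, int> findPhrase(const std::vector<int>& corpus,
                               const std::vector<int>& sa,
                               const std::vector<int>& pattern) {
    // Suffixes are sorted, so "smaller than pattern" is a leading block...
    auto lo = std::partition_point(sa.begin(), sa.end(), [&](int pos) {
        return comparePrefix(corpus, pos, pattern) < 0;
    });
    // ...immediately followed by the block of matching suffixes.
    auto hi = std::partition_point(lo, sa.end(), [&](int pos) {
        return comparePrefix(corpus, pos, pattern) == 0;
    });
    return {static_cast<int>(lo - sa.begin()),
            static_cast<int>(hi - sa.begin())};
}
```

Each index in the returned range points into the suffix array, so sa[i] gives a corpus position where the phrase occurs. Two binary searches per query keep a lookup logarithmic in the corpus size.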
Currently, a run of this program includes one-time costs (e.g. suffix array construction and precomputation) as well as the actual hierarchical grammar extraction costs. Therefore, depending on your parallel corpus, there is typically some acceptable extra time before the actual GPU grammar extraction starts. This program will be updated, and such one-time costs will be separated out in a future release.
Troubleshooting. Common installation problems mainly come from CUDA memory allocation or access, causing weird memory problems at run time. We encourage you to use the latest GPU driver and CUDA library.
Contact email address: huah at umd.edu ; alopez at inf.ed.ac.uk ; jimmylin at umd.edu