An unsupervised probability model for speech-to-translation alignment of low-resource languages. Antonios Anastasopoulos and David Chiang, 2016. In Proc. EMNLP.
Growing graphs from hyperedge replacement graph grammars. Salvador Aguinaga, Rodrigo Palacios, David Chiang, and Tim Weninger, 2016. In Proc. CIKM. PDF
An attentional model for speech translation without transcription. Long Duong, Antonios Anastasopoulos, Trevor Cohn, Steven Bird, and David Chiang, 2016. In Proc. NAACL HLT.
Auto-Sizing Neural Networks: With Applications to n-gram Language Models. Kenton Murray and David Chiang, 2015. In Proc. EMNLP.
Supervised Phrase Table Triangulation with Neural Word Embeddings for Low-Resource Languages. Tomer Levinboim and David Chiang, 2015. In Proc. EMNLP.
Model Invertibility Regularization: Sequence alignment with or without parallel data. Tomer Levinboim, Ashish Vaswani and David Chiang, 2015. In Proc. NAACL HLT, pages 609–618. PDF
Multi-task word alignment triangulation for low-resource languages. Tomer Levinboim and David Chiang, 2015. In Proc. NAACL HLT, pages 1221–1226. PDF
Improving word alignment using word similarity. Theerawat Songyot and David Chiang, 2014. In Proc. EMNLP. PDF
Kneser-Ney smoothing on expected counts. Hui Zhang and David Chiang, 2014. In Proc. ACL, 765–774. PDF
The International Workshop on Language Preservation: An experiment in text collection and language technology. Steven Bird, David Chiang, Friedel Frowein, Andrea L. Berez, Mark Eby, Florian Hanke, Ryan Shelby, Ashish Vaswani, and Ada Wan, 2013. Language Documentation and Conservation 7:155–167. PDF
Decoding with large-scale neural language models improves translation. Ashish Vaswani, Yinggong Zhao, Victoria Fossum, and David Chiang, 2013. In Proc. EMNLP, 1387–1392. PDF
Parsing graphs with hyperedge replacement grammars. With Jacob Andreas, Daniel Bauer, Karl Moritz Hermann, Bevan Jones and Kevin Knight, 2013. In Proc. ACL, 924–932. PDF BibTeX
Synchronous grammars: tutorial given at TAG+11, 2012. PDF
Machine translation for language preservation. Steven Bird and David Chiang, 2012. In Proc. COLING, 125–134. PDF
Hope and fear for discriminative training of statistical translation models. J. Machine Learning Research 13 (2012): 1159–1187. A few typos corrected, in particular in the definition of the loss function. PDF
Smaller alignment models for better translations: unsupervised word alignment with the l0 norm. Ashish Vaswani, Liang Huang, and David Chiang, 2012. In Proc. ACL, 311–319. PDF code
An exploration of forest-to-string translation: Does translation help or hurt parsing? Hui Zhang and David Chiang, 2012. In Proc. ACL (Vol. 2: Short Papers), 317–321. PDF
Grammars for Language and Genes: Theoretical and Empirical Investigations. 2012. Springer. Book version of my PhD thesis.
Soft syntactic constraints for Arabic-English hierarchical phrase-based translation. Yuval Marton, David Chiang, and Philip Resnik, 2012. Machine Translation 26(1–2):137–157.
Rule Markov models for fast tree-to-string translation. Ashish Vaswani, Haitao Mi, Liang Huang and David Chiang, 2011. In Proc. ACL, 856–864. PDF BibTeX
Two easy improvements to lexical weighting. With Steve DeNeefe and Michael Pust, 2011. In Proc. ACL (Vol. 2: Short Papers), 455–460. PDF BibTeX
Language-independent parsing with empty elements. Shu Cai, David Chiang, and Yoav Goldberg, 2011. In Proc. ACL (Vol. 2: Short Papers), 212–216. PDF BibTeX
Models and training for unsupervised preposition sense disambiguation. Dirk Hovy, Ashish Vaswani, Stephen Tratz, David Chiang, and Eduard Hovy, 2011. In Proc. ACL (Vol. 2: Short Papers), 323–328. PDF
Learning to translate with source and target syntax. 2010. In Proc. ACL, 1443–1452. PDF Slides: PDF
Unsupervised syntactic alignment with inversion transduction grammars. Adam Pauls, Dan Klein, David Chiang, and Kevin Knight, 2010. In Proc. NAACL HLT, 118–126. PDF
Efficient optimization of an MDL-inspired objective function for unsupervised part-of-speech tagging. Ashish Vaswani, Adam Pauls, and David Chiang, 2010. In Proc. ACL, 209–214. PDF
Bayesian inference for finite-state transducers. With Jonathan Graehl, Kevin Knight, Adam Pauls, and Sujith Ravi, 2010. In Proc. NAACL HLT, 447–455. PDF
Fast, greedy model minimization for unsupervised tagging. Sujith Ravi, Ashish Vaswani, Kevin Knight, and David Chiang, 2010. In Proc. COLING, 940–948. PDF
Fast consensus decoding over translation forests. John DeNero, David Chiang, and Kevin Knight, 2009. In Proc. ACL, 567–575. PDF
11,001 new features for statistical machine translation. With Wei Wang and Kevin Knight, 2009. In Proc. NAACL HLT, 218–226. Best paper award. PDF Slides: PDF
Online large-margin training of syntactic and structural translation features. With Yuval Marton and Philip Resnik, 2008. In Proc. EMNLP, 224–233. PDF
Decomposability of translation metrics for improved evaluation and efficient algorithms. With Steve DeNeefe, Yee Seng Chan, and Hwee Tou Ng, 2008. In Proc. EMNLP, 610–619. PDF
Flexible composition and delayed tree-locality. With Tatjana Scheffler, 2008. In Proc. TAG+9, 17–24. PDF
Hierarchical phrase-based translation. 2007. Computational Linguistics 33(2):201–228. PDF Expands on the ACL 2005 paper below with a description, including experimental results, of the decoding algorithm (cube pruning).
Forest rescoring: faster decoding with integrated language models. Liang Huang and David Chiang, 2007. In Proc. ACL, 144–151. PDF
Word sense disambiguation improves statistical machine translation. Yee Seng Chan, Hwee Tou Ng, and David Chiang, 2007. In Proc. ACL, 33–40. PDF
Computational linguistics: a new tool for exploring biopolymer structures and statistical mechanics. Ken A. Dill, Adam Lucas, Julia Hockenmaier, Liang Huang, David Chiang, and Aravind K. Joshi, 2007. Polymer 48:4289–4300.
The hidden TAG model: synchronous grammars for parsing resource-poor languages. With Owen Rambow, 2006. In Proc. TAG+8, 1–8. PDF
Parsing Arabic dialects. With Mona Diab, Nizar Habash, Owen Rambow, and Safiullah Shareef, 2006. In Proc. EACL, 369–376. PDF
Grammatical representations of macromolecular structure. With Aravind K. Joshi and David B. Searls, 2006. J. Computational Biology 13(5):1077–1100.
A grammatical theory for the conformational changes of simple helix bundles. With Aravind K. Joshi and Ken A. Dill, 2006. J. Computational Biology 13(1):21–42.
The weak generative capacity of linear tree adjoining grammars. 2006. In Proc. TAG+8, 25–32. PDF
An introduction to synchronous grammars: part of a tutorial given at ACL 2006 with Kevin Knight. Notes: PDF Slides: PDF
A hierarchical phrase-based model for statistical machine translation. 2005. In Proc. ACL, 263–270. Best paper award. PDF
The Hiero machine translation system: extensions, evaluation, and analysis. 2005. In Proc. HLT/EMNLP, 779–786. PDF
Better k-best parsing. Liang Huang and David Chiang, 2005. In Proc. IWPT, 53–64. Corrected version. PDF
Evaluation of Grammar Formalisms for Applications to Natural Language Processing and Biological Sequence Analysis, PhD dissertation, University of Pennsylvania, 2004. Rubinoff Award. PDF Slides from defense: PDF
Uses and abuses of intersected languages. 2004. In Proc. TAG+7, 9–15. PDF
On relations of constituency and dependency grammars. Mark Dras, David Chiang, and William Schuler, 2004. Research on Language and Computation 2:281–305.
Statistical parsing with an automatically extracted tree adjoining grammar. 2003. In Data Oriented Parsing, CSLI Publications, 299–316. Combines and updates results from papers at ACL 2000 and the colocated Chinese Language Processing Workshop. PDF
Recovering latent information in treebanks. With Daniel M. Bikel, 2002. In Proc. COLING, 183–189. PDF code
Putting some weakly context-free formalisms in order. 2002. In Proc. TAG+6, 11–18. PDF
Formal grammars for estimating partition functions of double-stranded chain molecules. With Aravind K. Joshi, 2002. In Proc. HLT, 63–67. PDF
Facilitating Treebank annotation using a statistical parser. Fu-Dong Chiou, David Chiang, and Martha Palmer, 2001. In Proc. HLT, 117–120. PDF
Statistical parsing with an automatically extracted tree adjoining grammar. 2000. In Proc. ACL, 456–463. PDF code
Multi-component TAG and notions of formal power. William Schuler, David Chiang, and Mark Dras, 2000. In Proc. ACL, 448–455. PDF
Some remarks on an extension of synchronous TAG. With William Schuler and Mark Dras, 2000. In Proc. TAG+5, 61–66. PDF