David Chiang 蔣偉

Associate Professor, Computer Science and Engineering
Natural Language Processing Group

My research is in natural language processing, the subfield of computer science that aims to enable computers to understand and produce human language. I focus mainly on language translation, and am interested in syntactic parsing and other areas as well.

Recent and selected publications

Andy Yang, Lena Strobl, David Chiang, and Dana Angluin. Simulating hard attention using soft attention. arXiv:2412.09925.
David Chiang. Transformers in uniform TC\(^0\). arXiv:2409.13629.
Andy Yang, David Chiang, and Dana Angluin. Masked hard-attention transformers recognize exactly the star-free languages. In Proc. NeurIPS. 2024. To appear.
Ken Sible and David Chiang. Improving rare word translation with dictionaries and attention masking. In Proc. AMTA. 2024.
Andy Yang and David Chiang. Counting like transformers: compiling temporal counting logic into softmax transformers. In Proc. CoLM. 2024.
Aarohi Srivastava and David Chiang. We're calling an intervention: taking a closer look at language model adaptation to different types of linguistic variation. 2024. arXiv:2404.07304.
Lena Strobl, Dana Angluin, David Chiang, Jonathan Rawski, and Ashish Sabharwal. Transformers as transducers. Transactions of the Association for Computational Linguistics, 2024. To appear.
Chihiro Taguchi and David Chiang. Language complexity and speech recognition accuracy: orthographic complexity hurts, phonological complexity doesn't. In Proc. ACL. 2024. Outstanding Paper Award and Senior Area Chair Award.
Fahim Faisal, Orevaoghene Ahia, Aarohi Srivastava, Kabir Ahuja, David Chiang, Yulia Tsvetkov, and Antonios Anastasopoulos. DIALECTBENCH: a NLP benchmark for dialects, varieties, and closely-related languages. In Proc. ACL. 2024. Social Impact Award.
Stephen Bothwell, Brian DuSell, David Chiang, and Brian Krostenko. PILA: a historical-linguistic dataset of Proto-Italic and Latin. In Proc. LREC-COLING, 12749–12760. 2024.
Chihiro Taguchi, Jefferson Saransig, Dayana Velásquez, and David Chiang. KILLKAN: the automatic speech recognition dataset for Kichwa with morphosyntactic information. In Proc. LREC-COLING, 9753–9763. 2024.
Lena Strobl, William Merrill, Gail Weiss, David Chiang, and Dana Angluin. What formal languages can transformers express? A survey. Transactions of the Association for Computational Linguistics, 12:543–561, 2024. doi:10.1162/tacl_a_00663.
Brian DuSell and David Chiang. Stack attention: improving the ability of transformers to model hierarchical patterns. In Proc. ICLR. 2024. Spotlight paper.
Stephen Bothwell, Justin DeBenedetto, Theresa Crnkovich, Hildegund Müller, and David Chiang. Introducing rhetorical parallelism detection: a new task with datasets, metrics, and baselines. In Proc. EMNLP, 5007–5039. 2023. doi:10.18653/v1/2023.emnlp-main.305.
Aarohi Srivastava and David Chiang. BERTwich: extending BERT's capabilities to model dialectal and noisy text. In Findings of ACL: EMNLP. 2023.
Chihiro Taguchi, Yusuke Sakai, Parisa Haghani, and David Chiang. Universal automatic phonetic transcription into the International Phonetic Alphabet. In Proc. INTERSPEECH. 2023. doi:10.21437/Interspeech.2023-2584.
David Chiang, Peter Cholak, and Anand Pillay. Tighter bounds on the expressivity of transformer encoders. In Proc. ICML, 5544–5562. 2023.
Aarohi Srivastava and David Chiang. Fine-tuning BERT with character-level noise for zero-shot transfer to dialects and closely-related languages. In Proc. Workshop on NLP for Similar Languages, Varieties and Dialects. 2023.
David Chiang, Colin McDonald, and Chung-chieh Shan. Exact recursive probabilistic programming. PACMPL, 2023. doi:10.1145/3586050.
David Chiang, Alexander M. Rush, and Boaz Barak. Named tensor notation. Transactions on Machine Learning Research, January 2023.
Alexandra Butoi, Brian DuSell, Tim Vieira, Ryan Cotterell, and David Chiang. Algorithms for weighted pushdown automata. In Yoav Goldberg, Zornitsa Kozareva, and Yue Zhang, editors, Proc. EMNLP, 9669–9680. 2022. doi:10.18653/v1/2022.emnlp-main.656.
David Chiang and Peter Cholak. Overcoming a theoretical limitation of self-attention. In Smaranda Muresan, Preslav Nakov, and Aline Villavicencio, editors, Proc. ACL, volume 1, 7654–7664. 2022. doi:10.18653/v1/2022.acl-long.527.
