Publications


Breaking the Curse of Multilinguality with Cross-lingual Expert Language Models
Terra Blevins, Tomasz Limisiewicz, Suchin Gururangan, Margaret Li, Hila Gonen, Noah A. Smith, and Luke Zettlemoyer. To Appear in Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024.
[paper] [models] [code] [poster]
Targeted Multilingual Adaptation for Low-resource Language Families
C.M. Downey, Terra Blevins, Dhwani Serai, Dwija Parikh, and Shane Steinert-Threlkeld. To Appear in Findings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024.
[paper]
Does Liking Yellow Imply Driving a School Bus? Semantic Leakage in Language Models
Hila Gonen, Terra Blevins, Alisa Liu, Luke Zettlemoyer, and Noah A. Smith. Preprint.
[paper]
MYTE: Morphology-Driven Byte Encoding for Better and Fairer Multilingual Language Modeling
Tomasz Limisiewicz, Terra Blevins, Hila Gonen, Orevaoghene Ahia, and Luke Zettlemoyer. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL), 2024.
[paper]
Universal NER: A Gold-Standard Multilingual Named Entity Recognition Benchmark
Stephen Mayhew, Terra Blevins, Shuheng Liu, Marek Šuppa, Hila Gonen, Joseph Marvin Imperial, Börje F. Karlsson, Peiqin Lin, Nikola Ljubešić, LJ Miranda, Barbara Plank, Arij Riabi, and Yuval Pinter. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2024.
[paper] [UNER project] [slides] [poster]
BUFFET: Benchmarking Large Language Models for Few-shot Cross-lingual Transfer
Akari Asai, Sneha Kudugunta, Xinyan Velocity Yu, Terra Blevins, Hila Gonen, Machel Reid, Yulia Tsvetkov, Sebastian Ruder, and Hannaneh Hajishirzi. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2024.
[paper]
Comparing Hallucination Detection Metrics for Multilingual Generation
Haoqiang Kang, Terra Blevins, and Luke Zettlemoyer. Preprint, 2024.
[paper]
Translate to Disambiguate: Zero-shot Multilingual Word Sense Disambiguation with Pretrained Language Models
Haoqiang Kang*, Terra Blevins*, and Luke Zettlemoyer. In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2024.
[paper] [slides]
Detecting Pretraining Data from Large Language Models
Weijia Shi, Anirudh Ajith, Mengzhou Xia, Yangsibo Huang, Daogao Liu, Terra Blevins, Danqi Chen, and Luke Zettlemoyer. In the International Conference on Learning Representations (ICLR), 2024.
[paper]
Embedding structure matters: Comparing methods to adapt multilingual vocabularies to new languages
C.M. Downey, Terra Blevins, Nora Goldfine, and Shane Steinert-Threlkeld. In Proceedings of the 3rd Multilingual Representation Learning (MRL) Workshop, 2023. Awarded Best Paper!
[paper]
Demystifying Prompts in Language Models via Perplexity Estimation
Hila Gonen, Srini Iyer, Terra Blevins, Noah A. Smith, and Luke Zettlemoyer. In Findings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023.
[paper]
Prompting Language Models For Linguistic Structure
Terra Blevins, Hila Gonen, and Luke Zettlemoyer. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL), 2023.
[paper] [code] [poster] [the gradient]
Analyzing the Mono- and Cross-Lingual Pretraining Dynamics of Multilingual Language Models
Terra Blevins, Hila Gonen, and Luke Zettlemoyer. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022.
[paper] [models] [poster]
Language Contamination Helps Explain the Cross-lingual Capabilities of English Pretrained Models
Terra Blevins and Luke Zettlemoyer. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022.
[paper] [poster]
Few-shot Mining of Naturally Occurring Inputs and Outputs
Mandar Joshi, Terra Blevins, Mike Lewis, Daniel S. Weld, and Luke Zettlemoyer. Preprint, 2022.
[paper]
FEWS: Large-scale, Low-shot Word Sense Disambiguation with the Dictionary
Terra Blevins, Mandar Joshi, and Luke Zettlemoyer. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2021.
[paper] [data] [slides] [poster]
Moving Down the Long Tail of Word Sense Disambiguation with Gloss Informed Bi-encoders
Terra Blevins and Luke Zettlemoyer. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL), 2020.
[paper] [code] [slides]
Better Character Language Modeling Through Morphology
Terra Blevins and Luke Zettlemoyer. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL), 2019.
[paper] [poster] [slides]
Deep RNNs Encode Soft Hierarchical Syntax
Terra Blevins, Omer Levy, and Luke Zettlemoyer. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL), 2018.
[paper] [poster]
Automatically Processing Tweets from Gang-Involved Youth: Towards Detecting Loss and Aggression
Terra Blevins, Robert Kwiatkowski, Jamie MacBeth, Kathleen McKeown, Desmond Patton, and Owen Rambow. In Proceedings of the International Conference on Computational Linguistics (COLING), 2016.
[paper]
Mining Paraphrasal Typed Templates from a Plain Text Corpus
Or Biran, Terra Blevins, and Kathleen McKeown. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (ACL), 2016.
[paper]