Publications

Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More?

Jinhyuk Lee, Anthony Chen, Zhuyun Dai, Dheeru Dua, Devendra Singh Sachan, Michael Boratko, Yi Luan, Sébastien M. R. Arnold, Vincent Perot, Siddharth Dalmia, Hexiang Hu, Xudong Lin, Panupong Pasupat, Aida Amini, Jeremy R. Cole, Sebastian Riedel, Iftekhar Naim, Ming-Wei Chang, Kelvin Guu

arXiv

Data

The Data Provenance Initiative: A Large Scale Audit of Dataset Licensing & Attribution in AI

Shayne Longpre, Robert Mahari, Anthony Chen, Naana Obeng-Marnu, Damien Sileo, William Brannon, Niklas Muennighoff, Nathan Khazam, Jad Kabbara, Kartik Perisetla, Xinyi Wu, Enrico Shippole, Kurt Bollacker, Tongshuang Wu, Luis Villa, Sandy Pentland, Sara Hooker

arXiv

The Data Provenance Project

Shayne Longpre, Robert Mahari, Niklas Muennighoff, Anthony Chen, Kartik Perisetla, William Brannon, Jad Kabbara, Luis Villa, Sara Hooker

Generative AI + Law (GenLaw) @ International Conference on Machine Learning (ICML) 2023

PURR: Efficiently Editing Language Model Hallucinations by Denoising Language Model Corruptions

Anthony Chen, Panupong Pasupat, Sameer Singh, Hongrae Lee and Kelvin Guu

arXiv

RARR: Researching and Revising What Language Models Say, Using Language Models

Luyu Gao, Zhuyun Dai, Panupong Pasupat, Anthony Chen, Arun Tejasvi Chaganty, Yicheng Fan, Vincent Y. Zhao, Ni Lao, Hongrae Lee, Da-Cheng Juan and Kelvin Guu

Association for Computational Linguistics (ACL) 2023

Entity-Based Knowledge Conflicts in Question Answering

Shayne Longpre, Kartik Perisetla, Anthony Chen, Nikhil Ramesh, Chris DuBois and Sameer Singh

Empirical Methods in Natural Language Processing (EMNLP) 2021

Website Code Slides Poster

Evaluating Entity Disambiguation and the Role of Popularity in Retrieval-Based NLP

Anthony Chen, Pallavi Gudipati, Shayne Longpre, Xiao Ling and Sameer Singh

Association for Computational Linguistics (ACL-IJCNLP) 2021

Website Code Data Slides

MOCHA: A Dataset for Training and Evaluating Generative Reading Comprehension Metrics

Anthony Chen, Gabriel Stanovsky, Sameer Singh and Matt Gardner

Empirical Methods in Natural Language Processing (EMNLP) 2020

Website Code Data Video Slides Demo

Evaluating Question Answering Evaluation (Best Paper)

Anthony Chen, Gabriel Stanovsky, Sameer Singh and Matt Gardner

Machine Reading for Question Answering Workshop @ Empirical Methods in Natural Language Processing (EMNLP) 2019

Data Slides