Anthony Chen - Projects

Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More?

Jinhyuk Lee, Anthony Chen, Zhuyun Dai, Dheeru Dua, Devendra Singh Sachan, Michael Boratko, Yi Luan, Sébastien M. R. Arnold, Vincent Perot, Siddharth Dalmia, Hexiang Hu, Xudong Lin, Panupong Pasupat, Aida Amini, Jeremy R. Cole, Sebastian Riedel, Iftekhar Naim, Ming-Wei Chang, Kelvin Guu

arXiv

Data

The Data Provenance Initiative: A Large Scale Audit of Dataset Licensing & Attribution in AI

Shayne Longpre, Robert Mahari, Anthony Chen, Naana Obeng-Marnu, Damien Sileo, William Brannon, Niklas Muennighoff, Nathan Khazam, Jad Kabbara, Kartik Perisetla, Xinyi Wu, Enrico Shippole, Kurt Bollacker, Tongshuang Wu, Luis Villa, Sandy Pentland, Sara Hooker

arXiv

Website Code Slides Poster

Evaluating Entity Disambiguation and the Role of Popularity in Retrieval-Based NLP

Anthony Chen, Pallavi Gudipati, Shayne Longpre, Xiao Ling, and Sameer Singh

Association for Computational Linguistics (ACL-IJCNLP) 2021

Website Code Data Slides

MOCHA: A Dataset for Training and Evaluating Generative Reading Comprehension Metrics

Anthony Chen, Gabriel Stanovsky, Sameer Singh, and Matt Gardner

Empirical Methods in Natural Language Processing (EMNLP) 2020

Website Code Data Video Slides Demo

Evaluating Question Answering Evaluation (Best Paper)

Anthony Chen, Gabriel Stanovsky, Sameer Singh, and Matt Gardner

Machine Reading for Question Answering Workshop @ Empirical Methods in Natural Language Processing (EMNLP) 2019

Data Slides

Publications

Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More?

The Data Provenance Initiative: A Large Scale Audit of Dataset Licensing & Attribution in AI

The Data Provenance Project

PURR: Efficiently Editing Language Model Hallucinations by Denoising Language Model Corruptions

RARR: Researching and Revising What Language Models Say, Using Language Models

Entity-Based Knowledge Conflicts in Question Answering

Evaluating Entity Disambiguation and the Role of Popularity in Retrieval-Based NLP

MOCHA: A Dataset for Training and Evaluating Generative Reading Comprehension Metrics

Evaluating Question Answering Evaluation (Best Paper)