Da Yin

I am a postdoctoral researcher at Meta FAIR. I obtained my Ph.D. in Computer Science at UCLA, advised by Kai‑Wei Chang, and was awarded the Amazon PhD Fellowship in 2023. My research focuses on making the reasoning and planning of LLM agents more adaptive and on improving their cross‑region and cross‑task generalizability.

✨ Currently I am interested in:

🎓 Education

📝 Full publication list:

ICML 2025

QLASS: Boosting Language Agent Inference via Q-Guided Stepwise Search

Zongyu Lin*, Yao Tang*, Xingcheng Yao*, Da Yin*, Ziniu Hu, Yizhou Sun, Kai-Wei Chang

(* denotes equal contribution)

NAACL 2025

Guiding Through Complexity: What Makes Good Supervision for Hard Reasoning Tasks?

Xuan He*, Da Yin*, Nanyun Peng

(* denotes equal contribution)

Oral Presentation

CVPR 2025

VISCO: Benchmarking Fine-Grained Critique and Correction Towards Self-Improvement in Visual Reasoning

Xueqing Wu, Yuheng Ding, Bingxuan Li, Pan Lu, Da Yin, Kai-Wei Chang, Nanyun Peng

ICLR 2025

Bridging the Data Provenance Gap Across Text, Speech and Video

Shayne Longpre, Nikhil Singh, Manuel Cherep, Kushagra Tiwary, Joanna Materzynska, William Brannon, Robert Mahari, Manan Dey, Mohammed Hamdy, Nayan Saxena, Ahmad Mustafa Anis, Emad A Alghamdi, Vu Minh Chien, Naana Obeng-Marnu, Da Yin, Kun Qian, Yizhi Li, Minnie Liang, An Dinh, Shrestha Mohanty, Deividas Mataciunas, Tobin South, Jianguo Zhang, Ariel N Lee, Campbell S Lund, Christopher Klamm, Damien Sileo, Diganta Misra, Enrico Shippole, Kevin Klyman, Lester JV Miranda, Niklas Muennighoff, Seonghyeon Ye, Seungone Kim, Vipul Gupta, Vivek Sharma, Xuhui Zhou, Caiming Xiong, Luis Villa, Stella Biderman, Alex Pentland, Sara Hooker, Jad Kabbara

NeurIPS 2024

SafeWorld: Geo‑Diverse Safety Alignment

Da Yin*, Haoyi Qiu*, Kung‑Hsiang Huang, Kai‑Wei Chang, Nanyun Peng

(* denotes equal contribution)

ACL 2024

🪄 Agent Lumos: Unified and Modular Training for Open-Source Language Agents

Da Yin, Faeze Brahman, Abhilasha Ravichander, Khyathi Chandu, Kai-Wei Chang, Yejin Choi, Bill Yuchen Lin

Featured by Marktechpost

ACL 2024

Trial and Error: Exploration-Based Trajectory Optimization for LLM Agents

Yifan Song, Da Yin, Xiang Yue, Jie Huang, Sujian Li, Bill Yuchen Lin

NeurIPS 2024

Consent in Crisis: The Rapid Decline of the AI Data Commons

Shayne Longpre, Robert Mahari, Ariel Lee, Campbell Lund, Hamidah Oderinwale, William Brannon, Nayan Saxena, Naana Obeng-Marnu, Tobin South, Cole Hunter, Kevin Klyman, Christopher Klamm, Hailey Schoelkopf, Nikhil Singh, Manuel Cherep, Ahmad Anis, An Dinh, Caroline Chitongo, Da Yin, Damien Sileo, Deividas Mataciunas, Diganta Misra, Emad Alghamdi, Enrico Shippole, Jianguo Zhang, Joanna Materzynska, Kun Qian, Kush Tiwary, Lester Miranda, Manan Dey, Minnie Liang, Mohammed Hamdy, Niklas Muennighoff, Seonghyeon Ye, Seungone Kim, Shrestha Mohanty, Vipul Gupta, Vivek Sharma, Vu Minh Chien, Xuhui Zhou, Yizhi Li, Caiming Xiong, Luis Villa, Stella Biderman, Hanlin Li, Daphne Ippolito, Sara Hooker, Jad Kabbara, Sandy Pentland

Featured by New York Times

ACL 2024 Findings

KPEval: Towards Fine-Grained Semantic-Based Keyphrase Evaluation

Di Wu, Da Yin, Kai‑Wei Chang

EMNLP 2023

Dynosaur: A Dynamic Growth Paradigm for Instruction-Tuning Data Curation

Da Yin*, Xiao Liu*, Fan Yin*, Ming Zhong*, Hritik Bansal, Jiawei Han, Kai‑Wei Chang

(* denotes equal contribution)

Pan-DL @ EMNLP

LEAF: Linguistically Enhanced Event Temporal Relation Framework

Stanley Lim, Da Yin, Nanyun Peng

🏆 Best Paper Award

CVPR 2023

GIVL: Improving Geographical Inclusivity of Vision-Language Models with Pre-Training Methods

Da Yin, Feng Gao, Govind Thattai, Michael Johnston, Kai‑Wei Chang

EMNLP 2022

GeoMLAMA: Geo-Diverse Commonsense Probing on Multilingual Pre-Trained Language Models

Da Yin, Hritik Bansal, Masoud Monajatipoor, Liunian Harold Li, Kai-Wei Chang

Oral Presentation

EMNLP 2022

How well can Text-to-Image Generative Models understand Ethical Natural Language Interventions?

Hritik Bansal*, Da Yin*, Masoud Monajatipoor, Kai-Wei Chang

(* denotes equal contribution)

Oral Presentation

EMNLP 2022

Towards a Unified Multi‑Dimensional Evaluator for Text Generation

Ming Zhong, Yang Liu, Da Yin, Yuning Mao, Yizhu Jiao, Pengfei Liu, Chenguang Zhu, Heng Ji, Jiawei Han

Oral Presentation

ACL 2022

Things not Written in Text: Exploring Spatial Commonsense from Visual Signals

Xiao Liu, Da Yin, Yansong Feng, Dongyan Zhao

EMNLP 2021

Broaden the Vision: Geo-Diverse Visual Commonsense Reasoning

Da Yin, Liunian Harold Li, Ziniu Hu, Nanyun Peng, Kai-Wei Chang

Oral Presentation

NAACL 2021

Everything Has a Cause: Leveraging Causal Inference in Legal Text Analysis

Xiao Liu*, Da Yin*, Yansong Feng, Yuting Wu, Dongyan Zhao

(* denotes equal contribution)

Oral Presentation

NAACL 2021

QMSum: A New Benchmark for Query-based Multi-domain Meeting Summarization

Ming Zhong*, Da Yin*, Tao Yu, Ahmad Zaidi, Mutethia Mutuma, Rahul Jha, Ahmed Hassan Awadallah, Asli Celikyilmaz, Yang Liu, Xipeng Qiu, Dragomir Radev

(* denotes equal contribution)

Oral Presentation

ACL 2020

SentiBERT: A Transferable Transformer-Based Architecture for Compositional Sentiment Semantics

Da Yin, Tao Meng, Kai‑Wei Chang

ACL 2020

What Does BERT with Vision Look At?

a.k.a. VisualBERT: A Simple and Performant Baseline for Vision and Language

Liunian Harold Li, Mark Yatskar, Da Yin, Cho-Jui Hsieh, Kai-Wei Chang

CIKM 2019

Interactive Multi-Grained Joint Model for Targeted Sentiment Analysis

Da Yin*, Xiao Liu*, Xiaojun Wan

(* denotes equal contribution)

Oral Presentation