Da Yin

SEA & LAW @ NeurIPS 2025

LLMs as Scalable, General-Purpose Simulators For Evolving Digital Agent Training

Da Yin*†‡, Yiming Wang*†, Yuedong Cui*, Ruichen Zheng*, Zhiqian Li, Zongyu Lin, Di Wu, Xueqing Wu, Chenchen Ye, Yu Zhou, Kai-Wei Chang‡

(* denotes co-first authors; † denotes co-leaders, in alphabetical order; ‡ denotes co-advisors)

Spotlight at NeurIPS 2025 @ LAW Workshop

📄 Paper 💻 GitHub 🌐 Website 🤗 Huggingface

ICLR 2026

RefTool: Enhancing Model Reasoning with Reference-Guided Tool Creation

Xiao Liu, Da Yin, Zirui Wu, Yansong Feng

📄 Paper 💻 GitHub 🌐 Website

NeurIPS 2025 Datasets & Benchmarks

Embodied Web Agents: Bridging Physical-Digital Realms for Integrated Agent Intelligence

Yining Hong, Rui Sun, Bingxuan Li, Xingcheng Yao, Maxine Wu, Alexander Chien, Da Yin, Ying Nian Wu, Zhecan James Wang, Kai-Wei Chang

Spotlight (2.8%)

📄 Paper 💻 GitHub

Computational Linguistics 2025 (Presented at ACL 2025)

Eliciting and Improving the Causal Reasoning Abilities of Large Language Models with Conditional Statements

Xiao Liu, Da Yin, Chen Zhang, Yansong Feng, Dongyan Zhao

📄 Paper

ICCV 2025

Verbalized Representation Learning for Interpretable Few-Shot Generalization

Cheng-Fu Yang, Da Yin, Wenbo Hu, Nanyun Peng, Bolei Zhou, Kai-Wei Chang

📄 Paper 💻 GitHub

ICML 2025

QLASS: Boosting Language Agent Inference via Q-Guided Stepwise Search

Zongyu Lin*, Yao Tang*, Xingcheng Yao*, Da Yin*, Ziniu Hu, Yizhou Sun, Kai-Wei Chang

(* denotes equal contribution)

📄 Paper 💻 GitHub 🌐 Website

NAACL 2025

Guiding Through Complexity: What Makes Good Supervision for Hard Reasoning Tasks?

Xuan He*, Da Yin*, Nanyun Peng

(* denotes equal contribution)

Oral Presentation

📄 Paper 💻 GitHub

CVPR 2025

VISCO: Benchmarking Fine-Grained Critique and Correction Towards Self-Improvement in Visual Reasoning

Xueqing Wu, Yuheng Ding, Bingxuan Li, Pan Lu, Da Yin, Kai-Wei Chang, Nanyun Peng

📄 Paper 💻 GitHub 🌐 Website

ICLR 2025

Bridging the Data Provenance Gap Across Text, Speech and Video

Shayne Longpre, Nikhil Singh, Manuel Cherep, Kushagra Tiwary, Joanna Materzynska, William Brannon, Robert Mahari, Manan Dey, Mohammed Hamdy, Nayan Saxena, Ahmad Mustafa Anis, Emad A Alghamdi, Vu Minh Chien, Naana Obeng-Marnu, Da Yin, Kun Qian, Yizhi Li, Minnie Liang, An Dinh, Shrestha Mohanty, Deividas Mataciunas, Tobin South, Jianguo Zhang, Ariel N Lee, Campbell S Lund, Christopher Klamm, Damien Sileo, Diganta Misra, Enrico Shippole, Kevin Klyman, Lester JV Miranda, Niklas Muennighoff, Seonghyeon Ye, Seungone Kim, Vipul Gupta, Vivek Sharma, Xuhui Zhou, Caiming Xiong, Luis Villa, Stella Biderman, Alex Pentland, Sara Hooker, Jad Kabbara

📄 Paper

NeurIPS 2024

SafeWorld: Geo-Diverse Safety Alignment

Da Yin*, Haoyi Qiu*, Kung-Hsiang Huang, Kai-Wei Chang, Nanyun Peng

(* denotes equal contribution)

📄 Paper 💻 Code

ACL 2024

🪄 Agent Lumos: Unified and Modular Training for Open-Source Language Agents

Da Yin, Faeze Brahman, Abhilasha Ravichander, Khyathi Chandu, Kai-Wei Chang, Yejin Choi, Bill Yuchen Lin

Featured by Marktechpost

📄 Paper 💻 Code 🌐 Website

ACL 2024

Trial and Error: Exploration-Based Trajectory Optimization for LLM Agents

Yifan Song, Da Yin, Xiang Yue, Jie Huang, Sujian Li, Bill Yuchen Lin

📄 Paper 💻 Code

NeurIPS 2024

Consent in Crisis: The Rapid Decline of the AI Data Commons

Shayne Longpre, Robert Mahari, Ariel Lee, Campbell Lund, Hamidah Oderinwale, William Brannon, Nayan Saxena, Naana Obeng-Marnu, Tobin South, Cole Hunter, Kevin Klyman, Christopher Klamm, Hailey Schoelkopf, Nikhil Singh, Manuel Cherep, Ahmad Anis, An Dinh, Caroline Chitongo, Da Yin, Damien Sileo, Deividas Mataciunas, Diganta Misra, Emad Alghamdi, Enrico Shippole, Jianguo Zhang, Joanna Materzynska, Kun Qian, Kush Tiwary, Lester Miranda, Manan Dey, Minnie Liang, Mohammed Hamdy, Niklas Muennighoff, Seonghyeon Ye, Seungone Kim, Shrestha Mohanty, Vipul Gupta, Vivek Sharma, Vu Minh Chien, Xuhui Zhou, Yizhi Li, Caiming Xiong, Luis Villa, Stella Biderman, Hanlin Li, Daphne Ippolito, Sara Hooker, Jad Kabbara, Sandy Pentland

Featured by New York Times

📄 Paper 💻 Code 🌐 Website

ACL 2024 Findings

KPEval: Towards Fine-Grained Semantic-Based Keyphrase Evaluation

Di Wu, Da Yin, Kai-Wei Chang

📄 Paper 💻 Code

EMNLP 2023

Dynosaur: A Dynamic Growth Paradigm for Instruction-Tuning Data Curation

Da Yin*, Xiao Liu*, Fan Yin*, Ming Zhong*, Hritik Bansal, Jiawei Han, Kai-Wei Chang

(* denotes equal contribution)

📄 Paper 💻 GitHub 🌐 Website

Pan-DL @ EMNLP

LEAF: Linguistically Enhanced Event Temporal Relation Framework

Stanley Lim, Da Yin, Nanyun Peng

🏆 Best Paper Award

📄 Paper

CVPR 2023

GIVL: Improving Geographical Inclusivity of Vision-Language Models with Pre-Training Methods

Da Yin, Feng Gao, Govind Thattai, Michael Johnston, Kai-Wei Chang

📄 Paper

ACL 2020

What Does BERT with Vision Look At?

a.k.a. VisualBERT: A Simple and Performant Baseline for Vision and Language

Liunian Harold Li, Mark Yatskar, Da Yin, Cho-Jui Hsieh, Kai-Wei Chang

📄 Paper 💻 GitHub

CIKM 2019

Interactive Multi-Grained Joint Model for Targeted Sentiment Analysis

Da Yin*, Xiao Liu*, Xiaojun Wan

(* denotes equal contribution)

Oral Presentation

📄 Paper

🎓 Education

📝 Full publication list:

LLMs as Scalable, General-Purpose Simulators For Evolving Digital Agent Training

Spotlight at NeurIPS 2025 @ LAW Workshop

RefTool: Enhancing Model Reasoning with Reference-Guided Tool Creation

Embodied Web Agents: Bridging Physical-Digital Realms for Integrated Agent Intelligence

Spotlight (2.8%)

Eliciting and Improving the Causal Reasoning Abilities of Large Language Models with Conditional Statements

Verbalized Representation Learning for Interpretable Few-Shot Generalization

QLASS: Boosting Language Agent Inference via Q-Guided Stepwise Search

Guiding Through Complexity: What Makes Good Supervision for Hard Reasoning Tasks?

Oral Presentation

VISCO: Benchmarking Fine-Grained Critique and Correction Towards Self-Improvement in Visual Reasoning

Bridging the Data Provenance Gap Across Text, Speech and Video

SafeWorld: Geo-Diverse Safety Alignment

🪄 Agent Lumos: Unified and Modular Training for Open-Source Language Agents

Featured by Marktechpost

Trial and Error: Exploration-Based Trajectory Optimization for LLM Agents

Consent in Crisis: The Rapid Decline of the AI Data Commons

Featured by New York Times

KPEval: Towards Fine-Grained Semantic-Based Keyphrase Evaluation

Dynosaur: A Dynamic Growth Paradigm for Instruction-Tuning Data Curation

LEAF: Linguistically Enhanced Event Temporal Relation Framework

🏆 Best Paper Award

GIVL: Improving Geographical Inclusivity of Vision-Language Models with Pre-Training Methods

GeoMLAMA: Geo-Diverse Commonsense Probing on Multilingual Pre-Trained Language Models

Oral Presentation

How well can Text-to-Image Generative Models understand Ethical Natural Language Interventions?

Oral Presentation

Towards a Unified Multi-Dimensional Evaluator for Text Generation

Oral Presentation

Things not Written in Text: Exploring Spatial Commonsense from Visual Signals

Broaden the Vision: Geo-Diverse Visual Commonsense Reasoning

Oral Presentation

Everything Has a Cause: Leveraging Causal Inference in Legal Text Analysis

Oral Presentation

QMSum: A New Benchmark for Query-based Multi-domain Meeting Summarization

Oral Presentation

SentiBERT: A Transferable Transformer-Based Architecture for Compositional Sentiment Semantics

What Does BERT with Vision Look At?

Interactive Multi-Grained Joint Model for Targeted Sentiment Analysis

Oral Presentation