news | Tianjian Li

Apr 16, 2026	Our new work: Many-Tier Instruction Hierarchy in LLM Agents is out! In this work, we ask: what happens when LLM agents must resolve conflicting instructions across not just 3-4 privilege levels, but 12+?
Mar 20, 2026	Our new work: Reasoning over mathematical objects: on-policy reward modeling and test time aggregation is out! In this work, we 1) build and release training data for deriving mathematical objects; 2) show that on-policy RL with a strong verifier boosts performance; and 3) show that on-policy training on parallel generation + verification boosts performance further.
Mar 1, 2026	I will be returning to Meta AI Research (FAIR) in NYC as a research intern in summer 2026!
Sep 4, 2025	Our new work: Jointly Reinforcing Diversity and Quality in Language Model Generations is out! In this work, we studied how to make language models generate diverse outputs without sacrificing quality using online reinforcement learning.
May 1, 2025	SimpleMix has been accepted to ICML 2025!! In this work, we studied the interplay between on- and off-policy data in preference optimization.
Jan 23, 2025	3 papers have been accepted to NAACL 🎉, including my work on training on heavily imbalanced datasets, Jack’s work on making language models produce verbatim quotes from training data, and Yining’s work on evaluating the creativity of language models on code generation. I am super grateful to my wonderful co-authors!
Dec 11, 2024	I will be joining Meta AI Research (FAIR) as a research intern in summer 2025!
Dec 6, 2024	New blog post on why both the chosen and the rejected log-probs decrease during DPO, and why this is to some extent beneficial for alignment.
Oct 4, 2024	New preprint on how to train on heavily imbalanced datasets!!
Apr 7, 2024	I will be staying at Johns Hopkins University for my PhD, working with Prof. Daniel Khashabi!
Jan 15, 2024	Error Norm Truncation has been accepted to ICLR 2024 (spotlight)!!
Nov 8, 2023	New blog post on the latest advances in balanced training for Multilingual Machine Translation!
Oct 2, 2023	New preprint on truncating noisy data for training text generation models!!
Jul 31, 2023	New blog post on estimating data utility.
Jun 12, 2023	One paper accepted to ACL 2023 (Findings).