Tianjian Li

Center for Language and Speech Processing, Johns Hopkins University

Hi👋, I’m Tianjian! I’m a PhD student in Computer Science at Johns Hopkins University, proudly advised by Prof. Daniel Khashabi. Previously, I completed my Master’s degree in Computer Science at JHU, where I had the privilege of working closely with my wonderful advisors, Kenton Murray and Philipp Koehn. Before that, I was an undergraduate at New York University.

My research lies at the intersection of machine learning and natural language processing.

I prefer solutions that are simple, generalizable, and theoretically sound.

If you have anything to share with me, please feel free to contact me through my email: tli104 at jhu.edu

news

Dec 11, 2024 I will be joining Meta AI Research (FAIR) as a research intern in summer 2025!
Dec 6, 2024 New blog post on why both the chosen and the rejected log-probs decrease during DPO, and why this is to some extent beneficial for alignment.
Oct 4, 2024 New preprint on how to train on heavily imbalanced datasets!!
Apr 7, 2024 I will be staying at Johns Hopkins University for my PhD, working with Prof. Daniel Khashabi!
Jan 15, 2024 Error Norm Truncation has been accepted to ICLR 2024 (spotlight)!!

selected publications

  1. preprint
    Upsample or Upweight? Balanced Training on Heavily Imbalanced Datasets
    Tianjian Li, Haoran Xu, Weiting Tan, and 2 more authors
    2024
  2. preprint
    Benchmarking Language Model Creativity: A Case Study on Code Generation
    Yining Lu, Dixuan Wang, Tianjian Li, and 2 more authors
    2024
  3. preprint
    Verifiable by Design: Aligning Language Models to Quote from Pre-Training Data
    Jingyu Zhang, Marc Marone, Tianjian Li, and 2 more authors
    2024
  4. ICLR
    Error Norm Truncation: Robust Training in the Presence of Data Noise for Text Generation Models
    Tianjian Li, Haoran Xu, Philipp Koehn, and 2 more authors
    In The Twelfth International Conference on Learning Representations (ICLR)
    (Spotlight - Top 5%), 2024
  5. ACL
    Why Does Zero-shot Cross-lingual Generation Fail? An Explanation and A Solution
    Tianjian Li and Kenton Murray
    In Proceedings of the 2023 Annual Meeting of the Association for Computational Linguistics (ACL Findings), Jul 2023