Darling
Our new work, "Jointly Reinforcing Diversity and Quality in Language Model Generations", is out! In this work, we study how to use online reinforcement learning to make language models generate diverse outputs without sacrificing quality.