New preprint on truncating noisy data for training text generation models!