A New Fine-Tuning Approach for LLMs Using Evolution Strategies
This work challenges reinforcement learning as the default method for LLM post-training. We present the first successful use of evolution strategies to fine-tune the full parameter set of large language models.
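
For readers unfamiliar with evolution strategies, the following is a minimal sketch of the general idea on a toy problem, not the method used in this work. It assumes a flat parameter vector and a black-box reward function. The names `evolution_strategies`, `toy_reward`, and `TARGET`, and all hyperparameter values, are hypothetical choices made for illustration. Each iteration samples Gaussian perturbations of the parameters, scores each perturbed copy, and moves the parameters along the reward-weighted average of the perturbations, with no backpropagation through the reward.

```python
import numpy as np


def evolution_strategies(reward_fn, theta, sigma=0.1, lr=0.05,
                         population=50, iterations=200, seed=0):
    """Basic evolution strategy over a flat parameter vector (illustrative only)."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta, dtype=float).copy()
    for _ in range(iterations):
        # Sample one Gaussian perturbation per population member.
        noise = rng.standard_normal((population, theta.size))
        # Score each perturbed parameter vector; only reward values are needed.
        rewards = np.array([reward_fn(theta + sigma * eps) for eps in noise])
        # Subtract the mean reward as a baseline to reduce variance.
        centered = rewards - rewards.mean()
        # Reward-weighted average of the perturbations estimates the ascent direction.
        theta = theta + lr / (population * sigma) * (noise.T @ centered)
    return theta


# Hypothetical toy objective: reward is highest when params equal TARGET.
TARGET = np.array([1.0, -2.0, 0.5])


def toy_reward(params):
    return -np.sum((params - TARGET) ** 2)


tuned = evolution_strategies(toy_reward, np.zeros(3))
print(tuned)
```

In this toy setting the parameters drift toward `TARGET`. Applying the same loop to an LLM would mean perturbing billions of parameters and scoring each perturbed model on a task reward, which is what makes full-parameter fine-tuning at that scale notable.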