2026

How to Flexibly Control Entropy in LLM RL (Chinese version)
March 17, 2026 LLM RLVR Entropy
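This article presents our recent study of RLVR training dynamics. Starting from the theoretical perspective of Gradient-Preserving Clipping, we propose a flexible entropy regulation mechanism. Building on its ability to raise or lower entropy, we test three entropy control strategies: increase then decrease, decrease-increase-decrease, and dynamic decay. Experiments show that this approach effectively mitigates policy entropy collapse in GRPO training.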
How to Flexibly Control Entropy in LLM Reinforcement Learning
March 17, 2026 LLM RLVR Entropy
This article introduces our recent research on RLVR training dynamics. Starting from the theoretical perspective of Gradient-Preserving Clipping, we propose a flexible entropy regulation mechanism...
Oh My Zsh Guide
March 13, 2026 shell zsh ai-gen
This article introduces the basic usage of Oh My Zsh, including installation, theme configuration, and recommended plugins with their configuration.
Oh My Zsh Guide (Chinese version)
March 13, 2026 shell zsh ai-gen
This article introduces the basic usage of Oh My Zsh, including installation, theme configuration, and recommended plugins with their configuration.