2026

APEX-Searcher: Augmenting LLMs' Search Capabilities through Agentic Planning and Execution

Kun Chen, Qingchao Kong#, Feifei Zhao, Wenji Mao

ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR) 2026 Under Review

We propose APEX-Searcher, a framework that augments LLMs' search capabilities through agentic planning and execution to improve information retrieval.

Flexible Entropy Control in RLVR with Gradient-Preserving Perspective

Kun Chen, Peng Shi#, Fanfan Liu, Haibo Qiu, Zhixiong Zeng, Siqi Yang, Wenji Mao#

International Conference on Machine Learning (ICML) 2026 Under Review

From a gradient-preservation perspective, we propose a mechanism for regulating entropy increase and decrease, along with three strategies for controlling entropy during training with Reinforcement Learning with Verifiable Rewards (RLVR).

SPECS: Decoupling Multimodal Learning via Self-distilled Preference-based Cold Start

Kun Chen*, Peng Shi*, Haibo Qiu, Zhixiong Zeng, Siqi Yang, Wenji Mao#, Lin Ma#

International Conference on Learning Representations (ICLR) 2026 Poster

We propose SPECS, a self-distilled, preference-based cold-start method that decouples multimodal learning to improve multimodal reasoning capabilities.

2025

DEMO: Reframing Dialogue Interaction with Fine-grained Element Modeling

Minzheng Wang, Xinghua Zhang, Kun Chen, Nan Xu, Haiyang Yu, Fei Huang, Wenji Mao#, Yongbin Li#

Annual Meeting of the Association for Computational Linguistics (ACL) 2025 Findings

We propose DEMO, a framework that reframes dialogue interaction through fine-grained element modeling to improve dialogue understanding and generation.


Marks: * equal contribution, # corresponding author