awesome-agentic-reasoning

Awesome Agentic Reasoning

Skill by ara.so — AI Agent Skills collection.

This skill provides expertise in navigating and utilizing the Awesome Agentic Reasoning repository — a comprehensive, curated collection of research papers and resources on agentic reasoning for Large Language Models (LLMs). The repository is based on the survey paper "Agentic Reasoning for Large Language Models: A Survey" and organizes cutting-edge research into foundational reasoning, self-evolving systems, and multi-agent collaboration.

What This Repository Provides

The Awesome Agentic Reasoning repository offers:

Categorized Research Papers: Organized by thematic areas including planning, tool use, search, self-evolution, multi-agent systems, and real-world applications
Benchmarks: Comprehensive lists of evaluation frameworks for agentic reasoning capabilities
Three-Layer Framework:
- Foundational Reasoning: Core single-agent abilities (planning, tool-use, search)
- Self-Evolving Reasoning: Adaptation through feedback, memory, and learning
- Collective Reasoning: Multi-agent coordination and collaborative intelligence
Application Domains: Math/coding agents, scientific discovery, embodied agents, healthcare, web exploration
Survey Materials: Slides and the comprehensive survey paper

Repository Structure

Awesome-Agentic-Reasoning/
├── README.md                          # Main curated list
├── CONTRIBUTING.md                    # Contribution guidelines
├── materials/                         # Survey slides and materials
│   └── Agentic Reasoning Survey Talk.pdf
└── figs/                             # Framework diagrams
    ├── overview.png
    └── planning.png

Navigating the Repository

Main Categories

The repository organizes papers into three primary layers:

1. Foundational Agentic Reasoning

Planning Reasoning:

In-context Planning (workflow design, tree search)
Post-training Planning (supervised fine-tuning, reinforcement learning)

Tool-Use Optimization:

In-context Tool-Use (API orchestration, workflow design)
Post-training Tool-Use (supervised learning, RL fine-tuning)

Agentic Search:

In-context Search (web navigation, knowledge retrieval)
Post-training Search (RL optimization)

2. Self-Evolving Agentic Reasoning

Agentic Feedback Mechanisms: Self-reflection, critique, and iterative refinement
Agentic Memory: Short-term and long-term memory systems
Evolving Foundational Capabilities: Continuous improvement of planning, tool-use, and search

3. Collective Multi-Agent Reasoning

Role Taxonomy: Debate, collaboration, hierarchical structures
Collaboration Patterns: Division of labor, coordination strategies
Multi-Agent Memory and Evolution: Shared knowledge, collective learning

Applications

The repository covers real-world applications:

💻 Math Exploration & Coding Agents
🔬 Scientific Discovery Agents
🤖 Embodied Agents
🏥 Healthcare & Medicine Agents
🌐 Autonomous Web Exploration & Research Agents

Benchmarks

Organized by:

Core Mechanisms: Tool Use, Search, Memory & Planning, Multi-Agent Systems
Application Domains: Embodied, Scientific Discovery, Medical, Web, General Tool-Use

Usage Patterns

Finding Papers on Specific Topics

Example 1: Finding Planning Papers

Navigate to the Planning Reasoning section to find papers on:

Workflow design approaches (ReAct, ReWOO, Plan-and-Solve)
Tree search methods (Tree of Thoughts, MCTS-based approaches)
Post-training planning optimization

Example 2: Multi-Agent System Research

The Collective Multi-Agent Reasoning section includes:

Role specialization papers
Collaboration frameworks
Multi-agent memory systems

Exploring Application Domains

Example: Embodied Agent Research

Check the Applications > Embodied Agents section
Cross-reference with Benchmarks > Embodied Agents for evaluation frameworks
Review foundational papers on planning and tool-use that apply to embodied settings

Finding Benchmarks

Example: Evaluating Tool-Use Capabilities

markdown

Tool Use Benchmarks

Navigate to: Benchmarks > Core Mechanisms > Tool Use

Key benchmarks include:

API-Bank: API selection and execution
ToolBench: Multi-tool orchestration
T-Eval: Tool learning evaluation

Contributing to the Repository

Adding New Papers

Create a pull request with papers organized by category:

markdown

| [Paper Title](https://arxiv.org/abs/XXXX.XXXXX) | Conference/Year |

Guidelines:

Place papers in the appropriate thematic section
Follow the existing table format
Include the full arXiv link or conference proceedings URL
Add the publication year or venue
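
The guidelines above can be checked mechanically before opening a pull request. The following is a minimal sketch, not part of the repository: `format_entry` and the `ARXIV_ABS` pattern are hypothetical helpers, and the row shape is taken from the template line shown earlier. It builds one table row and rejects arXiv links that are not full abstract URLs.

```python
import re

# Assumed shape of a full arXiv abstract link (hypothetical check, not a repo rule).
ARXIV_ABS = re.compile(r"^https://arxiv\.org/abs/\d{4}\.\d{4,5}(v\d+)?$")


def format_entry(title: str, url: str, venue: str) -> str:
    """Build one table row in the `| [Title](URL) | Venue |` shape."""
    title, url, venue = title.strip(), url.strip(), venue.strip()
    if not (title and url and venue):
        raise ValueError("title, url, and venue are all required")
    if "arxiv.org" in url and not ARXIV_ABS.match(url):
        raise ValueError(f"expected a full arXiv abstract link, got: {url}")
    # Escape pipes so the title cannot break the table column layout.
    title = title.replace("|", r"\|")
    return f"| [{title}]({url}) | {venue} |"


row = format_entry(
    "Agentic Reasoning for Large Language Models",
    "https://arxiv.org/abs/2601.12538",
    "arXiv 2026",
)
print(row)
```

Conference proceedings URLs pass through unchanged, since only links containing `arxiv.org` are validated against the pattern.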

Suggesting Resources

Open an issue to suggest:

New paper categories
Additional benchmarks
Application domains not yet covered
Survey materials or tutorials

Contact:

Email: twei10@illinois.edu, twli@illinois.edu, liu326@illinois.edu
GitHub Issues: For suggestions and discussions

Key Research Paradigms

In-Context Reasoning vs. Post-Training Reasoning

The repository distinguishes between two optimization approaches:

In-Context Reasoning:

Test-time scaling through structured orchestration
Adaptive workflows without parameter updates
Examples: ReAct, Tree of Thoughts, Chain-of-Thought prompting

Post-Training Reasoning:

Behavior optimization via RL and supervised fine-tuning
Parameter updates to internalize reasoning strategies
Examples: RLHF for tool-use, Q-learning for planning

Environmental Dynamics

Papers are organized by the environmental setting:

Static environments: Fixed tool sets, deterministic outcomes
Dynamic environments: Feedback loops, adaptation requirements
Multi-agent environments: Coordination, communication, emergent behavior

Working with Survey Materials

Accessing the Survey Paper

The foundational survey is available at:

arXiv: https://arxiv.org/abs/2601.12538
HuggingFace Papers: https://huggingface.co/papers/2601.12538

Using the Slides

Presentation materials are in materials/Agentic Reasoning Survey Talk.pdf:

Framework overview
Key insights from each reasoning layer
Application case studies
Future research directions

Common Patterns

Building a Research Bibliography

Pattern: Comprehensive Literature Review

python

# Sketch for extracting papers by category. Assumes a local clone and that each
# category name appears in a README.md heading; adjust path and names as needed.
import re
from pathlib import Path

README = Path("Awesome-Agentic-Reasoning/README.md")

categories = [
    "Planning Reasoning",
    "Tool-Use Optimization",
    "Agentic Search",
    "Multi-Agent Systems",
]

def extract_papers_from_section(category):
    """Return (title, url) pairs linked under the first heading naming `category`."""
    text = README.read_text(encoding="utf-8")
    section = re.search(
        rf"^#+[^\n]*{re.escape(category)}[^\n]*\n(.*?)(?=^#+\s|\Z)",
        text, re.M | re.S,
    )
    if section is None:
        return []
    return re.findall(r"\[([^\]]+)\]\((https?://[^)\s]+)\)", section.group(1))

papers_by_category = {}
for category in categories:
    # Collect the entries listed under the matching README section
    papers_by_category[category] = extract_papers_from_section(category)

# Generate BibTeX or a reading list from papers_by_category

Tracking New Research

Pattern: Monitoring Updates

The repository is actively maintained. To stay current:

Watch the repository for updates
Check the News section in README for announcements
Review recent commits for newly added papers
Subscribe to GitHub notifications
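
Reviewing recent commits can also be done from a local clone. The sketch below is an illustration under assumptions: the function names are hypothetical, the clone path is whatever directory you cloned into, and it relies only on standard `git log` options (`--since`, `--date`, `--pretty`) restricted to `README.md`.

```python
import subprocess
from pathlib import Path


def recent_readme_commits_cmd(since: str = "2 weeks ago") -> list:
    """Build a `git log` command listing commits that touched README.md."""
    return [
        "git", "log", f"--since={since}", "--date=short",
        "--pretty=format:%h %ad %s", "--", "README.md",
    ]


def recent_readme_commits(repo_dir: str, since: str = "2 weeks ago") -> list:
    """Run the command inside a local clone and return one line per commit."""
    if not (Path(repo_dir) / ".git").exists():
        raise FileNotFoundError(f"{repo_dir} is not a git checkout")
    result = subprocess.run(
        recent_readme_commits_cmd(since),
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()
```

Calling `recent_readme_commits("Awesome-Agentic-Reasoning", "1 month ago")` in the parent directory of a clone would return lines made of the short hash, date, and commit subject, which is usually enough to spot newly added papers.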

Cross-Referencing Applications and Benchmarks

Pattern: Application-Specific Research

For a specific application domain:

markdown

1. Identify application section (e.g., "Healthcare & Medicine Agents")
2. Review papers in that section
3. Navigate to corresponding benchmark section
4. Check foundational techniques used (planning, tool-use, etc.)
5. Trace back to foundational reasoning sections for core methods
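
The five steps above can be sketched as a small lookup. This is an illustration only: the paper and benchmark names are copied from this document's Quick Reference table, while `FOUNDATIONS` and `cross_reference` are hypothetical, and the Embodied mapping follows the embodied example given earlier (planning and tool-use).

```python
# (key papers, benchmarks) per category, copied from the Quick Reference table.
QUICK_REFERENCE = {
    "Planning": (["Tree of Thoughts", "ReAct", "Plan-and-Solve"], ["PlanBench", "BlocksWorld"]),
    "Tool-Use": (["Gorilla", "ToolLLM", "HuggingGPT"], ["API-Bank", "ToolBench"]),
    "Embodied": (["LM-Nav", "PERIA", "RT-1"], ["CALVIN", "MetaWorld"]),
}

# Hypothetical mapping from an application domain to the foundational
# techniques it draws on; extend it as you read the application papers.
FOUNDATIONS = {"Embodied": ["Planning", "Tool-Use"]}


def cross_reference(domain: str) -> dict:
    """Collect application papers, benchmarks, and foundational papers for a domain."""
    papers, benchmarks = QUICK_REFERENCE[domain]
    foundations = {
        technique: QUICK_REFERENCE[technique][0]
        for technique in FOUNDATIONS.get(domain, [])
    }
    return {"papers": papers, "benchmarks": benchmarks, "foundations": foundations}


plan = cross_reference("Embodied")
print(plan["benchmarks"])
print(plan["foundations"]["Planning"])
```

Keeping the application-to-technique mapping separate from the table data makes it easy to record which foundational sections to revisit for each domain.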

Citation

When using this repository in research or projects:

bibtex

@article{wei2026agentic,
  title={Agentic Reasoning for Large Language Models},
  author={Wei, Tianxin and Li, Ting-Wei and Liu, Zhining and Ning, Xuying and Yang, Ze and Zou, Jiaru and Zeng, Zhichen and Qiu, Ruizhong and Lin, Xiao and Fu, Dongqi and others},
  journal={arXiv preprint arXiv:2601.12538},
  year={2026}
}

Integration with Development Workflows

For Researchers

Literature Review Workflow:

Clone the repository for offline access
Use the categorized structure to identify relevant papers
Cross-reference applications with foundational techniques
Export citations for reference management tools

For Practitioners

Implementation Workflow:

Identify your application domain (e.g., web agents, coding)
Review application-specific papers and benchmarks
Trace foundational techniques (planning, tool-use)
Reference implementation papers for code patterns
Evaluate using suggested benchmarks

For Tool Builders

Benchmark Selection:

Determine core capability (planning, tool-use, search)
Navigate to corresponding benchmark section
Review evaluation frameworks and metrics
Compare agent performance across standard benchmarks

Best Practices

Exploring New Topics

Start with the Overview: Read the survey paper introduction and framework diagram
Navigate by Layer: Begin with foundational reasoning before advanced topics
Cross-Reference: Link application papers back to foundational techniques
Check Benchmarks: Understand evaluation standards for each capability

Contributing Quality Additions

Verify Relevance: Ensure papers fit the agentic reasoning scope
Check Duplicates: Search existing entries before adding
Provide Context: Include venue/year information
Follow Format: Maintain consistent table structure
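
The duplicate check in particular lends itself to a quick script. The following is a rough sketch under assumptions: `find_duplicates` and its helpers are hypothetical, entries are assumed to be markdown links as in the contribution template, and the sample text stands in for the real README.

```python
import re
from typing import Optional

LINK = re.compile(r"\[([^\]]+)\]\((https?://[^)\s]+)\)")


def normalize_title(title: str) -> str:
    """Lowercase and collapse punctuation so near-identical titles compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def arxiv_id(url: str) -> Optional[str]:
    """Extract an arXiv identifier from an abs or pdf link, if present."""
    match = re.search(r"arxiv\.org/(?:abs|pdf)/(\d{4}\.\d{4,5})", url)
    return match.group(1) if match else None


def find_duplicates(readme_text: str, title: str, url: str) -> list:
    """Return README lines that already list the same title or arXiv ID."""
    want_title, want_id = normalize_title(title), arxiv_id(url)
    hits = []
    for line in readme_text.splitlines():
        for link_title, link_url in LINK.findall(line):
            same_id = want_id is not None and arxiv_id(link_url) == want_id
            if normalize_title(link_title) == want_title or same_id:
                hits.append(line)
                break
    return hits


# Stand-in for the real README contents.
sample = "| [Agentic Reasoning for Large Language Models](https://arxiv.org/abs/2601.12538) | arXiv 2026 |"
print(find_duplicates(sample, "Some other title", "https://arxiv.org/pdf/2601.12538"))
```

Matching on the arXiv ID as well as the title catches entries that were added under a slightly different title or with a PDF link instead of the abstract page.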

Staying Current

Monitor Commits: The repository updates regularly with new papers
Check News Section: Major updates announced at the top of README
Watch Discussions: GitHub issues may highlight emerging trends
Follow Survey Updates: Authors plan continued improvements

Troubleshooting

Finding Specific Papers

Issue: Can't locate a specific paper

Solution:

Use browser search (Ctrl+F / Cmd+F) on the README
Check multiple related sections (papers may fit several categories)
Review the benchmarks section for evaluation-focused papers
Check recent commits if it's a new publication
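
When browser search is not convenient, for example on a cloned copy, the same lookup can be scripted. This is a minimal sketch, assuming the README uses `#`-style markdown headings; `search_readme` is a hypothetical helper and the sample text with `example.org` links is placeholder data, not real README content.

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def search_readme(text: str, keyword: str) -> list:
    """Return (nearest heading, matching line) pairs for a case-insensitive keyword."""
    needle = keyword.lower()
    current = "(no heading)"
    hits = []
    for line in text.splitlines():
        heading = HEADING.match(line)
        if heading:
            # Track the section so each hit reports where it was found.
            current = heading.group(2)
            continue
        if needle in line.lower():
            hits.append((current, line.strip()))
    return hits


# Placeholder README text for illustration.
sample = """# Awesome Agentic Reasoning
## Planning Reasoning
| [Tree of Thoughts](https://example.org/tot) | placeholder |
## Benchmarks
| [ToolBench](https://example.org/toolbench) | placeholder |
"""
for section, line in search_readme(sample, "toolbench"):
    print(f"{section}: {line}")
```

Reporting the enclosing heading with each hit shows at a glance whether a paper sits in a foundational, application, or benchmark section, which helps when an entry could fit several categories.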

Understanding Categories

Issue: Unclear which section contains relevant papers

Solution:

Refer to the framework overview diagram
Read the category descriptions in the survey paper
Cross-reference with similar known papers
Check application sections if domain-specific

Accessing Papers

Issue: Links not working or papers behind paywalls

Solution:

Most papers link to arXiv versions (open access)
For conference papers, search on Google Scholar
Check author websites for preprints
Use institutional access for published versions

Related Resources

Quick Reference

Category	Key Papers	Benchmarks
Planning	Tree of Thoughts, ReAct, Plan-and-Solve	PlanBench, BlocksWorld
Tool-Use	Gorilla, ToolLLM, HuggingGPT	API-Bank, ToolBench
Search	WebGPT, Agent-E, Mind2Web	WebArena, GAIA
Multi-Agent	ChatDev, AgentVerse, MetaGPT	MAgIC, AgentBench
Embodied	LM-Nav, PERIA, RT-1	CALVIN, MetaWorld
Scientific	FunSearch, AI Scientist	ScienceBench

This skill enables AI coding agents to effectively navigate and utilize the Awesome Agentic Reasoning repository, helping developers access cutting-edge research on LLM-based agents, understand agentic reasoning frameworks, and apply state-of-the-art techniques to their projects.
