awesome-agentic-reasoning

Awesome Agentic Reasoning

Skill by ara.so — AI Agent Skills collection.
This skill provides expertise in navigating and utilizing the Awesome Agentic Reasoning repository — a comprehensive, curated collection of research papers and resources on agentic reasoning for Large Language Models (LLMs). The repository is based on the survey paper "Agentic Reasoning for Large Language Models: A Survey" and organizes cutting-edge research into foundational reasoning, self-evolving systems, and multi-agent collaboration.

What This Repository Provides

The Awesome Agentic Reasoning repository offers:
  • Categorized Research Papers: Organized by thematic areas including planning, tool use, search, self-evolution, multi-agent systems, and real-world applications
  • Benchmarks: Comprehensive lists of evaluation frameworks for agentic reasoning capabilities
  • Three-Layer Framework:
    • Foundational Reasoning: Core single-agent abilities (planning, tool-use, search)
    • Self-Evolving Reasoning: Adaptation through feedback, memory, and learning
    • Collective Reasoning: Multi-agent coordination and collaborative intelligence
  • Application Domains: Math/coding agents, scientific discovery, embodied agents, healthcare, web exploration
  • Survey Materials: Slides and the comprehensive survey paper

Repository Structure

Awesome-Agentic-Reasoning/
├── README.md                          # Main curated list
├── CONTRIBUTING.md                    # Contribution guidelines
├── materials/                         # Survey slides and materials
│   └── Agentic Reasoning Survey Talk.pdf
└── figs/                             # Framework diagrams
    ├── overview.png
    └── planning.png

Navigating the Repository

Main Categories

The repository organizes papers into three primary layers:

1. Foundational Agentic Reasoning

Planning Reasoning:
  • In-context Planning (workflow design, tree search)
  • Post-training Planning (supervised fine-tuning, reinforcement learning)
Tool-Use Optimization:
  • In-context Tool-Use (API orchestration, workflow design)
  • Post-training Tool-Use (supervised learning, RL fine-tuning)
Agentic Search:
  • In-context Search (web navigation, knowledge retrieval)
  • Post-training Search (RL optimization)

2. Self-Evolving Agentic Reasoning

  • Agentic Feedback Mechanisms: Self-reflection, critique, and iterative refinement
  • Agentic Memory: Short-term and long-term memory systems
  • Evolving Foundational Capabilities: Continuous improvement of planning, tool-use, and search

3. Collective Multi-Agent Reasoning

  • Role Taxonomy: Debate, collaboration, hierarchical structures
  • Collaboration Patterns: Division of labor, coordination strategies
  • Multi-Agent Memory and Evolution: Shared knowledge, collective learning
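The three layers and their subsections listed above can be held in a small lookup structure when scripting against the list. The sketch below is only an illustration: the dictionary layout and the `find_layers` helper are assumptions made here, not something the repository ships.

```python
# Illustrative taxonomy built from the section names in this guide.
# TAXONOMY and find_layers are hypothetical names, not part of the repository.
TAXONOMY = {
    "Foundational Agentic Reasoning": [
        "Planning Reasoning",
        "Tool-Use Optimization",
        "Agentic Search",
    ],
    "Self-Evolving Agentic Reasoning": [
        "Agentic Feedback Mechanisms",
        "Agentic Memory",
        "Evolving Foundational Capabilities",
    ],
    "Collective Multi-Agent Reasoning": [
        "Role Taxonomy",
        "Collaboration Patterns",
        "Multi-Agent Memory and Evolution",
    ],
}


def find_layers(keyword: str) -> list[tuple[str, str]]:
    """Return (layer, topic) pairs whose topic name contains the keyword."""
    needle = keyword.lower()
    return [
        (layer, topic)
        for layer, topics in TAXONOMY.items()
        for topic in topics
        if needle in topic.lower()
    ]


# Memory appears in two layers, so both are worth checking.
print(find_layers("memory"))
```

A keyword that spans layers, such as memory, returns more than one pair, which mirrors the advice to check several related sections.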

Applications

The repository covers real-world applications:
  • 💻 Math Exploration & Coding Agents
  • 🔬 Scientific Discovery Agents
  • 🤖 Embodied Agents
  • 🏥 Healthcare & Medicine Agents
  • 🌐 Autonomous Web Exploration & Research Agents

Benchmarks

Organized by:
  • Core Mechanisms: Tool Use, Search, Memory & Planning, Multi-Agent Systems
  • Application Domains: Embodied, Scientific Discovery, Medical, Web, General Tool-Use

Usage Patterns

Finding Papers on Specific Topics

Example 1: Finding Planning Papers
Navigate to the Planning Reasoning section to find papers on:
  • Workflow design approaches (ReAct, ReWOO, Plan-and-Solve)
  • Tree search methods (Tree of Thoughts, MCTS-based approaches)
  • Post-training planning optimization
Example 2: Multi-Agent System Research
The Collective Multi-Agent Reasoning section includes:
  • Role specialization papers
  • Collaboration frameworks
  • Multi-agent memory systems

Exploring Application Domains

Example: Embodied Agent Research
  1. Check the Applications > Embodied Agents section
  2. Cross-reference with Benchmarks > Embodied Agents for evaluation frameworks
  3. Review foundational papers on planning and tool-use that apply to embodied settings

Finding Benchmarks

Example: Evaluating Tool-Use Capabilities
markdown
Tool Use Benchmarks

Navigate to: Benchmarks > Core Mechanisms > Tool Use
Key benchmarks include:
  • API-Bank: API selection and execution
  • ToolBench: Multi-tool orchestration
  • T-Eval: Tool learning evaluation

Contributing to the Repository

Adding New Papers

Create a pull request with papers organized by category:
markdown
| [Paper Title](https://arxiv.org/abs/XXXX.XXXXX) | Conference/Year |
Guidelines:
  • Place papers in the appropriate thematic section
  • Follow the existing table format
  • Include the full arXiv link or conference proceedings URL
  • Add the publication year or venue
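A small helper can keep new entries consistent with the row shape shown above. This is a minimal sketch that assumes the two-column `| [Title](URL) | Venue/Year |` layout from the example; `format_paper_row` is a hypothetical name, and the actual tables in the README may carry more columns.

```python
def format_paper_row(title: str, url: str, venue: str) -> str:
    """Build one table row in the `| [Title](URL) | Venue/Year |` shape."""
    # Escape pipes so a title cannot break the table layout.
    safe_title = title.strip().replace("|", "\\|")
    return f"| [{safe_title}]({url.strip()}) | {venue.strip()} |"


# Placeholder values only; substitute the real title, link, and venue.
print(format_paper_row("Paper Title", "https://arxiv.org/abs/XXXX.XXXXX", "Conference/Year"))
```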

Suggesting Resources

Open an issue to suggest:
  • New paper categories
  • Additional benchmarks
  • Application domains not yet covered
  • Survey materials or tutorials
Contact:
  • Email: twei10@illinois.edu, twli@illinois.edu, liu326@illinois.edu
  • GitHub Issues: For suggestions and discussions

Key Research Paradigms

In-Context Reasoning vs. Post-Training Reasoning

The repository distinguishes between two optimization approaches:
In-Context Reasoning:
  • Test-time scaling through structured orchestration
  • Adaptive workflows without parameter updates
  • Examples: ReAct, Tree of Thoughts, Chain-of-Thought prompting
Post-Training Reasoning:
  • Behavior optimization via RL and supervised fine-tuning
  • Parameter updates to internalize reasoning strategies
  • Examples: RLHF for tool-use, Q-learning for planning

Environmental Dynamics

Papers are organized by the environmental setting:
  • Static environments: Fixed tool sets, deterministic outcomes
  • Dynamic environments: Feedback loops, adaptation requirements
  • Multi-agent environments: Coordination, communication, emergent behavior

Working with Survey Materials

Accessing the Survey Paper

The foundational survey is available on arXiv as arXiv:2601.12538 (the identifier given in the Citation section).

Using the Slides

Presentation materials are in materials/Agentic Reasoning Survey Talk.pdf:
  • Framework overview
  • Key insights from each reasoning layer
  • Application case studies
  • Future research directions

Common Patterns

Building a Research Bibliography

Pattern: Comprehensive Literature Review
python
# Pseudo-code for extracting papers by category
categories = [
    "Planning Reasoning",
    "Tool-Use Optimization",
    "Agentic Search",
    "Multi-Agent Systems",
]
papers_by_category = {}
for category in categories:
    # Navigate to README section
    # extract_papers_from_section is a placeholder for your own parser
    papers = extract_papers_from_section(category)
    papers_by_category[category] = papers

# Generate BibTeX or reading list
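The pseudo-code above leaves `extract_papers_from_section` undefined. One way it might be filled in is sketched below, under two assumptions that should be checked against the real README: sections are introduced by Markdown `#` headings, and each paper is a table row of the form `| [Title](URL) | Venue |`. The function name, the regular expressions, and the sample text are all illustrative.

```python
import re

# Assumed row shape: | [Title](URL) | Venue |
ROW = re.compile(
    r"^\|\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)]+)\)\s*\|\s*(?P<venue>[^|]*?)\s*\|\s*$"
)
# Assumed section marker: one or more '#' followed by the section name.
HEADING = re.compile(r"^#+\s*(?P<name>.+?)\s*$")


def extract_papers_by_section(readme_text: str) -> dict[str, list[dict[str, str]]]:
    """Group paper rows under the most recent heading seen above them."""
    papers: dict[str, list[dict[str, str]]] = {}
    current = None
    for line in readme_text.splitlines():
        heading = HEADING.match(line)
        if heading:
            current = heading.group("name")
            continue
        row = ROW.match(line.strip())
        if row and current is not None:
            papers.setdefault(current, []).append(row.groupdict())
    return papers


# Hypothetical README fragment with placeholder entries.
SAMPLE = """\
## Planning Reasoning
| Paper | Venue |
|---|---|
| [Hypothetical Planning Paper](https://arxiv.org/abs/XXXX.XXXXX) | Conference/Year |

## Agentic Search
| [Hypothetical Search Paper](https://arxiv.org/abs/YYYY.YYYYY) | Conference/Year |
"""

for section, rows in extract_papers_by_section(SAMPLE).items():
    print(section, [row["title"] for row in rows])
```

Header and separator rows are skipped because they contain no Markdown link. Sections without any matching rows are simply absent from the result.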

Tracking New Research

Pattern: Monitoring Updates
The repository is actively maintained. To stay current:
  1. Watch the repository for updates
  2. Check the News section in README for announcements
  3. Review recent commits for newly added papers
  4. Subscribe to GitHub notifications
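If you keep an older copy of the README, newly added papers can also be spotted by comparing two snapshots. The sketch below assumes paper entries are table rows that begin with `| [`; `new_paper_rows` and the sample snapshots are hypothetical.

```python
def new_paper_rows(old_readme: str, new_readme: str) -> list[str]:
    """Return linked table rows that appear only in the newer snapshot."""

    def paper_rows(text: str) -> list[str]:
        # Keep only rows whose first cell starts with a Markdown link.
        return [ln.strip() for ln in text.splitlines() if ln.strip().startswith("| [")]

    seen = set(paper_rows(old_readme))
    return [row for row in paper_rows(new_readme) if row not in seen]


# Placeholder snapshots; in practice these would be two saved README files.
OLD = "| [Paper A](https://arxiv.org/abs/XXXX.XXXXX) | Conference/Year |\n"
NEW = OLD + "| [Paper B](https://arxiv.org/abs/YYYY.YYYYY) | Conference/Year |\n"

print(new_paper_rows(OLD, NEW))
```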

Cross-Referencing Applications and Benchmarks

Pattern: Application-Specific Research
For a specific application domain:
markdown
1. Identify application section (e.g., "Healthcare & Medicine Agents")
2. Review papers in that section
3. Navigate to corresponding benchmark section
4. Check foundational techniques used (planning, tool-use, etc.)
5. Trace back to foundational reasoning sections for core methods
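The steps above can be sketched as a lookup from an application section to the benchmark domain that seems to match it. The pairing below is an assumption inferred from the section names in this guide, not a mapping the repository defines, and the `Benchmarks > Application Domains > ...` path format is likewise illustrative.

```python
# Hypothetical pairing of application sections with benchmark domains.
APPLICATION_TO_BENCHMARK = {
    "Math Exploration & Coding Agents": None,  # no matching benchmark domain is named in this guide
    "Scientific Discovery Agents": "Scientific Discovery",
    "Embodied Agents": "Embodied",
    "Healthcare & Medicine Agents": "Medical",
    "Autonomous Web Exploration & Research Agents": "Web",
}


def research_path(application: str) -> list[str]:
    """List the sections to visit, in order, for one application domain."""
    benchmark = APPLICATION_TO_BENCHMARK[application]  # KeyError for unknown names
    path = [f"Applications > {application}"]
    if benchmark is not None:
        path.append(f"Benchmarks > Application Domains > {benchmark}")
    path.append(
        "Foundational Agentic Reasoning > Planning Reasoning, "
        "Tool-Use Optimization, Agentic Search"
    )
    return path


for step in research_path("Healthcare & Medicine Agents"):
    print(step)
```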

Citation

When using this repository in research or projects:
bibtex
@article{wei2026agentic,
  title={Agentic Reasoning for Large Language Models},
  author={Wei, Tianxin and Li, Ting-Wei and Liu, Zhining and Ning, Xuying and Yang, Ze and Zou, Jiaru and Zeng, Zhichen and Qiu, Ruizhong and Lin, Xiao and Fu, Dongqi and others},
  journal={arXiv preprint arXiv:2601.12538},
  year={2026}
}

Integration with Development Workflows

For Researchers

Literature Review Workflow:
  1. Clone the repository for offline access
  2. Use the categorized structure to identify relevant papers
  3. Cross-reference applications with foundational techniques
  4. Export citations for reference management tools
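For the export step, a rough BibTeX stub can be generated from a title, link, and year. This is a minimal sketch: `to_bibtex` is a hypothetical helper, the key scheme (first word plus year) is arbitrary, and author names are not available from a table row, so they must be added by hand or fetched elsewhere.

```python
import re


def to_bibtex(title: str, url: str, year: str) -> str:
    """Render a minimal @misc entry; authors are intentionally left out."""
    key = re.sub(r"\W+", "", title.split()[0].lower()) + year
    fields = {"title": title, "year": year, "url": url}
    # Record the arXiv identifier when the link is an arXiv abstract page.
    match = re.search(r"arxiv\.org/abs/([\w.]+)", url)
    if match:
        fields["eprint"] = match.group(1)
        fields["archivePrefix"] = "arXiv"
    body = ",\n".join(f"  {name}={{{value}}}" for name, value in fields.items())
    return f"@misc{{{key},\n{body}\n}}"


# Uses the survey's own title and arXiv identifier from the Citation section.
print(to_bibtex(
    "Agentic Reasoning for Large Language Models",
    "https://arxiv.org/abs/2601.12538",
    "2026",
))
```

Most reference managers accept an entry like this and let you complete the missing fields after import.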

For Practitioners

Implementation Workflow:
  1. Identify your application domain (e.g., web agents, coding)
  2. Review application-specific papers and benchmarks
  3. Trace foundational techniques (planning, tool-use)
  4. Reference implementation papers for code patterns
  5. Evaluate using suggested benchmarks

For Tool Builders

Benchmark Selection:
  1. Determine core capability (planning, tool-use, search)
  2. Navigate to corresponding benchmark section
  3. Review evaluation frameworks and metrics
  4. Compare agent performance across standard benchmarks
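The selection steps above amount to a lookup from capability to candidate benchmarks. The sketch below uses only benchmark names that appear elsewhere in this guide; the dictionary and the `select_benchmarks` helper are illustrative, and the README's own benchmark tables remain the authoritative list.

```python
# Benchmark names taken from this guide; the grouping keys are an assumption.
BENCHMARKS_BY_CAPABILITY = {
    "planning": ["PlanBench", "BlocksWorld"],
    "tool-use": ["API-Bank", "ToolBench", "T-Eval"],
    "search": ["WebArena", "GAIA"],
}


def select_benchmarks(capability: str) -> list[str]:
    """Return candidate benchmarks for a core capability name."""
    key = capability.strip().lower().replace(" ", "-")
    if key not in BENCHMARKS_BY_CAPABILITY:
        known = ", ".join(sorted(BENCHMARKS_BY_CAPABILITY))
        raise KeyError(f"unknown capability {capability!r}; expected one of: {known}")
    return list(BENCHMARKS_BY_CAPABILITY[key])


print(select_benchmarks("Tool Use"))
```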

Best Practices

Exploring New Topics

  1. Start with the Overview: Read the survey paper introduction and framework diagram
  2. Navigate by Layer: Begin with foundational reasoning before advanced topics
  3. Cross-Reference: Link application papers back to foundational techniques
  4. Check Benchmarks: Understand evaluation standards for each capability

Contributing Quality Additions

  1. Verify Relevance: Ensure papers fit the agentic reasoning scope
  2. Check Duplicates: Search existing entries before adding
  3. Provide Context: Include venue/year information
  4. Follow Format: Maintain consistent table structure
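The duplicate check can be scripted before opening a pull request. The sketch below compares a candidate against existing `(title, url)` pairs after light normalization; `is_duplicate` and the sample entries are hypothetical, and the normalization rules are a guess at what counts as the same paper.

```python
import re


def normalize_title(title: str) -> str:
    """Lowercase and collapse punctuation so minor formatting differences match."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def is_duplicate(title: str, url: str, existing: list[tuple[str, str]]) -> bool:
    """True if the title or the link already appears among the existing entries."""
    wanted_title = normalize_title(title)
    wanted_url = url.rstrip("/").lower()
    return any(
        normalize_title(old_title) == wanted_title
        or old_url.rstrip("/").lower() == wanted_url
        for old_title, old_url in existing
    )


# Placeholder entries standing in for rows already in the README.
EXISTING = [("Hypothetical Paper: A Study", "https://arxiv.org/abs/XXXX.XXXXX")]

print(is_duplicate("hypothetical paper - a study", "https://example.org/other", EXISTING))
```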

Staying Current

  1. Monitor Commits: The repository updates regularly with new papers
  2. Check News Section: Major updates announced at the top of README
  3. Watch Discussions: GitHub issues may highlight emerging trends
  4. Follow Survey Updates: Authors plan continued improvements

Troubleshooting

Finding Specific Papers

Issue: Can't locate a specific paper
Solution:
  • Use browser search (Ctrl+F / Cmd+F) on the README
  • Check multiple related sections (papers may fit several categories)
  • Review the benchmarks section for evaluation-focused papers
  • Check recent commits if it's a new publication
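On a local clone, the browser search can be reproduced with a few lines of Python. This is a minimal sketch; `search_readme` is a hypothetical helper and the sample text stands in for the contents of README.md.

```python
def search_readme(readme_text: str, keyword: str) -> list[tuple[int, str]]:
    """Return (line number, line) pairs that contain the keyword, ignoring case."""
    needle = keyword.casefold()
    return [
        (number, line.strip())
        for number, line in enumerate(readme_text.splitlines(), start=1)
        if needle in line.casefold()
    ]


# Placeholder text; in practice, read README.md from the cloned repository.
SAMPLE_README = "Planning Reasoning\n| [Hypothetical Paper](https://arxiv.org/abs/XXXX.XXXXX) | Conference/Year |\n"

for number, line in search_readme(SAMPLE_README, "hypothetical"):
    print(number, line)
```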

Understanding Categories

Issue: Unclear which section contains relevant papers
Solution:
  • Refer to the framework overview diagram
  • Read the category descriptions in the survey paper
  • Cross-reference with similar known papers
  • Check application sections if domain-specific

Accessing Papers

Issue: Links not working or papers behind paywalls
Solution:
  • Most papers link to arXiv versions (open access)
  • For conference papers, search on Google Scholar
  • Check author websites for preprints
  • Use institutional access for published versions

Related Resources

Quick Reference

| Category | Key Papers | Benchmarks |
|---|---|---|
| Planning | Tree of Thoughts, ReAct, Plan-and-Solve | PlanBench, BlocksWorld |
| Tool-Use | Gorilla, ToolLLM, HuggingGPT | API-Bank, ToolBench |
| Search | WebGPT, Agent-E, Mind2Web | WebArena, GAIA |
| Multi-Agent | ChatDev, AgentVerse, MetaGPT | MAgIC, AgentBench |
| Embodied | LM-Nav, PERIA, RT-1 | CALVIN, MetaWorld |
| Scientific | FunSearch, AI Scientist | ScienceBench |

This skill enables AI coding agents to effectively navigate and utilize the Awesome Agentic Reasoning repository, helping developers access cutting-edge research on LLM-based agents, understand agentic reasoning frameworks, and apply state-of-the-art techniques to their projects.