photo-agents-autonomous-llm

Photo Agents Autonomous LLM Skill

Skill by ara.so — AI Agent Skills collection.

Overview

Photo Agents is a Python framework for building autonomous, self-evolving AI agents that ground their understanding in visual observations of the screen. Unlike traditional text-only agents, Photo Agents implements a perceive → reason → act cycle with a layered memory architecture inspired by biological cognition: vision input, bounded observations stored in layers (L1-L4), and skills the agent writes from real successes.

Key capabilities:

Multi-provider LLM routing (Anthropic Claude, OpenAI GPT, failover sessions)
Layered memory system (working/global/SOP/session archive)
Physical execution tools (file I/O, sandboxed code, browser automation via Chrome DevTools Protocol)
Multiple client interfaces (CLI, Streamlit web app, PyQt desktop, chat platform bots)
Self-evolving through reflection and skill generation

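The perceive → reason → act cycle described above can be pictured as a small loop. This is only an illustrative sketch, not Photo Agents' implementation: `run_cycle`, the three callables, and the `"done"` stop signal are hypothetical names, and the bounded `deque` merely stands in for the idea of keeping a bounded set of observations.

```python
from collections import deque
from typing import Callable, Deque, List


def run_cycle(
    perceive: Callable[[], str],
    reason: Callable[[List[str]], str],
    act: Callable[[str], None],
    max_turns: int = 10,
    max_observations: int = 3,
) -> int:
    """Run a perceive -> reason -> act loop and return the number of turns used."""
    # Only the most recent observations are kept (hypothetical bound).
    observations: Deque[str] = deque(maxlen=max_observations)
    for turn in range(1, max_turns + 1):
        observations.append(perceive())          # perceive: e.g. describe a screenshot
        decision = reason(list(observations))    # reason: e.g. ask the LLM what to do
        if decision == "done":                   # hypothetical stop signal
            return turn
        act(decision)                            # act: e.g. click, type, run code
    return max_turns
```

Passing the three stages in as callables keeps the loop itself trivial to test with canned observations.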

Installation

Basic Installation

```bash
pip install photoagents
```

Full Installation with All Clients

```bash
pip install "photoagents[all]"
```

Requirements: Python 3.10+

API Key Setup

Photo Agents requires a license key, which is validated against `https://photo-agents.com/v1/keys/validate`.

Get your key at: https://photo-agents.com/dashboard/keys

Configure it using one of the following methods:

Environment variable:

```bash
export PHOTOAGENTS_API_KEY=pk_live_your_key_here
```

Config file (`~/.photoagents/config.json`):

```json
{
  "api_key": "pk_live_your_key_here"
}
```

Interactive prompt: Run any command and it will prompt you to enter and save the key.
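To check by hand which key would be picked up, a lookup along the following lines may help. It is a sketch under assumptions: the function name `resolve_api_key` is hypothetical, and the precedence (environment variable first, then the config file) is a guess, since the text above lists the methods as alternatives without ranking them.

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_api_key(
    env: Optional[Mapping[str, str]] = None,
    config_path: Optional[Path] = None,
) -> Optional[str]:
    """Return the key from the environment, else from the config file, else None."""
    env = os.environ if env is None else env
    if config_path is None:
        config_path = Path.home() / ".photoagents" / "config.json"

    # Assumed precedence: the environment variable wins over the file.
    key = env.get("PHOTOAGENTS_API_KEY")
    if key:
        return key

    try:
        data = json.loads(config_path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return None  # missing, unreadable, or malformed config file
    if isinstance(data, dict):
        return data.get("api_key") or None
    return None
```

The sketch only reads local state; it does not contact the validation endpoint.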

LLM Provider Configuration

Create a `credentials.py` file in your project root:

```python
# credentials.py
from photoagents.config.keys_template import LLMConfig, ProviderConfig

# The options below are alternatives: keep only one llm_config definition.

# Option 1: Anthropic Claude
llm_config = LLMConfig(
    primary=ProviderConfig(
        provider="anthropic",
        api_key="${ANTHROPIC_API_KEY}",  # Use env var
        model="claude-3-5-sonnet-20241022",
    )
)

# Option 2: OpenAI GPT
llm_config = LLMConfig(
    primary=ProviderConfig(
        provider="openai",
        api_key="${OPENAI_API_KEY}",
        model="gpt-4o",
    )
)

# Option 3: Failover configuration
llm_config = LLMConfig(
    primary=ProviderConfig(
        provider="anthropic",
        api_key="${ANTHROPIC_API_KEY}",
        model="claude-3-5-sonnet-20241022",
    ),
    fallback=ProviderConfig(
        provider="openai",
        api_key="${OPENAI_API_KEY}",
        model="gpt-4o",
    ),
)
```


Or use JSON format (`credentials.json`):

```json
{
  "primary": {
    "provider": "anthropic",
    "api_key": "${ANTHROPIC_API_KEY}",
    "model": "claude-3-5-sonnet-20241022"
  },
  "fallback": {
    "provider": "openai",
    "api_key": "${OPENAI_API_KEY}",
    "model": "gpt-4o"
  }
}
```
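The `${ANTHROPIC_API_KEY}`-style values in the configuration are placeholders for environment variables. How Photo Agents expands them is not documented here, so the following is only a sketch of one plausible approach, with the hypothetical helper `expand_placeholders` substituting `${NAME}` in every string of a parsed config.

```python
import json
import re
from typing import Any, Mapping

# Matches ${NAME} where NAME is a typical environment variable name.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_placeholders(value: Any, env: Mapping[str, str]) -> Any:
    """Recursively replace ${NAME} in strings with env[NAME]; KeyError if NAME is unset."""
    if isinstance(value, str):
        return _PLACEHOLDER.sub(lambda match: env[match.group(1)], value)
    if isinstance(value, dict):
        return {key: expand_placeholders(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_placeholders(item, env) for item in value]
    return value  # numbers, booleans and None pass through unchanged
```

Failing loudly on an unset variable is a deliberate choice in this sketch: a missing key surfaces at load time rather than as an authentication error later.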

Core Usage Patterns

1. Interactive CLI Mode

```bash
# Start interactive REPL
python -m photoagents

# The agent will prompt for tasks and execute them
# with vision-grounded reasoning
```

2. One-Shot Task Execution

```bash
# Execute a single task
python -m photoagents --task my_analysis --input "Analyze the largest files in this directory"

# With custom output path
python -m photoagents --task report --input "Generate system report" --output ./reports/
```

3. Reflection/Watchdog Mode

```bash
# Run with reflection scheduler (self-evolving)
python -m photoagents --reflect photoagents/evolution/scheduler.py
```

4. Programmatic Agent Session

```python
from photoagents.core.loop import run_agent_session
from photoagents.llm.router import LLMSession
from photoagents.config.keys_template import LLMConfig, ProviderConfig

# Configure LLM
llm_config = LLMConfig(
    primary=ProviderConfig(
        provider="anthropic",
        api_key="${ANTHROPIC_API_KEY}",
        model="claude-3-5-sonnet-20241022",
    )
)

# Create session
session = LLMSession(llm_config)

# Run agent loop
result = run_agent_session(
    task_name="file_analysis",
    user_input="Find and summarize all Python files in the current directory",
    session=session,
    max_turns=10,
)

print(f"Final output: {result}")
```
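When a session is built from a configuration that has both a `primary` and a `fallback` provider, requests are presumably routed to the fallback once the primary fails. The sketch below illustrates that general failover idea only; `complete_with_failover` and the `(name, send)` pairs are hypothetical and say nothing about how `LLMSession` is actually implemented.

```python
from typing import Callable, Optional, Sequence, Tuple

# (provider name, function that sends a prompt and returns the reply)
Provider = Tuple[str, Callable[[str], str]]


def complete_with_failover(prompt: str, providers: Sequence[Provider]) -> Tuple[str, str]:
    """Try providers in order; return (provider_name, reply) from the first success."""
    last_error: Optional[Exception] = None
    for name, send in providers:
        try:
            return name, send(prompt)
        except Exception as exc:  # a real router would catch narrower error types
            last_error = exc
    # Every provider failed: surface the last underlying error as the cause.
    raise RuntimeError("all providers failed") from last_error
```

Returning the provider name alongside the reply makes it easy to log which backend actually served a request.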

5. Custom Tool Integration

```python
from photoagents.core.tool_dispatcher import register_tool
from typing import Dict, Any

@register_tool
def custom_analysis_tool(data: str, options: Dict[str, Any]) -> str:
    """
    Custom tool for specialized analysis.

    Args:
        data: Input data to analyze
        options: Configuration options

    Returns:
        Analysis results
    """
    # Your custom logic here
    result = f"Analyzed: {data} with options {options}"
    return result

# Tool is now available to the agent
```
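For readers wondering what a registration decorator of this kind typically does, here is a minimal, self-contained sketch. It is not the code behind `register_tool`: the names `TOOLS`, `register_tool_sketch` and `dispatch` are hypothetical, and the real dispatcher may work quite differently.

```python
import inspect
from typing import Any, Callable, Dict

# Hypothetical in-memory registry: tool name -> callable.
TOOLS: Dict[str, Callable[..., Any]] = {}


def register_tool_sketch(func: Callable[..., Any]) -> Callable[..., Any]:
    """Record func under its own name and return it unchanged."""
    TOOLS[func.__name__] = func
    return func


def dispatch(name: str, arguments: Dict[str, Any]) -> Any:
    """Call a registered tool by name, rejecting unknown tools and bad arguments."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    func = TOOLS[name]
    inspect.signature(func).bind(**arguments)  # raises TypeError on a mismatch
    return func(**arguments)


@register_tool_sketch
def word_count(text: str) -> int:
    """Example tool: count whitespace-separated words."""
    return len(text.split())
```

Checking the arguments against the function signature before the call gives a clear error when a model-produced tool call uses a wrong parameter name.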

GUI Client Options

Streamlit Web App + WebView

```bash
# Launch web interface with native window
pythonw -m photoagents.cli.launcher
```

Service Hub (Start/Stop Services)

```bash
# Launch control hub
pythonw -m photoagents.cli.hub
```

Desktop PyQt Application

```bash
python -m photoagents.clients.desktop_app
```

Desktop Companion

```bash
pythonw -m photoagents.clients.companion_v2
```

Chat Platform Bots

```bash
# Telegram
python -m photoagents.clients.telegram_client

# Feishu (Lark)
python -m photoagents.clients.feishu_client

# WeCom
python -m photoagents.clients.wecom_client

# DingTalk
python -m photoagents.clients.dingtalk_client

# QQ
python -m photoagents.clients.qq_client
```

Layered Memory System

Photo Agents uses a 4-layer memory architecture:

L1: Working Memory

Short-term context for the current task (conversation turns, immediate observations).

L2: Global Memory

Long-term facts stored in `~/.photoagents/global_mem.txt`.

```python
from photoagents.core.memory import add_global_fact, search_global_memory

# Add a fact
add_global_fact("Project uses Python 3.11 and requires PostgreSQL 14+")

# Search memory
results = search_global_memory("database requirements")
```

L3: Skills & SOPs

Standard Operating Procedures the agent writes from successful executions.

```python
from photoagents.skills.skill_manager import save_skill, load_skill

# Save a new skill
save_skill(
    name="web_scraping_pattern",
    code="""
def scrape_structured_data(url: str) -> dict:
    # Implementation
    pass
""",
    description="Reliable pattern for scraping structured web data",
)

# Load and use
skill = load_skill("web_scraping_pattern")
```

L4: Session Archive

Full raw session logs in `~/.photoagents/sessions/`.
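As a mental model for the L2 layer, a plain-text fact store can be as simple as one fact per line plus a search over those lines. The sketch below assumes exactly that and nothing more: `add_fact` and `search_facts` are hypothetical stand-ins, and the naive keyword ranking shown here is almost certainly cruder than whatever `search_global_memory` does (a query such as "database requirements" would need semantic matching to find a fact about PostgreSQL).

```python
from pathlib import Path
from typing import List


def add_fact(path: Path, fact: str) -> None:
    """Append one fact per line, collapsing internal whitespace and newlines."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as handle:
        handle.write(" ".join(fact.split()) + "\n")


def search_facts(path: Path, query: str, limit: int = 5) -> List[str]:
    """Rank stored facts by how many query words they contain (case-insensitive)."""
    if not path.exists():
        return []
    words = {word.lower() for word in query.split()}
    scored = []
    for line in path.read_text(encoding="utf-8").splitlines():
        hits = sum(1 for word in words if word in line.lower())
        if hits:
            scored.append((hits, line))
    # Stable sort: facts with equal scores keep their insertion order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [line for _, line in scored[:limit]]
```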

Browser Automation with CDP

Photo Agents includes Chrome DevTools Protocol integration for browser control:

```python
from photoagents.web.cdp_bridge import CDPBridge

async def automate_browser():
    async with CDPBridge() as browser:
        # Navigate
        await browser.navigate("https://example.com")

        # Take screenshot
        screenshot = await browser.screenshot()

        # Execute JavaScript
        result = await browser.evaluate("document.title")

        # Click element
        await browser.click("button.submit")

        # Fill form
        await browser.type("input[name='query']", "search term")

    return result
```

Vision-Grounded Operations

Screenshot Analysis

```python
from photoagents.skills.vision import analyze_screenshot

# Agent automatically captures and analyzes screen
analysis = analyze_screenshot(
    region=(0, 0, 1920, 1080),  # x, y, width, height
    question="What UI elements are visible?",
)
```

OCR Text Extraction

```python
from photoagents.skills.ocr import extract_text_from_region

# Extract text from screen region
text = extract_text_from_region(
    x=100,
    y=200,
    width=500,
    height=300,
)
```

Sandboxed Code Execution

```python
from photoagents.core.sandbox import execute_code

# Python execution
result = execute_code(
    code="""
import json
data = {"status": "success"}
print(json.dumps(data))
""",
    language="python",
    timeout=30,
)

# PowerShell (Windows)
ps_result = execute_code(
    code="Get-Process | Select-Object -First 5",
    language="powershell",
)

# Bash (Linux/Mac)
bash_result = execute_code(
    code="ls -la | head -n 10",
    language="bash",
)
```
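To make the `timeout` idea concrete, the following sketch runs a Python snippet in a separate interpreter with a time limit, using only the standard library. It is not how `execute_code` is implemented, and a child process with a timeout is not a security sandbox; `run_python_snippet` is a hypothetical helper for illustration.

```python
import subprocess
import sys
from typing import Tuple


def run_python_snippet(code: str, timeout: float = 30.0) -> Tuple[int, str, str]:
    """Run code in a separate interpreter; return (exit_code, stdout, stderr)."""
    try:
        proc = subprocess.run(
            # -I: isolated mode, ignores PYTHON* environment variables and user site
            [sys.executable, "-I", "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising TimeoutExpired
        return -1, "", f"timed out after {timeout} seconds"
    return proc.returncode, proc.stdout, proc.stderr
```

Real isolation would additionally need restrictions on the filesystem, network and resources, which this sketch does not attempt.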

File I/O Operations

```python
from photoagents.core.file_ops import read_file, write_file, list_directory

# Read file
content = read_file("~/project/config.json")

# Write file
write_file("~/output/report.txt", "Analysis complete\n")

# List directory with filters
files = list_directory(
    path="~/project",
    pattern="*.py",
    recursive=True,
)
```

Observability with Langfuse

```python
from photoagents.integrations.langfuse_tracer import init_langfuse, trace_agent_step

# Initialize
tracer = init_langfuse(
    public_key="${LANGFUSE_PUBLIC_KEY}",
    secret_key="${LANGFUSE_SECRET_KEY}",
    host="https://cloud.langfuse.com",
)

# Trace agent steps
with trace_agent_step("file_analysis", metadata={"task": "analyze_logs"}):
    # Agent operations here
    pass
```

Configuration Files

On-Disk State Locations

| Path | Purpose |
| --- | --- |
| `~/.photoagents/config.json` | API key + license validation cache |
| `~/.photoagents/global_mem.txt` | L2 long-term facts |
| `~/.photoagents/sessions/` | L4 raw session archives |
| `~/.photoagents/skill_index/` | Vector index for skill/SOP search |
| `~/.photoagents/temp/` | Per-task scratch (logs, intermediate output) |

Custom System Prompt

Override the default system prompt:

```python
from photoagents.core.loop import run_agent_session

custom_prompt = """
You are a specialized data analysis agent.
Focus on: statistical analysis, visualization, and reporting.
Always verify data integrity before processing.
"""

result = run_agent_session(
    task_name="analysis",
    user_input="Analyze sales data",
    system_prompt_override=custom_prompt
)
```

Common Patterns

Pattern 1: Autonomous Research Agent

```python
from photoagents.core.loop import run_agent_session
from photoagents.llm.router import LLMSession

def create_research_agent(topic: str):
    session = LLMSession.from_env()

    result = run_agent_session(
        task_name=f"research_{topic}",
        user_input=f"""
        Research {topic} and create a comprehensive report:
        1. Search for recent information
        2. Analyze credibility of sources
        3. Synthesize findings
        4. Save report with citations
        """,
        session=session,
        max_turns=50
    )

    return result

# Use it
report = create_research_agent("quantum computing advances 2026")
```

Pattern 2: Self-Evolving Monitor

```python
# monitor.py
import os
import time

from photoagents.evolution.scheduler import schedule_check

# Marker file whose modification time records the last backup.
# "~" must be expanded explicitly; os.path.getmtime does not do it.
LAST_BACKUP = os.path.expanduser("~/.photoagents/last_backup")


def check() -> bool:
    """
    Watchdog function that triggers agent tasks.
    Return True to execute a task.
    """
    # No marker yet means no backup has been recorded: run one now
    if not os.path.exists(LAST_BACKUP):
        return True

    # Check if it's time to run daily backup
    last_run = os.path.getmtime(LAST_BACKUP)
    return time.time() - last_run > 86400  # 24 hours


def get_task() -> str:
    """Return the task to execute when check() returns True."""
    return "Backup all project files to ~/backups/ and verify integrity"

# Run with:
# python -m photoagents --reflect monitor.py
```
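The contract implied by a watchdog module (a `check()` that says when to act and a `get_task()` that says what to do) can be exercised with a small polling loop. The sketch below shows that pattern in isolation; `poll_watchdog`, its parameters, and the polling interval are hypothetical and do not describe how the `--reflect` scheduler is built.

```python
import time
from typing import Callable, Optional


def poll_watchdog(
    check: Callable[[], bool],
    get_task: Callable[[], str],
    run_task: Callable[[str], None],
    interval_seconds: float = 60.0,
    max_polls: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll check(); whenever it returns True, run the task from get_task().

    Returns the number of tasks that were run. With max_polls=None the loop
    runs until interrupted.
    """
    polls = 0
    tasks_run = 0
    while max_polls is None or polls < max_polls:
        if check():
            run_task(get_task())
            tasks_run += 1
        polls += 1
        sleep(interval_seconds)
    return tasks_run
```

Injecting `sleep` and `max_polls` keeps the loop testable without real waiting.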

Pattern 3: Multi-Step Workflow

```python
from photoagents.core.loop import run_agent_session
from photoagents.core.memory import add_global_fact

def execute_workflow(project_path: str):
    # Step 1: Analyze codebase
    analysis = run_agent_session(
        task_name="code_analysis",
        user_input=f"Analyze Python code structure in {project_path}"
    )

    # Save insight to global memory
    add_global_fact(f"Project at {project_path}: {analysis}")

    # Step 2: Generate documentation
    docs = run_agent_session(
        task_name="generate_docs",
        user_input=f"Create API documentation for {project_path}"
    )

    # Step 3: Run tests
    tests = run_agent_session(
        task_name="run_tests",
        user_input="Execute test suite and report coverage"
    )

    return {
        "analysis": analysis,
        "documentation": docs,
        "tests": tests
    }
```

Troubleshooting

API Key Issues

Problem: `PhotoAgentsAuthError: Invalid or missing API key`

Solution:

```bash
# Verify key is set
echo $PHOTOAGENTS_API_KEY

# Or check config file
cat ~/.photoagents/config.json

# Clear cache if key was recently updated
# (this file also holds any key saved via the config file or the interactive prompt)
rm ~/.photoagents/config.json
```

LLM Provider Errors

Problem: `LLM provider authentication failed`

Solution:

```bash
# Verify environment variables are set
echo $ANTHROPIC_API_KEY
echo $OPENAI_API_KEY

# Check that credentials.py is in the correct location
ls credentials.py

# Check credentials.py syntax
python -c "from credentials import llm_config; print(llm_config)"
```

Memory Issues

Problem: Agent can't recall previous facts

Solution:

```python
import os

# Check global memory file exists
print(os.path.exists(os.path.expanduser("~/.photoagents/global_mem.txt")))

# Manually verify content
with open(os.path.expanduser("~/.photoagents/global_mem.txt")) as f:
    print(f.read())

# Rebuild skill index if corrupted
from photoagents.skills.skill_manager import rebuild_index
rebuild_index()
```

Browser Automation Fails

Problem: CDP bridge cannot connect to Chrome

Solution:

```bash
# Ensure Chrome is installed and accessible
which google-chrome
which chrome

# Launch Chrome with remote debugging manually
google-chrome --remote-debugging-port=9222

# Check port availability
lsof -i :9222
```
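Where `lsof` is not available (for example on Windows), the same check can be done from Python. This is a small standard-library sketch, independent of Photo Agents; `is_port_open` is a hypothetical helper, and port 9222 is simply the value used in the commands above.

```python
import socket


def is_port_open(host: str = "127.0.0.1", port: int = 9222, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unreachable hosts
        return False
```

A `True` result only shows that something is listening on the port; it does not prove that the listener is Chrome's debugging endpoint.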

Session Archive Growth

Problem: `~/.photoagents/sessions/` consuming too much disk

Solution:

```bash
# Clean old sessions (older than 30 days)
find ~/.photoagents/sessions/ -type f -mtime +30 -delete

# Or configure auto-cleanup
python -c "
from photoagents.core.cleanup import configure_auto_cleanup
configure_auto_cleanup(max_age_days=30, max_size_mb=1000)
"
```
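On systems without `find`, an age-based cleanup can be approximated in Python. The sketch below is a rough standard-library equivalent of the `find ... -mtime +30 -delete` command, not the logic behind `configure_auto_cleanup`; `delete_old_sessions` is a hypothetical helper, and its cutoff is computed in seconds, so it can differ from `find`'s whole-day rounding by up to a day.

```python
import time
from pathlib import Path
from typing import List, Optional


def delete_old_sessions(
    root: Path, max_age_days: int = 30, now: Optional[float] = None
) -> List[Path]:
    """Delete regular files under root last modified more than max_age_days ago."""
    cutoff = (time.time() if now is None else now) - max_age_days * 86400
    removed: List[Path] = []
    # sorted() materializes the listing before anything is deleted
    for path in sorted(root.rglob("*")):
        if path.is_symlink() or not path.is_file():
            continue  # skip directories, symlinks and special files
        if path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

Returning the deleted paths makes it easy to log what was removed or to dry-run the selection logic in a test directory first.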

Permission Errors

Problem: Cannot write to `~/.photoagents/`

Solution:

```bash
# Fix ownership (id -gn gives your primary group, which is not always named after the user, e.g. on macOS)
sudo chown -R "$(id -un):$(id -gn)" ~/.photoagents/

# Fix permissions
chmod -R 755 ~/.photoagents/
```

Advanced Configuration

Custom Tool Schema

```python
from photoagents.resources.tool_schema import register_custom_tool

schema = {
    "name": "analyze_metrics",
    "description": "Analyze system metrics and generate report",
    "parameters": {
        "type": "object",
        "properties": {
            "metric_type": {
                "type": "string",
                "enum": ["cpu", "memory", "disk", "network"]
            },
            "duration_hours": {
                "type": "integer",
                "minimum": 1,
                "maximum": 168
            }
        },
        "required": ["metric_type"]
    }
}


def implementation_function(metric_type: str, duration_hours: int = 1) -> str:
    # Placeholder: replace with your own metric analysis.
    # Assumes the tool is called with keyword arguments matching the schema.
    return f"{metric_type} report for the last {duration_hours} hour(s)"


register_custom_tool(schema, implementation_function)
```
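A schema like the one above constrains what arguments a tool call may carry. To see what those constraints mean in practice, here is a hand-rolled check covering only the keywords used in this schema (`required`, `type`, `enum`, `minimum`, `maximum`). It is a sketch for illustration: `validate_arguments` is hypothetical, it is not a full JSON Schema validator, and whether Photo Agents validates arguments this way is not stated in the text.

```python
from typing import Any, Dict, List


def validate_arguments(schema: Dict[str, Any], arguments: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the arguments fit the schema."""
    params = schema["parameters"]
    properties = params.get("properties", {})
    problems = [
        f"missing required argument: {name}"
        for name in params.get("required", [])
        if name not in arguments
    ]
    for name, value in arguments.items():
        spec = properties.get(name)
        if spec is None:
            continue  # JSON Schema allows extra properties by default
        if spec.get("type") == "string" and not isinstance(value, str):
            problems.append(f"{name} must be a string")
        elif spec.get("type") == "integer" and (
            isinstance(value, bool) or not isinstance(value, int)
        ):
            problems.append(f"{name} must be an integer")
        elif "enum" in spec and value not in spec["enum"]:
            problems.append(f"{name} must be one of {spec['enum']}")
        elif "minimum" in spec and value < spec["minimum"]:
            problems.append(f"{name} must be >= {spec['minimum']}")
        elif "maximum" in spec and value > spec["maximum"]:
            problems.append(f"{name} must be <= {spec['maximum']}")
    return problems
```

For anything beyond a quick sanity check, a complete JSON Schema validator is the safer choice.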

Environment Variables Reference

```bash
# Required
export PHOTOAGENTS_API_KEY=pk_live_xxx

# LLM Providers (choose one or both for fallback)
export ANTHROPIC_API_KEY=sk-ant-xxx
export OPENAI_API_KEY=sk-xxx

# Optional integrations
export LANGFUSE_PUBLIC_KEY=pk-lf-xxx
export LANGFUSE_SECRET_KEY=sk-lf-xxx
export LANGFUSE_HOST=https://cloud.langfuse.com

# Chat platform bots (if using)
export TELEGRAM_BOT_TOKEN=xxx
export FEISHU_APP_ID=xxx
export FEISHU_APP_SECRET=xxx
```
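Before launching an agent it can be handy to confirm that the variables listed above are actually set. The following preflight check is a sketch under assumptions: `preflight` is a hypothetical helper, it treats `PHOTOAGENTS_API_KEY` as required in the environment (ignoring a key stored in the config file), and it requires at least one of the two LLM provider keys.

```python
import os
from typing import List, Mapping, Optional

REQUIRED = ("PHOTOAGENTS_API_KEY",)
LLM_KEYS = ("ANTHROPIC_API_KEY", "OPENAI_API_KEY")  # at least one is needed


def preflight(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return human-readable problems with the environment; empty means ready."""
    env = os.environ if env is None else env
    problems = [f"{name} is not set" for name in REQUIRED if not env.get(name)]
    if not any(env.get(name) for name in LLM_KEYS):
        problems.append("set at least one of: " + ", ".join(LLM_KEYS))
    return problems
```

The optional Langfuse and chat-bot variables are deliberately left out, since they only matter when those integrations are used.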
