photo-agents-autonomous-llm
Photo Agents Autonomous LLM Skill
Overview
Photo Agents is a Python framework for building autonomous, self-evolving AI agents that ground their understanding in visual observations of the screen. Unlike traditional text-only agents, Photo Agents implements a perceive → reason → act cycle with a layered memory architecture inspired by biological cognition: vision input, bounded observations stored in layers (L1-L4), and skills the agent writes from real successes.
Key capabilities:
- Multi-provider LLM routing (Anthropic Claude, OpenAI GPT, failover sessions)
- Layered memory system (working/global/SOP/session archive)
- Physical execution tools (file I/O, sandboxed code, browser automation via Chrome DevTools Protocol)
- Multiple client interfaces (CLI, Streamlit web app, PyQt desktop, chat platform bots)
- Self-evolving through reflection and skill generation
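The perceive → reason → act cycle described above can be pictured with a small, framework-independent sketch. Everything here (`Observation`, `run_cycle`, and the three callables) is hypothetical and is not the Photo Agents API; it only shows the control flow and the idea of keeping a bounded window of observations as working memory.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List, Optional


@dataclass
class Observation:
    """One perceived snapshot of the environment (hypothetical type)."""
    description: str


def run_cycle(
    perceive: Callable[[], Observation],
    reason: Callable[[List[Observation]], Optional[str]],
    act: Callable[[str], None],
    max_turns: int = 10,
    window: int = 3,
) -> int:
    """Run perceive -> reason -> act until `reason` returns None.

    Returns the number of actions taken.
    """
    # Bounded working memory: only the last `window` observations are kept.
    working_memory: Deque[Observation] = deque(maxlen=window)
    actions = 0
    for _ in range(max_turns):
        working_memory.append(perceive())
        action = reason(list(working_memory))
        if action is None:  # the reasoner decided the task is done
            break
        act(action)
        actions += 1
    return actions


# Toy environment: a counter that the agent increments until it reaches 3.
state = {"count": 0}


def perceive() -> Observation:
    return Observation(f"count={state['count']}")


def reason(memory: List[Observation]) -> Optional[str]:
    return None if memory[-1].description == "count=3" else "increment"


def act(action: str) -> None:
    if action == "increment":
        state["count"] += 1


turns = run_cycle(perceive, reason, act)
print(turns, state["count"])
```

In a real agent, `perceive` would capture a screenshot, `reason` would call an LLM, and `act` would dispatch a tool; the loop structure stays the same.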
Installation
Basic Installation
```bash
pip install photoagents
```
Full Installation with All Clients
```bash
pip install "photoagents[all]"
```
Requirements: Python 3.10+
API Key Setup
Photo Agents requires a license key validated against `https://photo-agents.com/v1/keys/validate`.
- Get your key at: https://photo-agents.com/dashboard/keys
- Configure it (choose one method):

Environment variable:
```bash
export PHOTOAGENTS_API_KEY=pk_live_your_key_here
```
Config file (`~/.photoagents/config.json`):
```json
{
  "api_key": "pk_live_your_key_here"
}
```
Interactive prompt: Run any command and it will prompt you to enter and save the key.
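A minimal sketch of how the first two configuration methods could be checked from your own scripts. The lookup order (environment variable first, then config file) and the helper name `resolve_api_key` are assumptions for illustration; the source does not state which method takes precedence.

```python
import json
import os
from pathlib import Path
from typing import Optional


def resolve_api_key(config_path: Optional[Path] = None) -> Optional[str]:
    """Return the license key from the environment or the config file.

    Hypothetical helper: the precedence used here is an assumption.
    """
    key = os.environ.get("PHOTOAGENTS_API_KEY")
    if key:
        return key
    path = config_path or Path.home() / ".photoagents" / "config.json"
    try:
        with open(path, encoding="utf-8") as f:
            data = json.load(f)
    except (OSError, json.JSONDecodeError):
        return None  # file missing, unreadable, or not valid JSON
    return data.get("api_key") if isinstance(data, dict) else None


key = resolve_api_key()
print("key configured" if key else "no key found")
```

This only reports whether a key is present locally; it does not contact the validation endpoint.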
LLM Provider Configuration
Create a `credentials.py` file in your project root:
```python
# credentials.py
from photoagents.config.keys_template import LLMConfig, ProviderConfig

# Define llm_config once; the three options below are alternatives.

# Option 1: Anthropic Claude
llm_config = LLMConfig(
    primary=ProviderConfig(
        provider="anthropic",
        api_key="${ANTHROPIC_API_KEY}",  # Use env var
        model="claude-3-5-sonnet-20241022"
    )
)

# Option 2: OpenAI GPT
llm_config = LLMConfig(
    primary=ProviderConfig(
        provider="openai",
        api_key="${OPENAI_API_KEY}",
        model="gpt-4o"
    )
)

# Option 3: Failover configuration
llm_config = LLMConfig(
    primary=ProviderConfig(
        provider="anthropic",
        api_key="${ANTHROPIC_API_KEY}",
        model="claude-3-5-sonnet-20241022"
    ),
    fallback=ProviderConfig(
        provider="openai",
        api_key="${OPENAI_API_KEY}",
        model="gpt-4o"
    )
)
```
Or use JSON format (`credentials.json`):
```json
{
  "primary": {
    "provider": "anthropic",
    "api_key": "${ANTHROPIC_API_KEY}",
    "model": "claude-3-5-sonnet-20241022"
  },
  "fallback": {
    "provider": "openai",
    "api_key": "${OPENAI_API_KEY}",
    "model": "gpt-4o"
  }
}
```
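The `${ANTHROPIC_API_KEY}` strings above are placeholders, not Python interpolation. How Photo Agents resolves them is not shown in the source; the sketch below illustrates one common approach using the standard library's `string.Template`, with a hypothetical `expand_placeholders` helper that fails loudly when a referenced variable is unset.

```python
import os
from string import Template
from typing import Any, Mapping


def expand_placeholders(value: Any, env: Mapping[str, str] = os.environ) -> Any:
    """Recursively replace ${VAR} in strings nested inside dicts and lists.

    Hypothetical helper; raises KeyError if a referenced variable is missing.
    """
    if isinstance(value, str):
        return Template(value).substitute(env)
    if isinstance(value, dict):
        return {k: expand_placeholders(v, env) for k, v in value.items()}
    if isinstance(value, list):
        return [expand_placeholders(v, env) for v in value]
    return value


config = {
    "primary": {
        "provider": "anthropic",
        "api_key": "${ANTHROPIC_API_KEY}",
        "model": "claude-3-5-sonnet-20241022",
    }
}
demo_env = {"ANTHROPIC_API_KEY": "sk-ant-example"}  # stand-in for os.environ
resolved = expand_placeholders(config, demo_env)
print(resolved["primary"]["api_key"])  # prints sk-ant-example
```

Keeping the placeholder in the file and the secret in the environment means `credentials.py` or `credentials.json` can be committed without exposing keys.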
Core Usage Patterns
1. Interactive CLI Mode
```bash
# Start interactive REPL
python -m photoagents

# The agent will prompt for tasks and execute them
# with vision-grounded reasoning
```
2. One-Shot Task Execution
```bash
# Execute a single task
python -m photoagents --task my_analysis --input "Analyze the largest files in this directory"

# With custom output path
python -m photoagents --task report --input "Generate system report" --output ./reports/
```
3. Reflection/Watchdog Mode
```bash
# Run with reflection scheduler (self-evolving)
python -m photoagents --reflect photoagents/evolution/scheduler.py
```
4. Programmatic Agent Session
```python
from photoagents.core.loop import run_agent_session
from photoagents.llm.router import LLMSession
from photoagents.config.keys_template import LLMConfig, ProviderConfig

# Configure LLM
llm_config = LLMConfig(
    primary=ProviderConfig(
        provider="anthropic",
        api_key="${ANTHROPIC_API_KEY}",
        model="claude-3-5-sonnet-20241022"
    )
)

# Create session
session = LLMSession(llm_config)

# Run agent loop
result = run_agent_session(
    task_name="file_analysis",
    user_input="Find and summarize all Python files in the current directory",
    session=session,
    max_turns=10
)
print(f"Final output: {result}")
```
5. Custom Tool Integration
```python
from typing import Any, Dict

from photoagents.core.tool_dispatcher import register_tool


@register_tool
def custom_analysis_tool(data: str, options: Dict[str, Any]) -> str:
    """
    Custom tool for specialized analysis.

    Args:
        data: Input data to analyze
        options: Configuration options

    Returns:
        Analysis results
    """
    # Your custom logic here
    result = f"Analyzed: {data} with options {options}"
    return result

# Tool is now available to the agent
```
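To make the decorator pattern concrete, here is a minimal sketch of how a registry behind a decorator such as `register_tool` could work. This is not the Photo Agents implementation; `TOOLS`, `register`, and `dispatch` are hypothetical names, and the real dispatcher may store schemas, validate arguments, or handle errors differently.

```python
from typing import Any, Callable, Dict

# Hypothetical registry mapping tool name -> function.
TOOLS: Dict[str, Callable[..., Any]] = {}


def register(func: Callable[..., Any]) -> Callable[..., Any]:
    """Record the function under its own name and return it unchanged."""
    TOOLS[func.__name__] = func
    return func


def dispatch(name: str, **kwargs: Any) -> Any:
    """Look up a tool by name and call it with keyword arguments."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**kwargs)


@register
def word_count(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


print(dispatch("word_count", text="perceive reason act"))  # prints 3
```

Because the decorator returns the function unchanged, a registered tool can still be called and unit-tested directly.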
GUI Client Options
Streamlit Web App + WebView
```bash
# Launch web interface with native window
pythonw -m photoagents.cli.launcher
```
Service Hub (Start/Stop Services)
```bash
# Launch control hub
pythonw -m photoagents.cli.hub
```
Desktop PyQt Application
```bash
python -m photoagents.clients.desktop_app
```
Desktop Companion
```bash
pythonw -m photoagents.clients.companion_v2
```
Chat Platform Bots
```bash
# Telegram
python -m photoagents.clients.telegram_client

# Feishu (Lark)
python -m photoagents.clients.feishu_client

# WeCom
python -m photoagents.clients.wecom_client

# DingTalk
python -m photoagents.clients.dingtalk_client

# QQ
python -m photoagents.clients.qq_client
```
Layered Memory System
Photo Agents uses a 4-layer memory architecture:
L1: Working Memory
Short-term context for the current task (conversation turns, immediate observations).
L2: Global Memory
Long-term facts stored in `~/.photoagents/global_mem.txt`.
```python
from photoagents.core.memory import add_global_fact, search_global_memory

# Add a fact
add_global_fact("Project uses Python 3.11 and requires PostgreSQL 14+")

# Search memory
results = search_global_memory("database requirements")
```
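As an illustration of what a text-file fact store involves, the sketch below appends one fact per line and retrieves facts by keyword overlap. It is a stand-in under stated assumptions: the real `global_mem.txt` format and the ranking used by `search_global_memory` are not documented in the source, and `add_fact` and `search_facts` are hypothetical names.

```python
import tempfile
from pathlib import Path
from typing import List


def add_fact(store: Path, fact: str) -> None:
    """Append a single-line fact to the store file."""
    store.parent.mkdir(parents=True, exist_ok=True)
    with open(store, "a", encoding="utf-8") as f:
        f.write(fact.replace("\n", " ").strip() + "\n")


def search_facts(store: Path, query: str, limit: int = 5) -> List[str]:
    """Return facts sharing at least one word with the query, best match first."""
    if not store.exists():
        return []
    terms = set(query.lower().split())
    scored = []
    for line in store.read_text(encoding="utf-8").splitlines():
        overlap = len(terms & set(line.lower().split()))
        if overlap:
            scored.append((overlap, line))
    # Python's sort is stable, so ties keep their order in the file.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [line for _, line in scored[:limit]]


# Demo against a temporary file so the real memory file is untouched.
demo_store = Path(tempfile.mkdtemp()) / "global_mem.txt"
add_fact(demo_store, "Project uses Python 3.11 and requires PostgreSQL 14+")
add_fact(demo_store, "Deployments happen every Friday")
print(search_facts(demo_store, "python version"))
```

Exact keyword overlap is crude: a query such as "database requirements" would miss the PostgreSQL fact here because no word matches literally, which is the kind of gap that embedding-based search is meant to close.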
L3: Skills & SOPs
Standard Operating Procedures (SOPs) the agent writes from successful executions.
```python
from photoagents.skills.skill_manager import save_skill, load_skill

# Save a new skill
save_skill(
    name="web_scraping_pattern",
    code="""
def scrape_structured_data(url: str) -> dict:
    # Implementation
    pass
""",
    description="Reliable pattern for scraping structured web data"
)

# Load and use
skill = load_skill("web_scraping_pattern")
```
L4: Session Archive
Full raw session logs in `~/.photoagents/sessions/`.
Browser Automation with CDP
Photo Agents includes Chrome DevTools Protocol integration for browser control:
```python
import asyncio

from photoagents.web.cdp_bridge import CDPBridge


async def automate_browser():
    async with CDPBridge() as browser:
        # Navigate
        await browser.navigate("https://example.com")
        # Take screenshot
        screenshot = await browser.screenshot()
        # Execute JavaScript
        result = await browser.evaluate("document.title")
        # Click element
        await browser.click("button.submit")
        # Fill form
        await browser.type("input[name='query']", "search term")
        return result


# Run the coroutine from synchronous code
print(asyncio.run(automate_browser()))
```
Vision-Grounded Operations
Screenshot Analysis
```python
from photoagents.skills.vision import analyze_screenshot

# Agent automatically captures and analyzes screen
analysis = analyze_screenshot(
    region=(0, 0, 1920, 1080),  # x, y, width, height
    question="What UI elements are visible?"
)
```
OCR Text Extraction
```python
from photoagents.skills.ocr import extract_text_from_region

# Extract text from screen region
text = extract_text_from_region(
    x=100, y=200, width=500, height=300
)
```
Sandboxed Code Execution
```python
from photoagents.core.sandbox import execute_code

# Python execution
result = execute_code(
    code="""
import json
data = {"status": "success"}
print(json.dumps(data))
""",
    language="python",
    timeout=30
)

# PowerShell (Windows)
ps_result = execute_code(
    code="Get-Process | Select-Object -First 5",
    language="powershell"
)

# Bash (Linux/Mac)
bash_result = execute_code(
    code="ls -la | head -n 10",
    language="bash"
)
```
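For intuition about the `timeout` parameter, the sketch below runs a Python snippet in a child process with a time limit using the standard library. It is not how `execute_code` is implemented (the source does not say), and a plain subprocess is not a security sandbox; `run_python_snippet` is a hypothetical name.

```python
import subprocess
import sys
from typing import Tuple


def run_python_snippet(code: str, timeout: float = 30.0) -> Tuple[int, str, str]:
    """Run `code` in a separate interpreter; return (exit_code, stdout, stderr).

    Returns exit code -1 if the time limit is exceeded.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising TimeoutExpired.
        return -1, "", f"timed out after {timeout}s"
    return proc.returncode, proc.stdout, proc.stderr


exit_code, out, err = run_python_snippet(
    'import json\nprint(json.dumps({"status": "success"}))'
)
print(exit_code, out.strip())
```

Real isolation needs more than a timeout, for example a container, a restricted user, or resource limits; this sketch only bounds wall-clock time and captures output.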
File I/O Operations
```python
from photoagents.core.file_ops import read_file, write_file, list_directory

# Read file
content = read_file("~/project/config.json")

# Write file
write_file("~/output/report.txt", "Analysis complete\n")

# List directory with filters
files = list_directory(
    path="~/project",
    pattern="*.py",
    recursive=True
)
```
Observability with Langfuse
```python
from photoagents.integrations.langfuse_tracer import init_langfuse, trace_agent_step

# Initialize
tracer = init_langfuse(
    public_key="${LANGFUSE_PUBLIC_KEY}",
    secret_key="${LANGFUSE_SECRET_KEY}",
    host="https://cloud.langfuse.com"
)

# Trace agent steps
with trace_agent_step("file_analysis", metadata={"task": "analyze_logs"}):
    # Agent operations here
    pass
```
Configuration Files
On-Disk State Locations
| Path | Purpose |
|---|---|
| `~/.photoagents/config.json` | API key + license validation cache |
| `~/.photoagents/global_mem.txt` | L2 long-term facts |
| `~/.photoagents/sessions/` | L4 raw session archives |
| | Vector index for skill/SOP search |
| | Per-task scratch (logs, intermediate output) |
Custom System Prompt
Override the default system prompt:
```python
from photoagents.core.loop import run_agent_session

custom_prompt = """
You are a specialized data analysis agent.
Focus on: statistical analysis, visualization, and reporting.
Always verify data integrity before processing.
"""

result = run_agent_session(
    task_name="analysis",
    user_input="Analyze sales data",
    system_prompt_override=custom_prompt
)
```
Common Patterns
Pattern 1: Autonomous Research Agent
```python
from photoagents.core.loop import run_agent_session
from photoagents.llm.router import LLMSession


def create_research_agent(topic: str):
    session = LLMSession.from_env()
    result = run_agent_session(
        task_name=f"research_{topic}",
        user_input=f"""
        Research {topic} and create a comprehensive report:
        1. Search for recent information
        2. Analyze credibility of sources
        3. Synthesize findings
        4. Save report with citations
        """,
        session=session,
        max_turns=50
    )
    return result


# Use it
report = create_research_agent("quantum computing advances 2026")
```
Pattern 2: Self-Evolving Monitor
```python
# monitor.py
import os
import time

from photoagents.evolution.scheduler import schedule_check


def check() -> bool:
    """
    Watchdog function that triggers agent tasks.
    Return True to execute a task.
    """
    # Check if it's time to run daily backup.
    # expanduser is required: os.path functions do not expand "~" themselves.
    marker = os.path.expanduser("~/.photoagents/last_backup")
    if not os.path.exists(marker):
        return True  # No marker file yet, so treat the backup as overdue
    last_run = os.path.getmtime(marker)
    return time.time() - last_run > 86400  # 24 hours


def get_task() -> str:
    """Return the task to execute when check() returns True."""
    return "Backup all project files to ~/backups/ and verify integrity"

# Run with:
# python -m photoagents --reflect monitor.py
```
Pattern 3: Multi-Step Workflow
```python
from photoagents.core.loop import run_agent_session
from photoagents.core.memory import add_global_fact


def execute_workflow(project_path: str):
    # Step 1: Analyze codebase
    analysis = run_agent_session(
        task_name="code_analysis",
        user_input=f"Analyze Python code structure in {project_path}"
    )

    # Save insight to global memory
    add_global_fact(f"Project at {project_path}: {analysis}")

    # Step 2: Generate documentation
    docs = run_agent_session(
        task_name="generate_docs",
        user_input=f"Create API documentation for {project_path}"
    )

    # Step 3: Run tests
    tests = run_agent_session(
        task_name="run_tests",
        user_input="Execute test suite and report coverage"
    )

    return {
        "analysis": analysis,
        "documentation": docs,
        "tests": tests
    }
```
Troubleshooting
API Key Issues
Problem: `PhotoAgentsAuthError: Invalid or missing API key`
Solution:
```bash
# Verify key is set
echo $PHOTOAGENTS_API_KEY

# Or check config file
cat ~/.photoagents/config.json

# Clear cache if key was recently updated
# (this also deletes a key saved in the file, so you will be prompted again)
rm ~/.photoagents/config.json
```
LLM Provider Errors
Problem: `LLM provider authentication failed`
Solution:
```bash
# Verify environment variables are set
echo $ANTHROPIC_API_KEY
echo $OPENAI_API_KEY

# Verify credentials.py is in the correct location
ls credentials.py

# Check credentials.py syntax
python -c "from credentials import llm_config; print(llm_config)"
```
Memory Issues
Problem: Agent can't recall previous facts
Solution:
```python
import os

# Check global memory file exists
print(os.path.exists(os.path.expanduser("~/.photoagents/global_mem.txt")))

# Manually verify content
with open(os.path.expanduser("~/.photoagents/global_mem.txt")) as f:
    print(f.read())

# Rebuild skill index if corrupted
from photoagents.skills.skill_manager import rebuild_index
rebuild_index()
```
Browser Automation Fails
Problem: CDP bridge cannot connect to Chrome
Solution:
```bash
# Ensure Chrome is installed and accessible
which google-chrome
which chrome

# Launch Chrome with remote debugging manually
google-chrome --remote-debugging-port=9222

# Check port availability
lsof -i :9222
```
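The port check can also be done from Python, which helps when `lsof` is unavailable. A minimal sketch using only the standard library; `is_port_open` is a hypothetical helper name, and 9222 is the port from the command above.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections and timeouts
        return False


if is_port_open("127.0.0.1", 9222):
    print("Something is listening on 9222 (Chrome may already be running with remote debugging)")
else:
    print("Nothing is listening on 9222; start Chrome with --remote-debugging-port=9222")
```

A successful connection only shows that some process holds the port; it does not confirm that the process is Chrome.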
Session Archive Growth
Problem: `~/.photoagents/sessions/` consuming too much disk
Solution:
```bash
# Clean old sessions (older than 30 days)
find ~/.photoagents/sessions/ -type f -mtime +30 -delete

# Or configure auto-cleanup
python -c "
from photoagents.core.cleanup import configure_auto_cleanup
configure_auto_cleanup(max_age_days=30, max_size_mb=1000)
"
```
Permission Errors
Problem: Cannot write to `~/.photoagents/`
Solution:
```bash
# Fix ownership (id -gn gives your primary group, which on macOS is not named after the user)
sudo chown -R "$USER":"$(id -gn)" ~/.photoagents/

# Fix permissions
chmod -R 755 ~/.photoagents/
```
Advanced Configuration
Custom Tool Schema
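```python
from photoagents.resources.tool_schema import register_custom_tool

schema = {
    "name": "analyze_metrics",
    "description": "Analyze system metrics and generate report",
    "parameters": {
        "type": "object",
        "properties": {
            "metric_type": {
                "type": "string",
                "enum": ["cpu", "memory", "disk", "network"]
            },
            "duration_hours": {
                "type": "integer",
                "minimum": 1,
                "maximum": 168
            }
        },
        "required": ["metric_type"]
    }
}


# Placeholder implementation so the example is complete; replace with real logic.
# Assumption: the tool receives the schema's parameters as keyword arguments.
def implementation_function(metric_type: str, duration_hours: int = 1) -> str:
    return f"{metric_type} report for the last {duration_hours}h"


register_custom_tool(schema, implementation_function)
```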
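A schema like the one above constrains what the model may pass to the tool. The sketch below checks arguments against the `required`, `type`, `enum`, `minimum`, and `maximum` keywords by hand. It is an illustration only: `validate_args` is a hypothetical helper, covers just these keywords, and says nothing about how `register_custom_tool` validates input internally.

```python
from typing import Any, Dict, List

SCHEMA: Dict[str, Any] = {
    "type": "object",
    "properties": {
        "metric_type": {"type": "string", "enum": ["cpu", "memory", "disk", "network"]},
        "duration_hours": {"type": "integer", "minimum": 1, "maximum": 168},
    },
    "required": ["metric_type"],
}

# Only the two JSON Schema types used in this example are mapped.
PY_TYPES = {"string": str, "integer": int}


def validate_args(args: Dict[str, Any], schema: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = [
        f"missing required field: {name}"
        for name in schema.get("required", [])
        if name not in args
    ]
    for name, value in args.items():
        spec = schema["properties"].get(name)
        if spec is None:
            continue  # extra fields are allowed, as in JSON Schema's default
        expected = PY_TYPES[spec["type"]]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            errors.append(f"{name}: expected {spec['type']}")
            continue
        if "enum" in spec and value not in spec["enum"]:
            errors.append(f"{name}: must be one of {spec['enum']}")
        if "minimum" in spec and value < spec["minimum"]:
            errors.append(f"{name}: below minimum {spec['minimum']}")
        if "maximum" in spec and value > spec["maximum"]:
            errors.append(f"{name}: above maximum {spec['maximum']}")
    return errors


print(validate_args({"metric_type": "cpu", "duration_hours": 24}, SCHEMA))  # prints []
print(validate_args({"metric_type": "gpu", "duration_hours": 500}, SCHEMA))
```

For production use, a full JSON Schema validator is a better fit than hand-written checks; the point here is only to show what each keyword enforces.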
Environment Variables Reference
```bash
# Required
export PHOTOAGENTS_API_KEY=pk_live_xxx

# LLM Providers (choose one or both for fallback)
export ANTHROPIC_API_KEY=sk-ant-xxx
export OPENAI_API_KEY=sk-xxx

# Optional integrations
export LANGFUSE_PUBLIC_KEY=pk-lf-xxx
export LANGFUSE_SECRET_KEY=sk-lf-xxx
export LANGFUSE_HOST=https://cloud.langfuse.com

# Chat platform bots (if using)
export TELEGRAM_BOT_TOKEN=xxx
export FEISHU_APP_ID=xxx
export FEISHU_APP_SECRET=xxx
```
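Before launching, it can help to confirm which of these variables are actually set. The sketch below is a hypothetical preflight check, not part of Photo Agents; the split into a required key and "at least one provider key" mirrors the list above.

```python
import os
from typing import Dict, List, Mapping

REQUIRED = ["PHOTOAGENTS_API_KEY"]
LLM_PROVIDERS = ["ANTHROPIC_API_KEY", "OPENAI_API_KEY"]  # at least one is needed


def preflight(env: Mapping[str, str] = os.environ) -> Dict[str, List[str]]:
    """Report missing required variables and which provider keys are present."""
    missing = [name for name in REQUIRED if not env.get(name)]
    providers = [name for name in LLM_PROVIDERS if env.get(name)]
    if not providers:
        missing.append("ANTHROPIC_API_KEY or OPENAI_API_KEY")
    return {"missing": missing, "providers": providers}


report = preflight()
if report["missing"]:
    print("Missing:", ", ".join(report["missing"]))
else:
    print("Ready. Providers configured:", ", ".join(report["providers"]))
```

Passing a plain dictionary as `env` makes the check easy to exercise in tests without touching the real environment.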