photo-agents-autonomous-llm

Photo Agents Autonomous LLM Skill

Skill by ara.so — AI Agent Skills collection.

Overview

Photo Agents is a Python framework for building autonomous, self-evolving AI agents that ground their understanding in visual observations of the screen. Unlike traditional text-only agents, Photo Agents implements a perceive → reason → act cycle with a layered memory architecture inspired by biological cognition: vision input, bounded observations stored in layers (L1-L4), and skills the agent writes from real successes.
Key capabilities:
  • Multi-provider LLM routing (Anthropic Claude, OpenAI GPT, failover sessions)
  • Layered memory system (working/global/SOP/session archive)
  • Physical execution tools (file I/O, sandboxed code, browser automation via Chrome DevTools Protocol)
  • Multiple client interfaces (CLI, Streamlit web app, PyQt desktop, chat platform bots)
  • Self-evolving through reflection and skill generation
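The perceive → reason → act cycle described above can be pictured as a small loop over three callables with a bounded observation buffer. This is only an illustrative sketch: `LoopState`, `run_loop`, and their parameters are hypothetical names, not part of the Photo Agents API.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class LoopState:
    """Bounded working memory for one task (hypothetical stand-in for L1)."""
    max_observations: int = 5
    observations: List[str] = field(default_factory=list)

    def remember(self, observation: str) -> None:
        # Keep only the most recent observations so the context stays bounded.
        self.observations.append(observation)
        self.observations = self.observations[-self.max_observations:]


def run_loop(
    perceive: Callable[[], str],
    reason: Callable[[List[str]], str],
    act: Callable[[str], bool],
    max_turns: int = 10,
) -> int:
    """Run perceive -> reason -> act until act() reports completion.

    Returns the number of turns used (max_turns if never completed).
    """
    state = LoopState()
    for turn in range(1, max_turns + 1):
        state.remember(perceive())           # e.g. describe a screenshot
        action = reason(state.observations)  # e.g. ask the LLM what to do next
        if act(action):                      # e.g. call a tool; True means done
            return turn
    return max_turns
```

Passing the three stages as plain callables keeps the loop testable without a screen or an LLM.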

Installation

Basic Installation

bash
pip install photoagents

Full Installation with All Clients

bash
pip install "photoagents[all]"
Requirements: Python 3.10+

API Key Setup

Photo Agents requires a license key validated against https://photo-agents.com/v1/keys/validate.
  1. Get your key at: https://photo-agents.com/dashboard/keys
  2. Configure it (choose one method):
Environment variable:
bash
export PHOTOAGENTS_API_KEY=pk_live_your_key_here
Config file (~/.photoagents/config.json):
json
{
  "api_key": "pk_live_your_key_here"
}
Interactive prompt: Run any command and it will prompt you to enter and save the key.
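To make the relationship between the configuration methods above concrete, here is a minimal sketch of a lookup that tries the environment variable and then the config file. The precedence order is an assumption (the text above does not state one), and `resolve_api_key` is a hypothetical helper, not a Photo Agents function.

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_api_key(
    env: Optional[Mapping[str, str]] = None,
    config_path: Optional[Path] = None,
) -> Optional[str]:
    """Return the license key from the environment or the config file.

    Assumption: the environment variable wins over the config file.
    Returns None when neither source has a key, so the caller can prompt.
    """
    env = os.environ if env is None else env
    if config_path is None:
        config_path = Path.home() / ".photoagents" / "config.json"

    key = env.get("PHOTOAGENTS_API_KEY")
    if key:
        return key
    try:
        data = json.loads(config_path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return None  # missing or unreadable config file
    if not isinstance(data, dict):
        return None
    return data.get("api_key") or None
```

A `None` result corresponds to the case where the interactive prompt would take over.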

LLM Provider Configuration

Create a credentials.py file in your project root:
python
# credentials.py
from photoagents.config.keys_template import LLMConfig, ProviderConfig

# Option 1: Anthropic Claude
llm_config = LLMConfig(
    primary=ProviderConfig(
        provider="anthropic",
        api_key="${ANTHROPIC_API_KEY}",  # Use env var
        model="claude-3-5-sonnet-20241022"
    )
)

# Option 2: OpenAI GPT
llm_config = LLMConfig(
    primary=ProviderConfig(
        provider="openai",
        api_key="${OPENAI_API_KEY}",
        model="gpt-4o"
    )
)

# Option 3: Failover configuration
llm_config = LLMConfig(
    primary=ProviderConfig(
        provider="anthropic",
        api_key="${ANTHROPIC_API_KEY}",
        model="claude-3-5-sonnet-20241022"
    ),
    fallback=ProviderConfig(
        provider="openai",
        api_key="${OPENAI_API_KEY}",
        model="gpt-4o"
    )
)

Or use JSON format (`credentials.json`):

```json
{
  "primary": {
    "provider": "anthropic",
    "api_key": "${ANTHROPIC_API_KEY}",
    "model": "claude-3-5-sonnet-20241022"
  },
  "fallback": {
    "provider": "openai",
    "api_key": "${OPENAI_API_KEY}",
    "model": "gpt-4o"
  }
}
```
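The `${ANTHROPIC_API_KEY}`-style values above are placeholders for environment variables. The source does not show how they are resolved, so the following is only a sketch of one plausible approach; `expand_placeholders` is a hypothetical helper, not a documented Photo Agents function.

```python
import os
import re
from typing import Any, Mapping, Optional

# Matches ${NAME} where NAME is a valid environment variable identifier.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_placeholders(value: Any, env: Optional[Mapping[str, str]] = None) -> Any:
    """Recursively replace ${NAME} in strings with env[NAME].

    Raises KeyError naming the first unset variable, so a missing key
    fails when the config is loaded rather than at the first API call.
    The input is not modified; a new structure is returned.
    """
    env = os.environ if env is None else env
    if isinstance(value, str):
        return _PLACEHOLDER.sub(lambda m: env[m.group(1)], value)
    if isinstance(value, dict):
        return {k: expand_placeholders(v, env) for k, v in value.items()}
    if isinstance(value, list):
        return [expand_placeholders(v, env) for v in value]
    return value
```

Applied to the parsed contents of `credentials.json`, this would return a copy with real keys filled in while leaving the file itself free of secrets.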

Core Usage Patterns

1. Interactive CLI Mode

bash
# Start interactive REPL
python -m photoagents

# The agent will prompt for tasks and execute them
# with vision-grounded reasoning

2. One-Shot Task Execution

bash
# Execute a single task
python -m photoagents --task my_analysis --input "Analyze the largest files in this directory"

# With custom output path
python -m photoagents --task report --input "Generate system report" --output ./reports/

3. Reflection/Watchdog Mode

bash
# Run with reflection scheduler (self-evolving)
python -m photoagents --reflect photoagents/evolution/scheduler.py

4. Programmatic Agent Session

python
from photoagents.core.loop import run_agent_session
from photoagents.llm.router import LLMSession
from photoagents.config.keys_template import LLMConfig, ProviderConfig

# Configure LLM
llm_config = LLMConfig(
    primary=ProviderConfig(
        provider="anthropic",
        api_key="${ANTHROPIC_API_KEY}",
        model="claude-3-5-sonnet-20241022"
    )
)

# Create session
session = LLMSession(llm_config)

# Run agent loop
result = run_agent_session(
    task_name="file_analysis",
    user_input="Find and summarize all Python files in the current directory",
    session=session,
    max_turns=10
)
print(f"Final output: {result}")

5. Custom Tool Integration

python
from photoagents.core.tool_dispatcher import register_tool
from typing import Dict, Any

@register_tool
def custom_analysis_tool(data: str, options: Dict[str, Any]) -> str:
    """
    Custom tool for specialized analysis.
    
    Args:
        data: Input data to analyze
        options: Configuration options
        
    Returns:
        Analysis results
    """
    # Your custom logic here
    result = f"Analyzed: {data} with options {options}"
    return result

# Tool is now available to the agent

GUI Client Options

Streamlit Web App + WebView

bash
# Launch web interface with native window
pythonw -m photoagents.cli.launcher

Service Hub (Start/Stop Services)

bash
# Launch control hub
pythonw -m photoagents.cli.hub

Desktop PyQt Application

bash
python -m photoagents.clients.desktop_app

Desktop Companion

bash
pythonw -m photoagents.clients.companion_v2

Chat Platform Bots

bash
# Telegram
python -m photoagents.clients.telegram_client

# Feishu (Lark)
python -m photoagents.clients.feishu_client

# WeCom
python -m photoagents.clients.wecom_client

# DingTalk
python -m photoagents.clients.dingtalk_client

# QQ
python -m photoagents.clients.qq_client

Layered Memory System

Photo Agents uses a 4-layer memory architecture:

L1: Working Memory

Short-term context for the current task (conversation turns, immediate observations).

L2: Global Memory

Long-term facts stored in ~/.photoagents/global_mem.txt.
python
from photoagents.core.memory import add_global_fact, search_global_memory

# Add a fact
add_global_fact("Project uses Python 3.11 and requires PostgreSQL 14+")

# Search memory
results = search_global_memory("database requirements")
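As a rough mental model of a text-file fact store like the one above, the sketch below appends one fact per line and ranks matches by word overlap. It is not the Photo Agents implementation: `add_fact` and `search_facts` are hypothetical names, and the real search may work quite differently.

```python
from pathlib import Path
from typing import List


def add_fact(path: Path, fact: str) -> None:
    """Append one fact per line to the memory file, creating it if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as f:
        # Collapse newlines so each fact occupies exactly one line.
        f.write(fact.replace("\n", " ").strip() + "\n")


def search_facts(path: Path, query: str) -> List[str]:
    """Return stored facts ranked by how many query words they contain."""
    if not path.exists():
        return []
    words = set(query.lower().split())
    scored = []
    for line in path.read_text(encoding="utf-8").splitlines():
        hits = len(words & set(line.lower().split()))
        if hits:
            scored.append((hits, line))
    # list.sort is stable, so facts with equal scores keep insertion order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [line for _, line in scored]
```

Plain word overlap is a deliberate simplification: it would not match the query "database requirements" against the PostgreSQL fact shown earlier, since the two share no words.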

L3: Skills & SOPs

Standard Operating Procedures the agent writes from successful executions.
python
from photoagents.skills.skill_manager import save_skill, load_skill

# Save a new skill
save_skill(
    name="web_scraping_pattern",
    code="""
def scrape_structured_data(url: str) -> dict:
    # Implementation
    pass
""",
    description="Reliable pattern for scraping structured web data"
)

# Load and use
skill = load_skill("web_scraping_pattern")

L4: Session Archive

Full raw session logs in ~/.photoagents/sessions/.

Browser Automation with CDP

Photo Agents includes Chrome DevTools Protocol integration for browser control:
python
from photoagents.web.cdp_bridge import CDPBridge

async def automate_browser():
    async with CDPBridge() as browser:
        # Navigate
        await browser.navigate("https://example.com")
        
        # Take screenshot
        screenshot = await browser.screenshot()
        
        # Execute JavaScript
        result = await browser.evaluate("document.title")
        
        # Click element
        await browser.click("button.submit")
        
        # Fill form
        await browser.type("input[name='query']", "search term")
        
    return result

Vision-Grounded Operations

Screenshot Analysis

python
from photoagents.skills.vision import analyze_screenshot

# Agent automatically captures and analyzes screen
analysis = analyze_screenshot(
    region=(0, 0, 1920, 1080),  # x, y, width, height
    question="What UI elements are visible?"
)

OCR Text Extraction

python
from photoagents.skills.ocr import extract_text_from_region

# Extract text from screen region
text = extract_text_from_region(
    x=100,
    y=200,
    width=500,
    height=300
)

Sandboxed Code Execution

python
from photoagents.core.sandbox import execute_code

# Python execution
result = execute_code(
    code="""
import json
data = {"status": "success"}
print(json.dumps(data))
""",
    language="python",
    timeout=30
)

# PowerShell (Windows)
ps_result = execute_code(
    code="Get-Process | Select-Object -First 5",
    language="powershell"
)

# Bash (Linux/Mac)
bash_result = execute_code(
    code="ls -la | head -n 10",
    language="bash"
)
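To illustrate what a `timeout` parameter on code execution typically involves, here is a standard-library sketch that runs Python code in a child interpreter with a time limit. It is not how `execute_code` is implemented (the source does not say), and `run_python` and `RunResult` are hypothetical names.

```python
import subprocess
import sys
from typing import NamedTuple


class RunResult(NamedTuple):
    ok: bool
    stdout: str
    stderr: str


def run_python(code: str, timeout: float = 30.0) -> RunResult:
    """Run `code` in a separate interpreter and capture its output.

    This provides process isolation and a time limit only. It is NOT a
    security sandbox: the child can still touch files and the network.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return RunResult(False, "", f"timed out after {timeout} seconds")
    return RunResult(proc.returncode == 0, proc.stdout, proc.stderr)
```

Returning a result object instead of raising lets a caller such as an agent loop treat failures and timeouts as ordinary observations.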

File I/O Operations

python
from photoagents.core.file_ops import read_file, write_file, list_directory

# Read file
content = read_file("~/project/config.json")

# Write file
write_file("~/output/report.txt", "Analysis complete\n")

# List directory with filters
files = list_directory(
    path="~/project",
    pattern="*.py",
    recursive=True
)

Observability with Langfuse

python
from photoagents.integrations.langfuse_tracer import init_langfuse, trace_agent_step

# Initialize
tracer = init_langfuse(
    public_key="${LANGFUSE_PUBLIC_KEY}",
    secret_key="${LANGFUSE_SECRET_KEY}",
    host="https://cloud.langfuse.com"
)

# Trace agent steps
with trace_agent_step("file_analysis", metadata={"task": "analyze_logs"}):
    # Agent operations here
    pass

Configuration Files

On-Disk State Locations

| Path | Purpose |
| --- | --- |
| ~/.photoagents/config.json | API key + license validation cache |
| ~/.photoagents/global_mem.txt | L2 long-term facts |
| ~/.photoagents/sessions/ | L4 raw session archives |
| ~/.photoagents/skill_index/ | Vector index for skill/SOP search |
| ~/.photoagents/temp/ | Per-task scratch (logs, intermediate output) |

Custom System Prompt

Override the default system prompt:
python
from photoagents.core.loop import run_agent_session

custom_prompt = """
You are a specialized data analysis agent.
Focus on: statistical analysis, visualization, and reporting.
Always verify data integrity before processing.
"""

result = run_agent_session(
    task_name="analysis",
    user_input="Analyze sales data",
    system_prompt_override=custom_prompt
)

Common Patterns

Pattern 1: Autonomous Research Agent

python
from photoagents.core.loop import run_agent_session
from photoagents.llm.router import LLMSession

def create_research_agent(topic: str):
    session = LLMSession.from_env()
    
    result = run_agent_session(
        task_name=f"research_{topic}",
        user_input=f"""
        Research {topic} and create a comprehensive report:
        1. Search for recent information
        2. Analyze credibility of sources
        3. Synthesize findings
        4. Save report with citations
        """,
        session=session,
        max_turns=50
    )
    
    return result

# Use it
report = create_research_agent("quantum computing advances 2026")

Pattern 2: Self-Evolving Monitor

python
# monitor.py
import os
import time

from photoagents.evolution.scheduler import schedule_check


def check() -> bool:
    """
    Watchdog function that triggers agent tasks.
    Return True to execute a task.
    """
    # Check if it's time to run daily backup.
    # expanduser() is needed because os.path.getmtime does not expand "~".
    last_run = os.path.getmtime(os.path.expanduser("~/.photoagents/last_backup"))
    if time.time() - last_run > 86400:  # 24 hours
        return True

    return False


def get_task() -> str:
    """Return the task to execute when check() returns True."""
    return "Backup all project files to ~/backups/ and verify integrity"

# Run with:
# python -m photoagents --reflect monitor.py
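The `check()` / `get_task()` contract above suggests a polling loop on the scheduler side. The sketch below shows one way such a loop could look; it is an assumption about the general pattern, not the actual behaviour of `--reflect`, and `poll_watchdog` with its parameters is hypothetical.

```python
import time
from typing import Callable, Optional


def poll_watchdog(
    check: Callable[[], bool],
    get_task: Callable[[], str],
    run_task: Callable[[str], None],
    interval_seconds: float = 60.0,
    max_polls: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll check() and run get_task() whenever it returns True.

    Returns how many tasks were started. max_polls=None polls forever;
    `sleep` is injectable so the loop can be tested without waiting.
    """
    started = 0
    polls = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        if check():
            run_task(get_task())
            started += 1
        sleep(interval_seconds)
    return started
```

With this shape, a watchdog module only has to answer two questions: whether to run now, and what to run.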

Pattern 3: Multi-Step Workflow

python
from photoagents.core.loop import run_agent_session
from photoagents.core.memory import add_global_fact

def execute_workflow(project_path: str):
    # Step 1: Analyze codebase
    analysis = run_agent_session(
        task_name="code_analysis",
        user_input=f"Analyze Python code structure in {project_path}"
    )
    
    # Save insight to global memory
    add_global_fact(f"Project at {project_path}: {analysis}")
    
    # Step 2: Generate documentation
    docs = run_agent_session(
        task_name="generate_docs",
        user_input=f"Create API documentation for {project_path}"
    )
    
    # Step 3: Run tests
    tests = run_agent_session(
        task_name="run_tests",
        user_input=f"Execute test suite and report coverage"
    )
    
    return {
        "analysis": analysis,
        "documentation": docs,
        "tests": tests
    }

Troubleshooting

API Key Issues

Problem: PhotoAgentsAuthError: Invalid or missing API key
Solution:
bash
# Verify key is set
echo $PHOTOAGENTS_API_KEY

# Or check config file
cat ~/.photoagents/config.json

# Clear cache if key was recently updated
# (this also deletes any key saved in this file)
rm ~/.photoagents/config.json

LLM Provider Errors

Problem: LLM provider authentication failed
Solution:
bash
# Verify environment variables are set
echo $ANTHROPIC_API_KEY
echo $OPENAI_API_KEY

# Test credentials.py is in correct location
ls credentials.py

# Check credentials.py syntax
python -c "from credentials import llm_config; print(llm_config)"

Memory Issues

Problem: Agent can't recall previous facts
Solution:
python
# Check global memory file exists
import os
print(os.path.exists(os.path.expanduser("~/.photoagents/global_mem.txt")))

# Manually verify content
with open(os.path.expanduser("~/.photoagents/global_mem.txt")) as f:
    print(f.read())

# Rebuild skill index if corrupted
from photoagents.skills.skill_manager import rebuild_index
rebuild_index()

Browser Automation Fails

Problem: CDP bridge cannot connect to Chrome
Solution:
bash
# Ensure Chrome is installed and accessible
which google-chrome
which chrome

# Launch Chrome with remote debugging manually
google-chrome --remote-debugging-port=9222

# Check port availability
lsof -i :9222

Session Archive Growth

Problem: ~/.photoagents/sessions/ consuming too much disk
Solution:
bash
# Clean old sessions (older than 30 days)
find ~/.photoagents/sessions/ -type f -mtime +30 -delete

# Or configure auto-cleanup
python -c "
from photoagents.core.cleanup import configure_auto_cleanup
configure_auto_cleanup(max_age_days=30, max_size_mb=1000)
"
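For environments without `find`, the age-based cleanup can be approximated in Python. This is a sketch using only the standard library; `delete_old_sessions` is a hypothetical helper and is unrelated to `configure_auto_cleanup`, whose behaviour the source does not describe.

```python
import time
from pathlib import Path
from typing import List, Optional


def delete_old_sessions(
    root: Path, max_age_days: float = 30, now: Optional[float] = None
) -> List[Path]:
    """Delete regular files under `root` last modified over max_age_days ago.

    Returns the deleted paths. Directories and symlinks are left in place.
    """
    current = time.time() if now is None else now
    cutoff = current - max_age_days * 86400
    deleted = []
    # sorted() materialises the listing before anything is removed.
    for path in sorted(root.rglob("*")):
        if path.is_symlink() or not path.is_file():
            continue
        if path.stat().st_mtime < cutoff:
            path.unlink()
            deleted.append(path)
    return deleted
```

The cutoff here is exact, whereas `find -mtime +30` rounds file age down to whole days and therefore only matches files at least 31 days old.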

Permission Errors

Problem: Cannot write to ~/.photoagents/
Solution:
bash
# Fix ownership
sudo chown -R $USER:$USER ~/.photoagents/

# Fix permissions
chmod -R 755 ~/.photoagents/

Advanced Configuration

Custom Tool Schema

python
from photoagents.resources.tool_schema import register_custom_tool

schema = {
    "name": "analyze_metrics",
    "description": "Analyze system metrics and generate report",
    "parameters": {
        "type": "object",
        "properties": {
            "metric_type": {
                "type": "string",
                "enum": ["cpu", "memory", "disk", "network"]
            },
            "duration_hours": {
                "type": "integer",
                "minimum": 1,
                "maximum": 168
            }
        },
        "required": ["metric_type"]
    }
}

register_custom_tool(schema, implementation_function)
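To show what a schema like the one above constrains, here is a small validator covering only the keywords it uses (`required`, `type`, `enum`, `minimum`, `maximum`). It is an illustrative sketch, not a full JSON Schema implementation, and `validate_args` is a hypothetical helper; whether Photo Agents validates arguments this way is not stated in the source.

```python
from typing import Any, Dict, List

# Only the JSON Schema types that appear in the example schema.
_TYPES = {"string": str, "integer": int, "object": dict}


def validate_args(parameters: Dict[str, Any], args: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means args are acceptable."""
    errors = []
    for name in parameters.get("required", []):
        if name not in args:
            errors.append(f"missing required parameter: {name}")
    properties = parameters.get("properties", {})
    for name, value in args.items():
        spec = properties.get(name)
        if spec is None:
            continue  # undeclared properties are allowed, as in JSON Schema
        expected = _TYPES.get(spec.get("type"))
        # bool is a subclass of int in Python, so reject it explicitly.
        if expected and (isinstance(value, bool) or not isinstance(value, expected)):
            errors.append(f"{name}: expected {spec['type']}")
            continue
        if "enum" in spec and value not in spec["enum"]:
            errors.append(f"{name}: must be one of {spec['enum']}")
        if "minimum" in spec and value < spec["minimum"]:
            errors.append(f"{name}: below minimum {spec['minimum']}")
        if "maximum" in spec and value > spec["maximum"]:
            errors.append(f"{name}: above maximum {spec['maximum']}")
    return errors
```

For the schema shown earlier, passing `schema["parameters"]` with `{"metric_type": "cpu", "duration_hours": 24}` yields an empty list, while a `duration_hours` above 168 is reported as exceeding the maximum.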

Environment Variables Reference

bash
# Required
export PHOTOAGENTS_API_KEY=pk_live_xxx

# LLM Providers (choose one or both for fallback)
export ANTHROPIC_API_KEY=sk-ant-xxx
export OPENAI_API_KEY=sk-xxx

# Optional integrations
export LANGFUSE_PUBLIC_KEY=pk-lf-xxx
export LANGFUSE_SECRET_KEY=sk-lf-xxx
export LANGFUSE_HOST=https://cloud.langfuse.com

# Chat platform bots (if using)
export TELEGRAM_BOT_TOKEN=xxx
export FEISHU_APP_ID=xxx
export FEISHU_APP_SECRET=xxx