llm-monitoring-dashboard


LLM Usage Monitoring Dashboard

Tracks LLM API costs, tokens, and latency using Tokuin CLI, and auto-generates a data-driven admin dashboard with PM insights.

When to use this skill

  • LLM cost visibility: When you want to monitor API usage costs per team or individual in real time
  • PM reporting dashboard: When you need weekly reports on who uses AI, how much, and how
  • User adoption management: When you want to track inactive users and increase AI adoption rates
  • Model optimization evidence: When you need data-driven decisions for model switching or cost reduction
  • Add monitoring tab to admin dashboard: When adding an LLM monitoring section to an existing Admin page

Prerequisites

1. Verify Tokuin CLI installation

```bash
# Check if installed
which tokuin && tokuin --version || echo "Not installed — run Step 1 first"
```

2. Environment variables (only needed for live API calls)

```bash
# Store in .env file (never hardcode directly in source)
OPENAI_API_KEY=sk-...           # OpenAI
ANTHROPIC_API_KEY=sk-ant-...    # Anthropic
OPENROUTER_API_KEY=sk-or-...    # OpenRouter (400+ models)

# LLM monitoring settings
LLM_USER_ID=dev-alice           # User identifier
LLM_USER_ALIAS=Alice            # Display name
COST_THRESHOLD_USD=10.00        # Cost threshold (alert when exceeded)
DASHBOARD_PORT=3000             # Dashboard port
MAX_COST_USD=5.00               # Max cost per single run
SLACK_WEBHOOK_URL=https://...   # For alerts (optional)
```
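The block above is plain key=value pairs; whichever runtime consumes them should centralize the parsing. A minimal Python sketch with illustrative defaults (`load_monitoring_settings` and the `anonymous` fallback are assumptions of this example; the Step 2 collection script defaults `LLM_USER_ID` to `whoami` instead):

```python
import os

def load_monitoring_settings(env=os.environ):
    """Read LLM monitoring settings, mirroring the defaults in the .env block."""
    user_id = env.get("LLM_USER_ID", "anonymous")
    return {
        "user_id": user_id,
        "user_alias": env.get("LLM_USER_ALIAS", user_id),
        "cost_threshold_usd": float(env.get("COST_THRESHOLD_USD", "10.00")),
        "dashboard_port": int(env.get("DASHBOARD_PORT", "3000")),
        "max_cost_usd": float(env.get("MAX_COST_USD", "5.00")),
        "slack_webhook_url": env.get("SLACK_WEBHOOK_URL", ""),  # optional
    }

# Example: override two values, fall back to defaults for the rest
settings = load_monitoring_settings({"LLM_USER_ID": "dev-alice", "DASHBOARD_PORT": "3001"})
```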

3. Project stack requirements

Option A (recommended): Next.js 15+ + React 18 + TypeScript
Option B (lightweight): Python 3.8+ + HTML/JavaScript (minimal dependencies)

Instructions

Step 0: Safety check (always run this first)

⚠️ Run this script before executing the skill. Any FAIL items will halt execution.

```bash
cat > safety-guard.sh << 'SAFETY_EOF'
#!/usr/bin/env bash
# safety-guard.sh — Safety gate before running the LLM monitoring dashboard
set -euo pipefail

RED='\033[0;31m'; YELLOW='\033[1;33m'; GREEN='\033[0;32m'; NC='\033[0m'
ALLOW_LIVE="${1:-}"; PASS=0; WARN=0; FAIL=0

# PASS=$((PASS + 1)) rather than ((PASS++)): the latter returns a nonzero
# status when the counter is 0 and would abort the script under `set -e`.
log_pass() { echo -e "${GREEN}✅ PASS${NC} $1"; PASS=$((PASS + 1)); }
log_warn() { echo -e "${YELLOW}⚠️ WARN${NC} $1"; WARN=$((WARN + 1)); }
log_fail() { echo -e "${RED}❌ FAIL${NC} $1"; FAIL=$((FAIL + 1)); }

echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "🛡 LLM Monitoring Dashboard — Safety Guard v1.0"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"

# ── 1. Check Tokuin CLI installation ────────────────────────────────
if command -v tokuin &>/dev/null; then
  log_pass "Tokuin CLI installed: $(tokuin --version 2>&1 | head -1)"
else
  log_fail "Tokuin not installed → install with the command below and re-run:"
  echo "  curl -fsSL https://raw.githubusercontent.com/nooscraft/tokuin/main/install.sh | bash"
fi

# ── 2. Detect hardcoded API keys ────────────────────────────────
HARDCODED=$(grep -rE "(sk-[a-zA-Z0-9]{20,}|sk-ant-[a-zA-Z0-9]{20,}|sk-or-[a-zA-Z0-9]{20,})" \
  . --include="*.ts" --include="*.tsx" --include="*.js" --include="*.jsx" \
  --include="*.html" --include="*.sh" --include="*.py" --include="*.json" \
  --exclude-dir=node_modules --exclude-dir=.git 2>/dev/null \
  | grep -v ".env" | grep -v "example" | wc -l || echo 0)
if [ "$HARDCODED" -eq 0 ]; then
  log_pass "No hardcoded API keys found"
else
  log_fail "${HARDCODED} hardcoded API key(s) detected! → Move to environment variables (.env) immediately"
  grep -rE "(sk-[a-zA-Z0-9]{20,})" . \
    --include="*.ts" --include="*.js" --include="*.html" \
    --exclude-dir=node_modules 2>/dev/null | head -5 || true
fi

# ── 3. Check .env is in .gitignore ────────────────────────────
if [ -f .env ]; then
  if [ -f .gitignore ] && grep -q ".env" .gitignore; then
    log_pass ".env is listed in .gitignore"
  else
    log_fail ".env exists but is not in .gitignore! → echo '.env' >> .gitignore"
  fi
else
  log_warn ".env file not found — create one before making live API calls"
fi

# ── 4. Check live API call mode ────────────────────────────
if [ "$ALLOW_LIVE" = "--allow-live" ]; then
  log_warn "Live API call mode enabled! Costs will be incurred."
  log_warn "Max cost threshold: \$${MAX_COST_USD:-5.00} (adjust via MAX_COST_USD env var)"
  read -p "  Allow live API calls? [y/N] " -r
  echo
  [[ $REPLY =~ ^[Yy]$ ]] || { echo "Cancelled. Re-run in dry-run mode."; exit 1; }
else
  log_pass "dry-run mode (default) — no API costs incurred"
fi

# ── 5. Check port conflicts ─────────────────────────────────────
PORT="${DASHBOARD_PORT:-3000}"
if lsof -i ":${PORT}" &>/dev/null; then
  ALT_PORT=$((PORT + 1))
  log_warn "Port ${PORT} is in use → use ${ALT_PORT} instead: export DASHBOARD_PORT=${ALT_PORT}"
else
  log_pass "Port ${PORT} is available"
fi

# ── 6. Initialize data/ directory ──────────────────────────────
mkdir -p ./data
if [ -f ./data/metrics.jsonl ]; then
  BYTES=$(wc -c < ./data/metrics.jsonl || echo 0)
  if [ "$BYTES" -gt 10485760 ]; then
    log_warn "metrics.jsonl exceeds 10MB (${BYTES}B) → consider applying a rolling policy"
    echo "  cp data/metrics.jsonl data/metrics-$(date +%Y%m%d).jsonl.bak && > data/metrics.jsonl"
  else
    log_pass "data/ ready (metrics.jsonl: ${BYTES}B)"
  fi
else
  log_pass "data/ ready (new)"
fi

# ── Summary ─────────────────────────────────────────────
echo ""
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo -e "Result: ${GREEN}PASS $PASS${NC} / ${YELLOW}WARN $WARN${NC} / ${RED}FAIL $FAIL${NC}"
if [ "$FAIL" -gt 0 ]; then
  echo -e "${RED}❌ Safety check failed. Resolve the FAIL items above and re-run.${NC}"
  exit 1
else
  echo -e "${GREEN}✅ Safety check passed. Continuing skill execution.${NC}"
  exit 0
fi
SAFETY_EOF
chmod +x safety-guard.sh
```

Run (halts immediately if any FAIL):

```bash
bash safety-guard.sh
```

---

Step 1: Install Tokuin CLI and verify with dry-run

1-1. Install (macOS / Linux)

```bash
curl -fsSL https://raw.githubusercontent.com/nooscraft/tokuin/main/install.sh | bash
```

Windows PowerShell:

1-2. Verify installation

```bash
tokuin --version
which tokuin   # expected: /usr/local/bin/tokuin or ~/.local/bin/tokuin
```

1-3. Basic token count test

```bash
echo "Hello, world!" | tokuin --model gpt-4
```

1-4. dry-run cost estimate (no API key needed ✅)

```bash
echo "Analyze user behavior patterns from the following data" | \
  tokuin load-test \
    --model gpt-4 \
    --runs 50 \
    --concurrency 5 \
    --dry-run \
    --estimate-cost \
    --output-format json | python3 -m json.tool
```

Expected output structure:

```json
{
  "total_requests": 50,
  "successful": 50,
  "failed": 0,
  "latency_ms": { "average": ..., "p50": ..., "p95": ... },
  "cost": { "input_tokens": ..., "output_tokens": ..., "total_cost": ... }
}
```
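Since the estimate arrives before any live call, it can gate the run against a budget. A Python sketch assuming only the field names shown above (`check_estimate` and the sample numbers are illustrative, not real Tokuin output):

```python
MAX_COST_USD = 5.00  # mirrors the MAX_COST_USD setting from Prerequisites

def check_estimate(report: dict, max_cost: float = MAX_COST_USD) -> dict:
    """Summarize a Tokuin dry-run report and flag estimates over budget."""
    total = float(report.get("cost", {}).get("total_cost", 0.0))
    requests = max(report.get("total_requests", 1), 1)
    return {
        "total_cost": total,
        "cost_per_request": total / requests,
        "over_budget": total > max_cost,
    }

# Illustrative numbers in the shape of the structure above
report = {
    "total_requests": 50, "successful": 50, "failed": 0,
    "cost": {"input_tokens": 550, "output_tokens": 2500, "total_cost": 0.165},
}
summary = check_estimate(report)  # cost_per_request ≈ 0.0033, over_budget False
```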

1-5. Multi-model comparison (dry-run)

```bash
echo "Translate this to Korean" | tokuin --compare gpt-4 gpt-3.5-turbo claude-3-haiku --price
```

1-6. Verify Prometheus format output

```bash
echo "Benchmark" | tokuin load-test --model gpt-4 --runs 10 --dry-run --output-format prometheus
```

Expected: "# HELP", "# TYPE", metrics with "tokuin_" prefix
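If you want to consume that exposition text in a script rather than scrape it with a Prometheus server, parsing it is straightforward. A minimal sketch; the `tokuin_` sample names below are invented for illustration, and labels and timestamps are not handled:

```python
def parse_prometheus(text: str) -> dict:
    """Parse Prometheus text exposition into {metric_name: value}.

    Skips '# HELP' / '# TYPE' comment lines; assumes bare 'name value' samples
    (no labels, no timestamps), which covers simple gauges and counters.
    """
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.rpartition(" ")
        try:
            metrics[name] = float(value)
        except ValueError:
            pass  # ignore malformed sample lines
    return metrics

sample = """\
# HELP tokuin_requests_total Total requests issued
# TYPE tokuin_requests_total counter
tokuin_requests_total 10
tokuin_cost_usd 0.033
"""
parsed = parse_prometheus(sample)  # {'tokuin_requests_total': 10.0, 'tokuin_cost_usd': 0.033}
```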


---


Step 2: Data collection pipeline with user context

2-1. Create prompt auto-categorization module

```bash
cat > categorize_prompt.py << 'PYEOF'
#!/usr/bin/env python3
"""Auto-categorize prompts based on keywords"""
import hashlib

CATEGORIES = {
    "coding": ["code", "function", "class", "implement", "debug", "fix", "refactor"],
    "analysis": ["analyze", "compare", "evaluate", "assess"],
    "translation": ["translate", "translation"],
    "summary": ["summarize", "summary", "tldr", "brief"],
    "writing": ["write", "draft", "create", "generate"],
    "question": ["what is", "how to", "explain", "why"],
    "data": ["data", "table", "csv", "json", "sql"],
}

def categorize(prompt: str) -> str:
    p = prompt.lower()
    for cat, keywords in CATEGORIES.items():
        if any(k in p for k in keywords):
            return cat
    return "other"

def hash_prompt(prompt: str) -> str:
    """First 16 chars of SHA-256 (stored instead of raw text — privacy protection)"""
    return hashlib.sha256(prompt.encode()).hexdigest()[:16]

def truncate_preview(prompt: str, limit: int = 100) -> str:
    return prompt[:limit] + ("…" if len(prompt) > limit else "")

if __name__ == "__main__":
    import sys
    prompt = sys.argv[1] if len(sys.argv) > 1 else ""
    print(categorize(prompt))
PYEOF
```

2-2. Create metrics collection script with user context

```bash
cat > collect-metrics.sh << 'COLLECT_EOF'
#!/usr/bin/env bash
# collect-metrics.sh — Run Tokuin and save with user context (dry-run by default)
set -euo pipefail

# User info
USER_ID="${LLM_USER_ID:-$(whoami)}"
USER_ALIAS="${LLM_USER_ALIAS:-$USER_ID}"
SESSION_ID="${LLM_SESSION_ID:-$(date +%Y%m%d-%H%M%S)-$$}"
PROMPT="${1:-Benchmark prompt}"
MODEL="${MODEL:-gpt-4}"
PROVIDER="${PROVIDER:-openai}"
RUNS="${RUNS:-50}"
CONCURRENCY="${CONCURRENCY:-5}"
TAGS="${LLM_TAGS:-[]}"

TIMESTAMP=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
CATEGORY=$(python3 categorize_prompt.py "$PROMPT" 2>/dev/null || echo "other")
PROMPT_HASH=$(echo -n "$PROMPT" | sha256sum 2>/dev/null | cut -c1-16 || echo "unknown")
PROMPT_LEN=${#PROMPT}

# Run Tokuin — dry-run by default; set ALLOW_LIVE=1 to make real (billed) API calls
MODE_FLAGS="--dry-run --estimate-cost"
IS_DRY_RUN="True"
if [ -n "${ALLOW_LIVE:-}" ]; then
  MODE_FLAGS=""
  IS_DRY_RUN="False"
fi

RESULT=$(echo "$PROMPT" | tokuin load-test \
  --model "$MODEL" \
  --provider "$PROVIDER" \
  --runs "$RUNS" \
  --concurrency "$CONCURRENCY" \
  --output-format json \
  $MODE_FLAGS 2>/dev/null)

# Save to JSONL with user context. The heredoc is unquoted so shell variables
# are interpolated into the Python source; \$ keeps a literal dollar sign.
python3 - << PYEOF
import json

result = json.loads('''${RESULT}''')
latency = result.get("latency_ms", {})
cost = result.get("cost", {})

record = {
    "id": "${PROMPT_HASH}-${SESSION_ID}",
    "timestamp": "${TIMESTAMP}",
    "model": "${MODEL}",
    "provider": "${PROVIDER}",
    "user_id": "${USER_ID}",
    "user_alias": "${USER_ALIAS}",
    "session_id": "${SESSION_ID}",
    "prompt_hash": "${PROMPT_HASH}",
    "prompt_category": "${CATEGORY}",
    "prompt_length": ${PROMPT_LEN},
    "tags": json.loads('${TAGS}'),
    "is_dry_run": ${IS_DRY_RUN},
    "total_requests": result.get("total_requests", 0),
    "successful": result.get("successful", 0),
    "failed": result.get("failed", 0),
    "input_tokens": cost.get("input_tokens", 0),
    "output_tokens": cost.get("output_tokens", 0),
    "cost_usd": cost.get("total_cost", 0),
    "latency_avg_ms": latency.get("average", 0),
    "latency_p50_ms": latency.get("p50", 0),
    "latency_p95_ms": latency.get("p95", 0),
    "status_code": 200 if result.get("successful", 0) > 0 else 500,
}

with open("./data/metrics.jsonl", "a") as f:
    f.write(json.dumps(record, ensure_ascii=False) + "\n")

print(f"✅ Saved: [{record['user_alias']}] {record['prompt_category']} | \${record['cost_usd']:.4f} | {record['latency_avg_ms']:.0f}ms")
PYEOF
COLLECT_EOF
chmod +x collect-metrics.sh
```

2-3. Set up cron (auto-collect every 5 minutes)

```bash
(crontab -l 2>/dev/null; echo "*/5 * * * * cd $(pwd) && bash collect-metrics.sh 'Scheduled benchmark' >> ./data/collect.log 2>&1") | crontab -
echo "✅ Cron registered (every 5 minutes)"
```

2-4. First collection test (dry-run)

```bash
bash collect-metrics.sh "Analyze user behavior patterns"
# --json-lines (Python 3.8+) pretty-prints each JSONL record separately
python3 -m json.tool --json-lines ./data/metrics.jsonl | head -30
```
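Each line in data/metrics.jsonl is a self-contained record, so ad-hoc rollups don't need the Step 3 database. A sketch of the per-user aggregation (the same totals the `user_stats` SQL view computes later); the sample records are fabricated for illustration:

```python
import json
from collections import defaultdict

def user_rollup(jsonl_lines):
    """Aggregate JSONL metric records into per-user run/token/cost totals."""
    stats = defaultdict(lambda: {"runs": 0, "tokens": 0, "cost_usd": 0.0})
    for line in jsonl_lines:
        r = json.loads(line)
        s = stats[r.get("user_alias", "anonymous")]
        s["runs"] += 1
        s["tokens"] += r.get("input_tokens", 0) + r.get("output_tokens", 0)
        s["cost_usd"] += r.get("cost_usd", 0.0)
    return dict(stats)

# Fabricated sample records in the collector's schema
sample = [
    '{"user_alias": "Alice", "input_tokens": 500, "output_tokens": 250, "cost_usd": 0.05}',
    '{"user_alias": "Alice", "input_tokens": 100, "output_tokens": 50, "cost_usd": 0.01}',
    '{"user_alias": "Bob", "input_tokens": 200, "output_tokens": 100, "cost_usd": 0.02}',
]
rollup = user_rollup(sample)  # Alice: 2 runs, 900 tokens; Bob: 1 run, 300 tokens
```

In practice you would pass `open("./data/metrics.jsonl")` instead of the sample list.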

---

Step 3: Routing structure and dashboard frame

**Option A — Next.js (recommended)**

3-1. Initialize Next.js project (skip this if adding to an existing project)

```bash
npx create-next-app@latest llm-dashboard \
  --typescript \
  --tailwind \
  --app \
  --no-src-dir
cd llm-dashboard
```

3-2. Install dependencies

```bash
npm install recharts better-sqlite3 @types/better-sqlite3
```

3-3. Set design tokens (consistent tone and style)

```bash
cat > app/globals.css << 'CSS_EOF'
:root {
  /* Background layers */
  --bg-base: #0f1117;
  --bg-surface: #1a1d27;
  --bg-elevated: #21253a;
  --border: rgba(255, 255, 255, 0.06);

  /* Text layers */
  --text-primary: #f1f5f9;
  --text-secondary: #94a3b8;
  --text-muted: #475569;

  /* 3-level traffic light system (use consistently across all components) */
  --color-ok: #22c55e;      /* Normal — Green 500 */
  --color-warn: #f59e0b;    /* Warning — Amber 500 */
  --color-danger: #ef4444;  /* Danger — Red 500 */
  --color-neutral: #60a5fa; /* Neutral — Blue 400 */

  /* Data series colors (colorblind-friendly palette) */
  --series-1: #818cf8; /* Indigo — System/GPT-4 */
  --series-2: #38bdf8; /* Sky — User/Claude */
  --series-3: #34d399; /* Emerald — Assistant/Gemini */
  --series-4: #fb923c; /* Orange — 4th series */

  /* Cost-specific */
  --cost-input: #a78bfa;
  --cost-output: #f472b6;

  /* Ranking colors */
  --rank-gold: #fbbf24;
  --rank-silver: #94a3b8;
  --rank-bronze: #b45309;
  --rank-inactive: #374151;

  /* Typography */
  --font-mono: 'JetBrains Mono', 'Fira Code', monospace;
  --font-ui: 'Geist', 'Plus Jakarta Sans', system-ui, sans-serif;
}

body {
  background: var(--bg-base);
  color: var(--text-primary);
  font-family: var(--font-ui);
}

/* Numbers: alignment stability */
.metric-value {
  font-family: var(--font-mono);
  font-variant-numeric: tabular-nums;
  font-feature-settings: 'tnum';
}

/* KPI card accent-bar */
.status-ok { border-left-color: var(--color-ok); }
.status-warn { border-left-color: var(--color-warn); }
.status-danger { border-left-color: var(--color-danger); }
CSS_EOF
```

3-4. Create routing structure

```bash
mkdir -p app/admin/llm-monitoring
mkdir -p app/admin/llm-monitoring/users
mkdir -p "app/admin/llm-monitoring/users/[userId]"
mkdir -p "app/admin/llm-monitoring/runs/[runId]"
mkdir -p components/llm-monitoring
mkdir -p lib/llm-monitoring
```

3-5. Initialize SQLite DB

```bash
cat > lib/llm-monitoring/db.ts << 'TS_EOF'
import Database from 'better-sqlite3'
import path from 'path'

const DB_PATH = path.join(process.cwd(), 'data', 'monitoring.db')
const db = new Database(DB_PATH)

db.exec(`
  CREATE TABLE IF NOT EXISTS runs (
    id TEXT PRIMARY KEY,
    timestamp DATETIME NOT NULL DEFAULT (datetime('now')),
    model TEXT NOT NULL,
    provider TEXT NOT NULL,
    user_id TEXT DEFAULT 'anonymous',
    user_alias TEXT DEFAULT 'anonymous',
    session_id TEXT,
    prompt_hash TEXT,
    prompt_category TEXT DEFAULT 'other',
    prompt_length INTEGER DEFAULT 0,
    tags TEXT DEFAULT '[]',
    is_dry_run INTEGER DEFAULT 1,
    total_requests INTEGER DEFAULT 0,
    successful INTEGER DEFAULT 0,
    failed INTEGER DEFAULT 0,
    input_tokens INTEGER DEFAULT 0,
    output_tokens INTEGER DEFAULT 0,
    cost_usd REAL DEFAULT 0,
    latency_avg_ms REAL DEFAULT 0,
    latency_p50_ms REAL DEFAULT 0,
    latency_p95_ms REAL DEFAULT 0,
    status_code INTEGER DEFAULT 200
  );

  CREATE TABLE IF NOT EXISTS user_profiles (
    user_id TEXT PRIMARY KEY,
    user_alias TEXT NOT NULL,
    team TEXT DEFAULT '',
    role TEXT DEFAULT 'user',
    created_at DATETIME DEFAULT (datetime('now')),
    last_seen DATETIME,
    notes TEXT DEFAULT ''
  );

  CREATE INDEX IF NOT EXISTS idx_runs_timestamp ON runs(timestamp DESC);
  CREATE INDEX IF NOT EXISTS idx_runs_user_id ON runs(user_id);
  CREATE INDEX IF NOT EXISTS idx_runs_model ON runs(model);

  CREATE VIEW IF NOT EXISTS user_stats AS
  SELECT
    user_id,
    user_alias,
    COUNT(*) AS total_runs,
    SUM(input_tokens + output_tokens) AS total_tokens,
    ROUND(SUM(cost_usd), 4) AS total_cost,
    ROUND(AVG(latency_avg_ms), 1) AS avg_latency,
    ROUND(AVG(CAST(successful AS REAL) / NULLIF(total_requests, 0) * 100), 1) AS success_rate,
    COUNT(DISTINCT model) AS models_used,
    MAX(timestamp) AS last_seen
  FROM runs
  GROUP BY user_id;
`)

export default db
TS_EOF
```
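The Step 2 collector writes JSONL while this module reads SQLite, so the two need a bridge. A minimal Python sketch of an importer, assuming the `runs` schema above (`import_jsonl` is illustrative and inserts only a subset of columns; `INSERT OR IGNORE` on the primary key makes repeated imports idempotent):

```python
import json
import sqlite3

def import_jsonl(conn: sqlite3.Connection, jsonl_lines) -> int:
    """Insert JSONL metric records into the runs table; returns rows inserted.

    Records whose id already exists are skipped (INSERT OR IGNORE), so the
    same metrics.jsonl file can be re-imported after each cron run.
    """
    before = conn.total_changes
    for line in jsonl_lines:
        r = json.loads(line)
        conn.execute(
            """INSERT OR IGNORE INTO runs
               (id, timestamp, model, provider, user_id, user_alias,
                input_tokens, output_tokens, cost_usd)
               VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)""",
            (r["id"], r["timestamp"], r["model"], r["provider"],
             r.get("user_id", "anonymous"), r.get("user_alias", "anonymous"),
             r.get("input_tokens", 0), r.get("output_tokens", 0),
             r.get("cost_usd", 0.0)),
        )
    conn.commit()
    return conn.total_changes - before
```

Point it at `data/monitoring.db` and `open("./data/metrics.jsonl")` once the Step 2 cron has produced a few records.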

**Option B — Lightweight HTML (minimal dependencies)**

```bash
# Use this when there's no existing project or you need a quick prototype

mkdir -p llm-monitoring/data
cat > llm-monitoring/index.html << 'HTML_EOF'
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>🧮 LLM Usage Monitoring</title>
<script src="https://cdn.jsdelivr.net/npm/chart.js@4/dist/chart.umd.min.js"></script>
<link rel="preconnect" href="https://fonts.googleapis.com">
<link href="https://fonts.googleapis.com/css2?family=JetBrains+Mono:wght@400;600&display=swap" rel="stylesheet">
<style>
/* Design tokens */
:root {
  --bg-base: #0f1117; --bg-surface: #1a1d27; --bg-elevated: #21253a;
  --text-primary: #f1f5f9; --text-secondary: #94a3b8; --text-muted: #475569;
  --color-ok: #22c55e; --color-warn: #f59e0b; --color-danger: #ef4444;
  --series-1: #818cf8; --series-2: #38bdf8; --series-3: #34d399; --series-4: #fb923c;
  --rank-gold: #fbbf24; --rank-silver: #94a3b8; --rank-bronze: #b45309;
  --font-mono: 'JetBrains Mono', monospace;
}
* { box-sizing: border-box; margin: 0; padding: 0; }
body { background: var(--bg-base); color: var(--text-primary); font-family: system-ui, sans-serif; padding: 24px; }
header { display: flex; justify-content: space-between; align-items: center; margin-bottom: 32px; }
header h1 { font-size: 1.5rem; font-weight: 700; color: #60a5fa; }
.kpi-grid { display: grid; grid-template-columns: repeat(4, 1fr); gap: 16px; margin-bottom: 24px; }
@media (max-width: 768px) { .kpi-grid { grid-template-columns: repeat(2, 1fr); } }
@media (max-width: 480px) { .kpi-grid { grid-template-columns: 1fr; } }
.kpi-card { background: var(--bg-surface); border: 1px solid rgba(255,255,255,0.06); border-left: 3px solid var(--color-neutral, #60a5fa); border-radius: 12px; padding: 20px; }
.kpi-card.ok { border-left-color: var(--color-ok); }
.kpi-card.warn { border-left-color: var(--color-warn); }
.kpi-card.danger { border-left-color: var(--color-danger); }
.kpi-label { font-size: 0.625rem; text-transform: uppercase; letter-spacing: 0.1em; color: var(--text-muted); margin-bottom: 8px; }
.kpi-value { font-family: var(--font-mono); font-size: 2rem; font-weight: 700; font-variant-numeric: tabular-nums; }
.kpi-sub { font-size: 0.75rem; color: var(--text-secondary); margin-top: 4px; }
.chart-row { display: grid; grid-template-columns: 2fr 1fr; gap: 16px; margin-bottom: 24px; }
@media (max-width: 900px) { .chart-row { grid-template-columns: 1fr; } }
.chart-card { background: var(--bg-surface); border: 1px solid rgba(255,255,255,0.06); border-radius: 12px; padding: 20px; }
.chart-card h3 { font-size: 0.75rem; color: var(--text-secondary); margin-bottom: 16px; text-transform: uppercase; letter-spacing: 0.05em; }
.ranking-table { width: 100%; border-collapse: collapse; }
.ranking-table th { font-size: 0.625rem; text-transform: uppercase; color: var(--text-muted); padding: 8px 12px; text-align: left; border-bottom: 1px solid rgba(255,255,255,0.06); }
.ranking-table td { padding: 12px; border-bottom: 1px solid rgba(255,255,255,0.04); font-family: var(--font-mono); font-size: 0.875rem; }
.ranking-table tr:hover td { background: var(--bg-elevated); }
.user-link { color: #60a5fa; text-decoration: none; cursor: pointer; }
.user-link:hover { text-decoration: underline; }
.badge { display: inline-block; padding: 2px 8px; border-radius: 4px; font-size: 0.7rem; }
.badge-ok { background: rgba(34,197,94,0.1); color: var(--color-ok); }
.badge-warn { background: rgba(245,158,11,0.1); color: var(--color-warn); }
.badge-danger { background: rgba(239,68,68,0.1); color: var(--color-danger); }
.rank-1 { color: var(--rank-gold); }
.rank-2 { color: var(--rank-silver); }
.rank-3 { color: var(--rank-bronze); }
.insight-box { background: rgba(96,165,250,0.05); border: 1px solid rgba(96,165,250,0.15); border-radius: 8px; padding: 16px; margin-top: 8px; }
.insight-box h4 { font-size: 0.75rem; color: #60a5fa; margin-bottom: 8px; }
.insight-box ul { font-size: 0.8rem; color: var(--text-secondary); padding-left: 16px; }
.insight-box ul li { margin-bottom: 4px; }
.section-title { font-size: 1rem; font-weight: 600; margin: 24px 0 12px; }
#user-detail { display: none; background: var(--bg-surface); border: 1px solid rgba(255,255,255,0.06); border-radius: 12px; padding: 24px; margin-top: 16px; }
.back-btn { background: none; border: 1px solid rgba(255,255,255,0.1); color: var(--text-secondary); padding: 6px 12px; border-radius: 6px; cursor: pointer; font-size: 0.8rem; margin-bottom: 16px; }
.back-btn:hover { background: var(--bg-elevated); }
</style>
</head>
<body>
<header>
  <div>
    <h1>🧮 LLM Usage Monitoring</h1>
    <p style="font-size:0.75rem;color:#475569;margin-top:4px;">Powered by Tokuin CLI</p>
  </div>
  <div style="display:flex;gap:8px;align-items:center;">
    <span id="last-updated" style="font-size:0.75rem;color:#475569;"></span>
    <button onclick="loadData()" style="background:rgba(96,165,250,0.1);border:1px solid rgba(96,165,250,0.2);color:#60a5fa;padding:6px 14px;border-radius:6px;cursor:pointer;font-size:0.8rem;">↻ Refresh</button>
  </div>
</header>

<!-- Main dashboard -->
<div id="main-dashboard">
  <!-- 4 KPI cards -->
  <div class="kpi-grid">
    <div class="kpi-card" id="kpi-requests">
      <div class="kpi-label">Total Requests</div>
      <div class="kpi-value metric-value" id="val-requests">-</div>
      <div class="kpi-sub" id="sub-requests">Loading data...</div>
    </div>
    <div class="kpi-card" id="kpi-success">
      <div class="kpi-label">Success Rate</div>
      <div class="kpi-value metric-value" id="val-success">-</div>
      <div class="kpi-sub" id="sub-success">-</div>
    </div>
    <div class="kpi-card" id="kpi-latency">
      <div class="kpi-label">p95 Latency</div>
      <div class="kpi-value metric-value" id="val-latency">-</div>
      <div class="kpi-sub" id="sub-latency">-</div>
    </div>
    <div class="kpi-card" id="kpi-cost">
      <div class="kpi-label">Total Cost</div>
      <div class="kpi-value metric-value" id="val-cost">-</div>
      <div class="kpi-sub" id="sub-cost">-</div>
    </div>
  </div>
<!-- Chart row -->
<div class="chart-row">
  <div class="chart-card">
    <h3>Cost Trend Over Time</h3>
    <canvas id="trend-chart" height="160"></canvas>
  </div>
  <div class="chart-card">
    <h3>Category Distribution</h3>
    <canvas id="category-chart" height="160"></canvas>
  </div>
</div>

<!-- User ranking -->
<h2 class="section-title">🏆 User Ranking</h2>
<div class="chart-card" style="margin-bottom:24px;">
  <div style="display:flex;justify-content:space-between;align-items:center;margin-bottom:12px;">
    <h3 style="margin-bottom:0;">Ranked by Cost</h3>
    <input id="user-search" type="text" placeholder="🔍 Search users..." 
      style="background:var(--bg-elevated);border:1px solid rgba(255,255,255,0.08);color:var(--text-primary);padding:6px 12px;border-radius:6px;font-size:0.8rem;width:200px;"
      oninput="filterRanking(this.value)">
  </div>
  <table class="ranking-table" id="ranking-table">
    <thead>
      <tr>
        <th>Rank</th>
        <th>User</th>
        <th>Cost</th>
        <th>Requests</th>
        <th>Top Model</th>
        <th>Success Rate</th>
        <th>Last Active</th>
      </tr>
    </thead>
    <tbody id="ranking-body">
      <tr><td colspan="7" style="text-align:center;color:#475569;padding:24px;">Loading data...</td></tr>
    </tbody>
  </table>
</div>

<!-- Inactive user tracking -->
<h2 class="section-title">💤 Inactive Users</h2>
<div class="chart-card" style="margin-bottom:24px;">
  <table class="ranking-table" id="inactive-table">
    <thead>
      <tr><th>User</th><th>Inactive For</th><th>Last Active</th><th>Status</th></tr>
    </thead>
    <tbody id="inactive-body">
      <tr><td colspan="4" style="text-align:center;color:#475569;padding:24px;">No tracking data</td></tr>
    </tbody>
  </table>
</div>

<!-- PM insights -->
<h2 class="section-title">📊 PM Auto Insights</h2>
<div id="pm-insights">
  <div class="insight-box">
    <h4>💡 Analyzing automatically...</h4>
  </div>
</div>
</div>

<!-- Per-user detail page (shown on link click) -->
<div id="user-detail">
  <button class="back-btn" onclick="showMain()">← Back to Dashboard</button>
  <div id="user-detail-content"></div>
</div>

<script>
let allData = [];
let allUsers = {};

async function loadData() {
  try {
    const res = await fetch('./data/metrics.jsonl');
    const text = await res.text();
    allData = text.trim().split('\n').filter(Boolean).map(l => JSON.parse(l));
    document.getElementById('last-updated').textContent = 'Last updated: ' + new Date().toLocaleTimeString();
    renderDashboard();
  } catch(e) {
    // Show sample data if JSONL file is missing
    allData = generateSampleData();
    renderDashboard();
  }
}

function generateSampleData() {
  const users = ['dev-alice', 'team-backend', 'analyst-bob', 'pm-charlie'];
  const models = ['gpt-4', 'claude-3-sonnet', 'gemini-pro'];
  const categories = ['coding', 'analysis', 'translation', 'summary', 'writing'];
  const data = [];
  for (let i = 0; i < 50; i++) {
    const user = users[Math.floor(Math.random() * users.length)];
    const daysAgo = Math.floor(Math.random() * 30);
    const ts = new Date(Date.now() - daysAgo * 86400000 - Math.random() * 86400000);
    data.push({
      id: 'sample-' + i,
      timestamp: ts.toISOString(),
      model: models[Math.floor(Math.random() * models.length)],
      provider: 'openai',
      user_id: user,
      user_alias: user,
      prompt_category: categories[Math.floor(Math.random() * categories.length)],
      input_tokens: Math.floor(Math.random() * 2000) + 100,
      output_tokens: Math.floor(Math.random() * 1000) + 50,
      cost_usd: (Math.random() * 0.05).toFixed(4) * 1,
      latency_avg_ms: Math.floor(Math.random() * 1500) + 200,
      latency_p95_ms: Math.floor(Math.random() * 2500) + 500,
      successful: 1,
      total_requests: 1,
      is_dry_run: true,
      status_code: Math.random() > 0.05 ? 200 : 429,
    });
  }
  return data;
}

function renderDashboard() {
  if (!allData.length) return;
  // Calculate KPIs
  const totalReqs = allData.reduce((s, r) => s + (r.total_requests || 1), 0);
  const totalSucc = allData.filter(r => r.status_code === 200).length;
  const successRate = ((totalSucc / allData.length) * 100).toFixed(1);
  const avgLatency = (allData.reduce((s, r) => s + (r.latency_avg_ms || 0), 0) / allData.length).toFixed(0);
  const p95Latency = (allData.reduce((s, r) => s + (r.latency_p95_ms || 0), 0) / allData.length).toFixed(0);
  const totalCost = allData.reduce((s, r) => s + (r.cost_usd || 0), 0).toFixed(4);

  // Update KPI cards
  document.getElementById('val-requests').textContent = totalReqs.toLocaleString();
  document.getElementById('sub-requests').textContent = allData.length + ' run records';
  document.getElementById('val-success').textContent = successRate + '%';
  document.getElementById('sub-success').textContent = (allData.length - totalSucc) + ' failures';
  const kpiSuccess = document.getElementById('kpi-success');
  kpiSuccess.className = 'kpi-card ' + (successRate >= 95 ? 'ok' : successRate >= 90 ? 'warn' : 'danger');
  document.getElementById('val-latency').textContent = p95Latency + 'ms';
  document.getElementById('sub-latency').textContent = 'avg ' + avgLatency + 'ms';
  const kpiLatency = document.getElementById('kpi-latency');
  kpiLatency.className = 'kpi-card ' + (p95Latency < 1000 ? 'ok' : p95Latency < 2000 ? 'warn' : 'danger');
  document.getElementById('val-cost').textContent = '$' + totalCost;
  document.getElementById('sub-cost').textContent = 'dry-run estimate';

  // Trend chart
  renderTrendChart();
  // Category distribution
  renderCategoryChart();
  // User ranking
  renderRanking();
  // Inactive users
  renderInactive();
  // PM insights
  renderInsights(successRate, p95Latency, totalCost);
}

function renderTrendChart() {
  const ctx = document.getElementById('trend-chart').getContext('2d');
  const byDate = {};
  allData.forEach(r => {
    const d = r.timestamp.substring(0, 10);
    byDate[d] = (byDate[d] || 0) + (r.cost_usd || 0);
  });
  const labels = Object.keys(byDate).sort().slice(-14);
  const values = labels.map(d => byDate[d].toFixed(4));
  if (window._trendChart) window._trendChart.destroy();
  window._trendChart = new Chart(ctx, {
    type: 'line',
    data: {
      labels,
      datasets: [{
        label: 'Daily Cost ($)',
        data: values,
        borderColor: '#818cf8',
        backgroundColor: 'rgba(129,140,248,0.1)',
        fill: true, tension: 0.4, pointRadius: 4, pointBackgroundColor: '#818cf8',
      }]
    },
    options: {
      plugins: { legend: { labels: { color: '#94a3b8' } } },
      scales: {
        x: { ticks: { color: '#475569' }, grid: { color: 'rgba(255,255,255,0.04)' } },
        y: { ticks: { color: '#475569' }, grid: { color: 'rgba(255,255,255,0.04)' } }
      }
    }
  });
}

function renderCategoryChart() {
  const ctx = document.getElementById('category-chart').getContext('2d');
  const cats = {};
  allData.forEach(r => { cats[r.prompt_category || 'other'] = (cats[r.prompt_category || 'other'] || 0) + 1; });
  const colors = ['#818cf8','#38bdf8','#34d399','#fb923c','#f472b6','#94a3b8'];
  if (window._catChart) window._catChart.destroy();
  window._catChart = new Chart(ctx, {
    type: 'doughnut',
    data: { labels: Object.keys(cats), datasets: [{ data: Object.values(cats), backgroundColor: colors, borderWidth: 0 }] },
    options: { plugins: { legend: { position: 'right', labels: { color: '#94a3b8', font: { size: 11 } } } }, cutout: '65%' }
  });
}

function renderRanking(filter = '') {
  const userMap = {};
  allData.forEach(r => {
    const uid = r.user_id || 'anonymous';
    if (!userMap[uid]) userMap[uid] = { alias: r.user_alias || uid, cost: 0, runs: 0, models: {}, success: 0, last: r.timestamp };
    userMap[uid].cost += r.cost_usd || 0;
    userMap[uid].runs += 1;
    userMap[uid].models[r.model] = (userMap[uid].models[r.model] || 0) + 1;
    if (r.status_code === 200) userMap[uid].success++;
    if (r.timestamp > userMap[uid].last) userMap[uid].last = r.timestamp;
  });
  allUsers = userMap;
  const sorted = Object.entries(userMap)
    .filter(([uid, u]) => !filter || u.alias.toLowerCase().includes(filter.toLowerCase()))
    .sort((a, b) => b[1].cost - a[1].cost);
  const tbody = document.getElementById('ranking-body');
  if (!sorted.length) {
    tbody.innerHTML = '<tr><td colspan="7" style="text-align:center;color:#475569;padding:16px;">No results found</td></tr>';
    return;
  }
  const rankEmoji = ['🥇','🥈','🥉'];
  tbody.innerHTML = sorted.map(([uid, u], i) => {
    const topModel = Object.entries(u.models).sort((a,b) => b[1]-a[1])[0]?.[0] || '-';
    const sr = ((u.success / u.runs) * 100).toFixed(1);
    const srClass = sr >= 95 ? 'badge-ok' : sr >= 90 ? 'badge-warn' : 'badge-danger';
    const lastAgo = Math.floor((Date.now() - new Date(u.last)) / 86400000);
    const rankClass = i === 0 ? 'rank-1' : i === 1 ? 'rank-2' : i === 2 ? 'rank-3' : '';
    return `<tr>
      <td class="${rankClass}">${rankEmoji[i] || (i+1)}</td>
      <td><a class="user-link" onclick="showUserDetail('${uid}')">${u.alias}</a></td>
      <td class="metric-value">$${u.cost.toFixed(4)}</td>
      <td class="metric-value">${u.runs.toLocaleString()}</td>
      <td><span style="font-size:0.75rem;color:#94a3b8;">${topModel}</span></td>
      <td><span class="badge ${srClass}">${sr}%</span></td>
      <td style="color:#475569;font-size:0.75rem;">${lastAgo === 0 ? 'Today' : lastAgo + 'd ago'}</td>
    </tr>`;
  }).join('');
}

function filterRanking(val) { renderRanking(val); }

function renderInactive() {
  const sevenDaysAgo = new Date(Date.now() - 7 * 86400000);
  const activeUsers = new Set(
    allData.filter(r => new Date(r.timestamp) > sevenDaysAgo).map(r => r.user_id)
  );
  const lastSeen = {};
  allData.forEach(r => {
    if (!lastSeen[r.user_id] || r.timestamp > lastSeen[r.user_id].ts) {
      lastSeen[r.user_id] = { ts: r.timestamp, alias: r.user_alias || r.user_id };
    }
  });
  const inactive = Object.entries(lastSeen).filter(([uid]) => !activeUsers.has(uid));
  const tbody = document.getElementById('inactive-body');
  if (!inactive.length) {
    tbody.innerHTML = '<tr><td colspan="4" style="text-align:center;color:#22c55e;padding:16px;">✅ All users active within 7 days</td></tr>';
    return;
  }
  tbody.innerHTML = inactive.map(([uid, info]) => {
    const daysAgo = Math.floor((Date.now() - new Date(info.ts)) / 86400000);
    const cls = daysAgo >= 30 ? 'badge-danger' : daysAgo >= 14 ? 'badge-warn' : 'badge-ok';
    return `<tr>
      <td><a class="user-link" onclick="showUserDetail('${uid}')">${info.alias}</a></td>
      <td class="metric-value">${daysAgo}d</td>
      <td style="color:#475569;font-size:0.75rem;">${new Date(info.ts).toLocaleDateString()}</td>
      <td><span class="badge ${cls}">${daysAgo >= 30 ? 'Critical' : daysAgo >= 14 ? 'Warning' : 'Monitor'}</span></td>
    </tr>`;
  }).join('');
}

function renderInsights(successRate, p95Latency, totalCost) {
  const insights = [];
  const sevenDaysAgo = new Date(Date.now() - 7 * 86400000);
  const activeUsers = new Set(allData.filter(r => new Date(r.timestamp) > sevenDaysAgo).map(r => r.user_id));
  const totalUsers = new Set(allData.map(r => r.user_id)).size;
  const adoptionRate = totalUsers ? Math.round(activeUsers.size / totalUsers * 100) : 0;
  const inactiveCount = totalUsers - activeUsers.size;
  if (inactiveCount > 0) insights.push(`■ <strong>${inactiveCount}</strong> inactive user(s) — consider onboarding/support`);
  if (successRate < 95) insights.push(`■ Success rate ${successRate}% → below SLA 95% — investigate error causes`);
  if (p95Latency > 2000) insights.push(`■ p95 latency ${p95Latency}ms → exceeds SLA — consider lighter models`);
  if (adoptionRate < 80) insights.push(`▲ Team adoption ${adoptionRate}% → below 80% target (${activeUsers.size}/${totalUsers} active)`);
  if (totalCost > 50) insights.push(`▲ Total cost $${totalCost} — review model optimization for top users`);
  const categories = {};
  allData.forEach(r => { categories[r.prompt_category || 'other'] = (categories[r.prompt_category || 'other'] || 0) + 1; });
  const topCat = Object.entries(categories).sort((a,b) => b[1]-a[1])[0];
  if (topCat) insights.push(`● Top usage pattern: <strong>${topCat[0]}</strong> (${topCat[1]} times) — specialized model may improve efficiency`);
  const insightDiv = document.getElementById('pm-insights');
  insightDiv.innerHTML = `<div class="insight-box">
    <h4>💡 PM Auto Insights — as of ${new Date().toLocaleDateString()}</h4>
    <ul>${insights.map(i => `<li>${i}</li>`).join('')}</ul>
  </div>`;
}

function showUserDetail(userId) {
  const u = allUsers[userId];
  if (!u) return;
  const userRuns = allData.filter(r => r.user_id === userId);
  const categories = {};
  userRuns.forEach(r => { categories[r.prompt_category || 'other'] = (categories[r.prompt_category || 'other'] || 0) + 1; });
  const totalCost = userRuns.reduce((s, r) => s + (r.cost_usd || 0), 0).toFixed(4);
  const topModel = Object.entries(
    userRuns.reduce((m, r) => { m[r.model] = (m[r.model] || 0)+1; return m; }, {})
  ).sort((a,b) => b[1]-a[1])[0]?.[0] || '-';
  document.getElementById('user-detail-content').innerHTML = `
    <div style="background:var(--bg-elevated);border-radius:8px;padding:16px;margin-bottom:20px;">
      <h2 style="font-size:1.25rem;margin-bottom:8px;">👤 ${u.alias}</h2>
      <div style="display:grid;grid-template-columns:repeat(4,1fr);gap:12px;margin-top:12px;">
        <div><div style="font-size:0.625rem;color:#475569;text-transform:uppercase;margin-bottom:4px;">Total Cost</div><div class="metric-value" style="font-size:1.5rem;">$${totalCost}</div></div>
        <div><div style="font-size:0.625rem;color:#475569;text-transform:uppercase;margin-bottom:4px;">Total Requests</div><div class="metric-value" style="font-size:1.5rem;">${u.runs.toLocaleString()}</div></div>
        <div><div style="font-size:0.625rem;color:#475569;text-transform:uppercase;margin-bottom:4px;">Top Model</div><div style="font-size:1rem;margin-top:4px;">${topModel}</div></div>
        <div><div style="font-size:0.625rem;color:#475569;text-transform:uppercase;margin-bottom:4px;">Category Breakdown</div><div style="font-size:0.8rem;color:#94a3b8;">${Object.entries(categories).map(([k,v]) => k+' '+v+'x').join(', ')}</div></div>
      </div>
    </div>
    <h3 style="font-size:0.875rem;color:#94a3b8;margin-bottom:12px;">Recent Run Log</h3>
    <table class="ranking-table">
      <thead><tr><th>Time</th><th>Model</th><th>Category</th><th>Cost</th><th>Latency</th><th>Status</th></tr></thead>
      <tbody>
        ${userRuns.slice(-10).reverse().map(r => {
          const sc = r.status_code === 200 ? 'badge-ok' : 'badge-danger';
          return `<tr>
            <td style="color:#475569;font-size:0.75rem;">${new Date(r.timestamp).toLocaleString()}</td>
            <td style="font-size:0.8rem;">${r.model}</td>
            <td><span class="badge badge-ok" style="font-size:0.65rem;">${r.prompt_category||'other'}</span></td>
            <td class="metric-value">$${(r.cost_usd||0).toFixed(4)}</td>
            <td class="metric-value">${(r.latency_avg_ms||0).toFixed(0)}ms</td>
            <td><span class="badge ${sc}">${r.status_code||200}</span></td>
          </tr>`;
        }).join('')}
      </tbody>
    </table>
    <div class="insight-box" style="margin-top:16px;">
      <h4>💡 Personal Insights</h4>
      <ul>
        <li>Top model: <strong>${topModel}</strong> — switching to a lighter model with similar performance could reduce costs</li>
        <li>Primary usage pattern: <strong>${Object.entries(categories).sort((a,b)=>b[1]-a[1])[0]?.[0]||'none'}</strong></li>
        <li>${u.runs} total runs — compare activity against team average</li>
      </ul>
    </div>
  `;
  document.getElementById('main-dashboard').style.display = 'none';
  document.getElementById('user-detail').style.display = 'block';
  window.scrollTo(0, 0);
}

function showMain() {
  document.getElementById('user-detail').style.display = 'none';
  document.getElementById('main-dashboard').style.display = 'block';
}

// Keyboard shortcuts
document.addEventListener('keydown', e => {
  if (e.key === 'r' || e.key === 'R') loadData();
  if (e.key === 'Escape') showMain();
});

// Initial load
loadData();
// Auto-refresh every 5 minutes
setInterval(loadData, 5 * 60 * 1000);
</script>
</body>
</html>
HTML_EOF
echo "✅ Lightweight HTML dashboard created: llm-monitoring/index.html"
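The dashboard's `loadData()` fetches `./data/metrics.jsonl` and silently falls back to random sample data when that request fails. To smoke-test it with real data, you can seed one record by hand; the sketch below uses the field names the renderers read (`user_id`, `cost_usd`, `latency_p95_ms`, `status_code`, etc.), with illustrative values:

```shell
# Seed one illustrative run record (values are made up; only the field
# names matter — they mirror what renderDashboard()/renderRanking() read).
mkdir -p llm-monitoring/data
cat >> llm-monitoring/data/metrics.jsonl << 'EOF'
{"id":"run-001","timestamp":"2024-05-01T10:00:00Z","model":"gpt-4","provider":"openai","user_id":"dev-alice","user_alias":"Alice","prompt_category":"coding","input_tokens":1200,"output_tokens":400,"cost_usd":0.0312,"latency_avg_ms":850,"latency_p95_ms":1400,"successful":1,"total_requests":1,"is_dry_run":true,"status_code":200}
EOF
# Sanity check: every line must be valid JSON, or JSON.parse() will throw
# and the dashboard will fall back to sample data.
python3 -c "import json; [json.loads(l) for l in open('llm-monitoring/data/metrics.jsonl') if l.strip()]" \
  && echo "metrics.jsonl OK"
```

Note that most browsers block `fetch()` on `file://` pages, so serve the folder over HTTP, e.g. `cd llm-monitoring && python3 -m http.server "${DASHBOARD_PORT:-3000}"`, then open `http://localhost:3000/`.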
style="font-size:1.25rem;margin-bottom:8px;">👤 ${u.alias}</h2> <div style="display:grid;grid-template-columns:repeat(4,1fr);gap:12px;margin-top:12px;"> <div><div style="font-size:0.625rem;color:#475569;text-transform:uppercase;margin-bottom:4px;">Total Cost</div><div class="metric-value" style="font-size:1.5rem;">$${totalCost}</div></div> <div><div style="font-size:0.625rem;color:#475569;text-transform:uppercase;margin-bottom:4px;">Total Requests</div><div class="metric-value" style="font-size:1.5rem;">${u.runs.toLocaleString()}</div></div> <div><div style="font-size:0.625rem;color:#475569;text-transform:uppercase;margin-bottom:4px;">Top Model</div><div style="font-size:1rem;margin-top:4px;">${topModel}</div></div> <div><div style="font-size:0.625rem;color:#475569;text-transform:uppercase;margin-bottom:4px;">Category Breakdown</div><div style="font-size:0.8rem;color:#94a3b8;">${Object.entries(categories).map(([k,v]) => k+' '+v+'x').join(', ')}</div></div> </div> </div> <h3 style="font-size:0.875rem;color:#94a3b8;margin-bottom:12px;">Recent Run Log</h3> <table class="ranking-table"> <thead><tr><th>Time</th><th>Model</th><th>Category</th><th>Cost</th><th>Latency</th><th>Status</th></tr></thead> <tbody> ${userRuns.slice(-10).reverse().map(r => { const sc = r.status_code === 200 ? 
'badge-ok' : 'badge-danger'; return `<tr> <td style="color:#475569;font-size:0.75rem;">${new Date(r.timestamp).toLocaleString()}</td> <td style="font-size:0.8rem;">${r.model}</td> <td><span class="badge badge-ok" style="font-size:0.65rem;">${r.prompt_category||'other'}</span></td> <td class="metric-value">$${(r.cost_usd||0).toFixed(4)}</td> <td class="metric-value">${(r.latency_avg_ms||0).toFixed(0)}ms</td> <td><span class="badge ${sc}">${r.status_code||200}</span></td> </tr>`; }).join('')} </tbody> </table> <div class="insight-box" style="margin-top:16px;"> <h4>💡 Personal Insights</h4> <ul> <li>Top model: <strong>${topModel}</strong> — switching to a lighter model with similar performance could reduce costs</li> <li>Primary usage pattern: <strong>${Object.entries(categories).sort((a,b)=>b[1]-a[1])[0]?.[0]||'none'}</strong></li> <li>${u.runs} total runs — compare activity against team average</li> </ul> </div> `; document.getElementById('main-dashboard').style.display = 'none'; document.getElementById('user-detail').style.display = 'block'; window.scrollTo(0, 0); } function showMain() { document.getElementById('user-detail').style.display = 'none'; document.getElementById('main-dashboard').style.display = 'block'; } // Keyboard shortcuts document.addEventListener('keydown', e => { if (e.key === 'r' || e.key === 'R') loadData(); if (e.key === 'Escape') showMain(); }); // Initial load loadData(); // Auto-refresh every 5 minutes setInterval(loadData, 5 * 60 * 1000); </script> </body> </html> HTML_EOF
echo "✅ Lightweight HTML dashboard created: llm-monitoring/index.html"

```bash
# Start local server
cd llm-monitoring && python3 -m http.server "${DASHBOARD_PORT:-3000}" &
echo "✅ Dashboard running: http://localhost:${DASHBOARD_PORT:-3000}"
```

---

Step 4: PM insights tab and ranking system

(For Option A / Next.js)

```bash
# Create PM dashboard API route
cat > app/api/ranking/route.ts << 'TS_EOF'
import { NextRequest, NextResponse } from 'next/server'
import db from '@/lib/llm-monitoring/db'

export async function GET(req: NextRequest) {
  const period = req.nextUrl.searchParams.get('period') || '30d'
  const days = period === '7d' ? 7 : period === '90d' ? 90 : 30

  // Cost-based ranking
  const costRanking = db.prepare(`
    SELECT
      user_id, user_alias,
      ROUND(SUM(cost_usd), 4)            AS total_cost,
      COUNT(*)                           AS total_runs,
      GROUP_CONCAT(DISTINCT model)       AS models_used,
      ROUND(AVG(latency_avg_ms), 0)      AS avg_latency,
      ROUND(
        AVG(CAST(successful AS REAL) / NULLIF(total_requests, 0)) * 100, 1
      )                                  AS success_rate,
      MAX(timestamp)                     AS last_seen
    FROM runs
    WHERE timestamp >= datetime('now', '-' || ? || ' days')
    GROUP BY user_id
    ORDER BY total_cost DESC
    LIMIT 20
  `).all(days)

  // Inactive user tracking (registered users with no activity in the selected period)
  const inactiveUsers = db.prepare(`
    SELECT
      p.user_id, p.user_alias, p.team,
      MAX(r.timestamp)  AS last_seen,
      CAST((julianday('now') - julianday(MAX(r.timestamp))) AS INTEGER) AS days_inactive
    FROM user_profiles p
    LEFT JOIN runs r ON p.user_id = r.user_id
    GROUP BY p.user_id
    HAVING last_seen IS NULL
       OR days_inactive >= 7
    ORDER BY days_inactive DESC
  `).all()

  // PM summary
  const summary = db.prepare(`
    SELECT
      COUNT(DISTINCT user_id)    AS total_users,
      COUNT(DISTINCT CASE WHEN timestamp >= datetime('now', '-7 days') THEN user_id END) AS active_7d,
      ROUND(SUM(cost_usd), 2)    AS total_cost,
      COUNT(*)                   AS total_runs
    FROM runs
    WHERE timestamp >= datetime('now', '-' || ? || ' days')
  `).get(days) as Record<string, number>

  return NextResponse.json({ costRanking, inactiveUsers, summary })
}
TS_EOF
```

---

Step 5: Auto-generate weekly PM report

```bash
cat > generate-pm-report.sh << 'REPORT_EOF'
#!/usr/bin/env bash
# generate-pm-report.sh — Auto-generate weekly PM report (Markdown)
set -euo pipefail

REPORT_DATE=$(date +"%Y-%m-%d")
REPORT_WEEK=$(date +"%Y-W%V")
OUTPUT_DIR="./reports"
OUTPUT="${OUTPUT_DIR}/pm-weekly-${REPORT_DATE}.md"
mkdir -p "$OUTPUT_DIR"

# Pass shell values in via the environment; the quoted heredoc delimiter
# keeps the Python dollar signs (e.g. ${total_cost:.2f}) away from bash expansion.
REPORT_DATE="$REPORT_DATE" REPORT_WEEK="$REPORT_WEEK" python3 << 'PYEOF' > "$OUTPUT"
import json, os
from datetime import datetime, timedelta
from collections import defaultdict

REPORT_DATE = os.environ["REPORT_DATE"]
REPORT_WEEK = os.environ["REPORT_WEEK"]

# Load data from the last 7 days
try:
    records = [json.loads(l) for l in open('./data/metrics.jsonl') if l.strip()]
except FileNotFoundError:
    records = []

week_ago = (datetime.now() - timedelta(days=7)).isoformat()
week_data = [r for r in records if r.get('timestamp', '') >= week_ago]

# Aggregate
total_cost = sum(r.get('cost_usd', 0) for r in week_data)
total_runs = len(week_data)
active_users = set(r.get('user_id') for r in week_data)
all_users = set(r.get('user_id') for r in records)
inactive_users = all_users - active_users

# Per-user cost ranking
user_costs = defaultdict(lambda: {'cost': 0, 'runs': 0, 'alias': '', 'categories': defaultdict(int)})
for r in week_data:
    uid = r.get('user_id', 'unknown')
    user_costs[uid]['cost'] += r.get('cost_usd', 0)
    user_costs[uid]['runs'] += 1
    user_costs[uid]['alias'] = r.get('user_alias', uid)
    user_costs[uid]['categories'][r.get('prompt_category', 'other')] += 1

top_users = sorted(user_costs.items(), key=lambda x: x[1]['cost'], reverse=True)[:5]

# Model usage
model_usage = defaultdict(int)
for r in week_data:
    model_usage[r.get('model', 'unknown')] += 1
top_model = max(model_usage, key=model_usage.get) if model_usage else '-'

# Success rate
success_count = sum(1 for r in week_data if r.get('status_code', 200) == 200)
success_rate = (success_count / total_runs * 100) if total_runs else 0

# Pre-compute table fragments
adoption = (
    f"{len(active_users)}/{len(all_users)} ({len(active_users) / len(all_users) * 100:.0f}%)"
    if all_users else "N/A"
)
medals = ['🥇', '🥈', '🥉']
top_rows = chr(10).join(
    f"| {medals[i] if i < 3 else i + 1} | {u['alias']} | ${u['cost']:.4f} | {u['runs']} "
    f"| {max(u['categories'], key=u['categories'].get) if u['categories'] else '-'} |"
    for i, (uid, u) in enumerate(top_users)
)
inactive_section = (
    "None — all users active within 7 days" if not inactive_users
    else chr(10).join(f"- {uid}" for uid in inactive_users)
)
actions = []
if inactive_users:
    actions.append(f"- {len(inactive_users)} inactive user(s) — consider onboarding/support")
actions.append(
    f"- Success rate {success_rate:.1f}% — SLA 95% "
    + ("achieved ✅" if success_rate >= 95 else "not met ⚠️ investigate error causes")
)
actions.append(f"- Total cost ${total_cost:.2f} — review model optimization opportunities vs. prior week")

print(f"""# 📊 LLM Usage Weekly Report — {REPORT_DATE} ({REPORT_WEEK})

## Executive Summary

| Metric | Value |
|--------|-------|
| Total Cost | ${total_cost:.2f} |
| Total Runs | {total_runs:,} |
| Active Users | {len(active_users)} |
| Adoption Rate | {adoption} |
| Success Rate | {success_rate:.1f}% |
| Top Model | {top_model} |

## 🏆 Top 5 Users (by Cost)

| Rank | User | Cost | Runs | Top Category |
|------|------|------|------|--------------|
{top_rows}

## 💤 Inactive Users ({len(inactive_users)})

{inactive_section}

## 💡 PM Recommended Actions

{chr(10).join(actions)}

---
Auto-generated by generate-pm-report.sh | Powered by Tokuin CLI
""")
PYEOF

echo "✅ PM report generated: $OUTPUT"
cat "$OUTPUT"

# Slack notification (if configured)
if [ -n "${SLACK_WEBHOOK_URL:-}" ]; then
  SUMMARY=$(grep -A5 "## Executive Summary" "$OUTPUT" | tail -5)
  curl -s -X POST "$SLACK_WEBHOOK_URL" \
    -H 'Content-type: application/json' \
    -d "{\"text\":\"📊 Weekly LLM Report ($REPORT_DATE)\n$SUMMARY\"}" > /dev/null
  echo "✅ Slack notification sent"
fi
REPORT_EOF
chmod +x generate-pm-report.sh

# Schedule to run every Monday at 9am
(crontab -l 2>/dev/null; echo "0 9 * * 1 cd $(pwd) && bash generate-pm-report.sh >> ./data/report.log 2>&1") | crontab -
echo "✅ Weekly report cron registered (every Monday 09:00)"

# Run immediately for testing
bash generate-pm-report.sh
```

---

Step 6: Cost alert setup

```bash
cat > check-alerts.sh << 'ALERT_EOF'
#!/usr/bin/env bash
# check-alerts.sh — Detect cost threshold breaches and send Slack alerts
set -euo pipefail

THRESHOLD="${COST_THRESHOLD_USD:-10.00}"

CURRENT_COST=$(python3 << 'PYEOF'
import json
from datetime import datetime

today = datetime.now().date().isoformat()
try:
    records = [json.loads(l) for l in open('./data/metrics.jsonl') if l.strip()]
    today_cost = sum(r.get('cost_usd', 0) for r in records
                     if r.get('timestamp', '')[:10] == today)
    print(f"{today_cost:.4f}")
except Exception:
    print("0.0000")
PYEOF
)

# Compare against the threshold; exit status 1 signals a breach.
# Capture the status explicitly so `set -e` doesn't abort the script here.
ALERT_STATUS=0
CURRENT_COST="$CURRENT_COST" THRESHOLD="$THRESHOLD" python3 - << 'PYEOF' || ALERT_STATUS=$?
import os, sys
cost = float(os.environ["CURRENT_COST"])
threshold = float(os.environ["THRESHOLD"])
if cost > threshold:
    print(f"ALERT: Today's cost ${cost:.4f} has exceeded the threshold ${threshold:.2f}!")
    sys.exit(1)
print(f"OK: Today's cost ${cost:.4f} / threshold ${threshold:.2f}")
PYEOF

# Send Slack alert on breach
if [ "$ALERT_STATUS" -ne 0 ] && [ -n "${SLACK_WEBHOOK_URL:-}" ]; then
  curl -s -X POST "$SLACK_WEBHOOK_URL" \
    -H 'Content-type: application/json' \
    -d "{\"text\":\"⚠️ LLM cost threshold exceeded!\nToday's cost: \$$CURRENT_COST / Threshold: \$$THRESHOLD\"}" > /dev/null
fi
ALERT_EOF
chmod +x check-alerts.sh

# Check cost every hour
(crontab -l 2>/dev/null; echo "0 * * * * cd $(pwd) && bash check-alerts.sh >> ./data/alerts.log 2>&1") | crontab -
echo "✅ Cost alert cron registered (every hour)"
```

---

Privacy Policy

```yaml
# Privacy policy (must be followed)
prompt_storage:
  store_full_prompt: false    # Default: do not store raw prompt text
  store_preview: false        # Storing first 100 chars also disabled by default (requires explicit admin config)
  store_hash: true            # Store SHA-256 hash only (for pattern analysis)

user_data:
  anonymize_by_default: true  # user_id can be stored as a hash (controlled via LLM_USER_ID env var)
  retention_days: 90          # Recommend purging data older than 90 days

compliance:
  - Never log API keys in code, HTML, scripts, or log files.
  - Always add .env to .gitignore.
  - Restrict prompt preview access to admins only.
```


> ⚠️ **Required steps when enabling `store_preview: true`**
> 
> Prompt preview storage can only be enabled after an **admin explicitly** completes the following steps:
> 
> 1. Set `STORE_PREVIEW=true` in the `.env` file (do not modify code directly)
> 2. Obtain team consent for personal data processing (notify users that previews will be stored)
> 3. Restrict access to **admin role only** (regular users must not be able to view)
> 4. Set `retention_days` explicitly to define the retention period
> 
> Enabling `store_preview: true` without completing these steps is a **MUST NOT** violation.
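The `store_hash: true` policy boils down to persisting a digest instead of the text. A minimal Python sketch of a policy-compliant record builder; the field names (`prompt_hash`, `prompt_length`, `prompt_preview`) are illustrative, not part of this skill's schema:

```python
import hashlib

def prompt_record(prompt: str, store_preview: bool = False) -> dict:
    """Build the storable fields for a prompt under the hash-only policy."""
    record = {
        # SHA-256 hex digest: identical prompts map to identical hashes,
        # which enables pattern analysis without retaining the text itself.
        "prompt_hash": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "prompt_length": len(prompt),
    }
    if store_preview:  # only after the admin steps above are completed
        record["prompt_preview"] = prompt[:100]
    return record
```

With the default `store_preview=False`, no fragment of the prompt ever reaches disk.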

---

Output Format

Files generated after running the skill:
./
├── safety-guard.sh          # Safety gate (Step 0)
├── categorize_prompt.py     # Prompt auto-categorization
├── collect-metrics.sh       # Metrics collection (Step 2)
├── generate-pm-report.sh    # PM weekly report (Step 5)
├── check-alerts.sh          # Cost alerts (Step 6)
├── data/
│   ├── metrics.jsonl        # Time-series metrics (JSONL format)
│   ├── collect.log          # Collection log
│   ├── alerts.log           # Alert log
│   └── reports/
│       └── pm-weekly-YYYY-MM-DD.md  # Auto-generated PM report
├── [If Next.js selected]
│   ├── app/admin/llm-monitoring/page.tsx
│   ├── app/admin/llm-monitoring/users/[userId]/page.tsx
│   ├── app/api/runs/route.ts
│   ├── app/api/ranking/route.ts
│   ├── app/api/metrics/route.ts        # Prometheus endpoint
│   ├── components/llm-monitoring/
│   │   ├── KPICard.tsx
│   │   ├── TrendChart.tsx
│   │   ├── ModelCostBar.tsx
│   │   ├── LatencyGauge.tsx
│   │   ├── TokenDonut.tsx
│   │   ├── RankingTable.tsx
│   │   ├── InactiveUsers.tsx
│   │   ├── PMInsights.tsx
│   │   └── UserDetailPage.tsx
│   └── lib/llm-monitoring/db.ts
└── [If lightweight HTML selected]
    └── llm-monitoring/
        ├── index.html       # Single-file dashboard (charts + ranking + user detail)
        └── data/
            └── metrics.jsonl


Constraints

MUST

- Always run Step 0 (`safety-guard.sh`) first
- Use `--dry-run` as the default; explicitly pass `--allow-live` for live API calls
- Manage API keys via environment variables or `.env` files
- Add `.env` to `.gitignore`: `echo '.env' >> .gitignore`
- Use the 3-level color system (`--color-ok`, `--color-warn`, `--color-danger`) consistently across all status indicators
- Implement drilldown navigation so clicking a user link opens their personal detail page
- Generate PM insights automatically from data (no hardcoding)
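The dashboard applies the 3-level system with fixed thresholds (success rate of at least 95% is ok, at least 90% is warn, anything lower is danger, matching the SLA wording elsewhere in this skill). A small Python sketch of that mapping, useful as a single source of truth if you port the rule to another component:

```python
def status_badge(success_rate: float) -> str:
    """Map a success rate (%) to the 3-level badge class used by the dashboard."""
    if success_rate >= 95:
        return "badge-ok"      # meets the 95% SLA
    if success_rate >= 90:
        return "badge-warn"    # degraded, below SLA
    return "badge-danger"      # investigate error causes
```

Keeping the thresholds in one function avoids the drift that happens when each view re-implements the comparison inline.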

MUST NOT

- Never hardcode API keys in source code, HTML, scripts, or log files
- Never set live API calls (`--allow-live`) as the default in automated scripts
- Never use arbitrary colors — always use design token CSS variables
- Never show status as text only — always pair with color and badge
- Never store raw prompt text in the database (hashes only)

Examples

Example 1: Quick start (dry-run, no API key needed)

```bash
# 1. Safety check
bash safety-guard.sh

# 2. Install Tokuin (see Step 1)

# 3. Collect sample data (dry-run)
export LLM_USER_ID="dev-alice"
export LLM_USER_ALIAS="Alice"
bash collect-metrics.sh "Analyze user behavior patterns"
bash collect-metrics.sh "Write a Python function to parse JSON"
bash collect-metrics.sh "Translate this document to English"

# 4. Run lightweight dashboard
cd llm-monitoring && python3 -m http.server 3000 &
open http://localhost:3000
```

Example 2: Multi-user simulation (team test)

```bash
# Simulate multiple users with dry-run
for user in "alice" "backend" "analyst" "pm-charlie"; do
  export LLM_USER_ID="$user"
  export LLM_USER_ALIAS="$user"
  for category in "coding" "analysis" "translation"; do
    bash collect-metrics.sh "${category} related prompt example"
  done
done

# Check results
wc -l data/metrics.jsonl
```

Example 3: Generate PM weekly report immediately

```bash
bash generate-pm-report.sh
cat reports/pm-weekly-$(date +%Y-%m-%d).md
```

Example 4: Test cost alert

```bash
export COST_THRESHOLD_USD=0.01   # Low threshold for testing
bash check-alerts.sh
```

Expected: ALERT message if cost exceeds threshold, otherwise "OK"

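The decision check-alerts.sh makes is just sum-and-compare over today's records. The same logic as a standalone Python sketch, handy for unit testing the threshold behavior without touching cron or Slack (function names are illustrative):

```python
from datetime import datetime

def todays_cost(records, today=None):
    """Sum cost_usd over records whose timestamp falls on `today` (ISO date)."""
    today = today or datetime.now().date().isoformat()
    return sum(r.get("cost_usd", 0) for r in records
               if r.get("timestamp", "")[:10] == today)

def breached(records, threshold, today=None):
    """True when today's spend exceeds the threshold (the ALERT branch)."""
    return todays_cost(records, today) > threshold
```

Feeding it a fixed `today` makes the check deterministic in tests, while the shell script always uses the current date.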


---

References
