ktx-ai-data-context-layer
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
Chinesektx AI Data Context Layer Skill
ktx AI 数据上下文层 Skill
Overview
概述
ktx is a self-improving context layer that teaches AI agents how to query data warehouses accurately. It automatically builds and maintains approved metric definitions, detects joinable columns, absorbs business knowledge from wikis and semantic layers, and exposes everything through CLI and MCP (Model Context Protocol) tools.
Key capabilities:
- Ingests from dbt, MetricFlow, LookML, Looker, Metabase, Notion
- Auto-detects joinable columns and resolves fan/chasm traps
- Combines warehouse metadata, semantic layers, and wiki knowledge
- Exposes search and query tools via MCP for agent execution
- Read-only by design — never writes to your warehouse
Supported warehouses: PostgreSQL, Snowflake, BigQuery, ClickHouse, MySQL, SQL Server, SQLite
ktx 是一个自我优化的上下文层,能够教会AI Agent如何精准查询数据仓库。它会自动构建并维护已核准的指标定义,检测可关联列,从知识库和语义层中吸收业务知识,并通过CLI和MCP(Model Context Protocol)工具对外提供所有功能。
核心功能:
- 支持从dbt、MetricFlow、LookML、Looker、Metabase、Notion导入数据
- 自动检测可关联列并解决扇形/鸿沟陷阱问题
- 整合仓库元数据、语义层和知识库内容
- 通过MCP提供搜索和查询工具供Agent调用
- 设计为只读模式——绝不会向你的数据仓库写入内容
支持的数据仓库: PostgreSQL、Snowflake、BigQuery、ClickHouse、MySQL、SQL Server、SQLite
Installation
安装
Global CLI Install
全局CLI安装
bash
npm install -g @kaelio/ktxbash
npm install -g @kaelio/ktxProject-Scoped Install
项目级安装
bash
npm install --save-dev @kaelio/ktx
npx ktx setupbash
npm install --save-dev @kaelio/ktx
npx ktx setupVerify Installation
验证安装
bash
ktx --version
ktx statusbash
ktx --version
ktx statusInitial Setup
初始设置
Interactive Setup Wizard
交互式设置向导
bash
ktx setupThe setup wizard will:
- Create or resume a project configuration
ktx.yaml - Configure LLM provider (Anthropic API, Vertex AI, AI Gateway, or Claude Code)
- Configure embedding provider (OpenAI, Voyage AI, or Google)
- Set up database connections (warehouse credentials)
- Configure context sources (dbt, Looker, Metabase, Notion, etc.)
- Run initial ingestion to build context
- Install agent integration (Codex, Claude Code, Cursor, OpenCode)
bash
ktx setup设置向导将完成以下操作:
- 创建或恢复 项目配置文件
ktx.yaml - 配置LLM提供商(Anthropic API、Vertex AI、AI Gateway或Claude Code)
- 配置嵌入提供商(OpenAI、Voyage AI或Google)
- 设置数据库连接(数据仓库凭证)
- 配置上下文源(dbt、Looker、Metabase、Notion等)
- 运行初始导入以构建上下文
- 安装Agent集成(Codex、Claude Code、Cursor、OpenCode)
Project Structure
项目结构
my-analytics-project/
├── ktx.yaml # Main configuration (commit this)
├── semantic-layer/warehouse/ # Generated semantic sources (commit)
├── wiki/global/ # Shared business context (commit)
├── wiki/user/alice/ # User-scoped notes (commit)
├── raw-sources/warehouse/ # Ingest artifacts (commit)
└── .ktx/ # Local state and secrets (git-ignore)my-analytics-project/
├── ktx.yaml # 主配置文件(需提交到版本控制)
├── semantic-layer/warehouse/ # 生成的语义源文件(需提交)
├── wiki/global/ # 共享业务上下文(需提交)
├── wiki/user/alice/ # 用户专属笔记(需提交)
├── raw-sources/warehouse/ # 导入产物(需提交)
└── .ktx/ # 本地状态和密钥文件(需忽略git提交)Manual Configuration
手动配置
Create :
ktx.yamlyaml
version: 1
project:
name: my-analytics
llm:
provider: anthropic
model: claude-sonnet-4-6
embeddings:
provider: openai
model: text-embedding-3-small
databases:
warehouse:
type: postgres
host: localhost
port: 5432
database: analytics
schema: public
# Credentials from env vars or .ktx/secrets.yaml
context_sources:
dbt_main:
type: dbt
project_dir: ../dbt-project
profiles_dir: ~/.dbt
target: prodStore secrets in (never commit):
.ktx/secrets.yamlyaml
databases:
warehouse:
user: ${POSTGRES_USER}
password: ${POSTGRES_PASSWORD}
llm:
anthropic_api_key: ${ANTHROPIC_API_KEY}
embeddings:
openai_api_key: ${OPENAI_API_KEY}创建 文件:
ktx.yamlyaml
version: 1
project:
name: my-analytics
llm:
provider: anthropic
model: claude-sonnet-4-6
embeddings:
provider: openai
model: text-embedding-3-small
databases:
warehouse:
type: postgres
host: localhost
port: 5432
database: analytics
schema: public
# 凭证来自环境变量或 .ktx/secrets.yaml
context_sources:
dbt_main:
type: dbt
project_dir: ../dbt-project
profiles_dir: ~/.dbt
target: prod将密钥存储在 (绝不要提交到版本控制):
.ktx/secrets.yamlyaml
databases:
warehouse:
user: ${POSTGRES_USER}
password: ${POSTGRES_PASSWORD}
llm:
anthropic_api_key: ${ANTHROPIC_API_KEY}
embeddings:
openai_api_key: ${OPENAI_API_KEY}Core Commands
核心命令
Check Project Status
检查项目状态
bash
ktx statusExample output:
ktx project: /home/user/analytics
Project ready: yes
LLM ready: yes (claude-sonnet-4-6)
Embeddings ready: yes (text-embedding-3-small)
Databases configured: yes (warehouse)
Context sources configured: yes (dbt_main, looker_main)
ktx context built: yes
Agent integration ready: yes (codex:project)bash
ktx status示例输出:
ktx project: /home/user/analytics
Project ready: yes
LLM ready: yes (claude-sonnet-4-6)
Embeddings ready: yes (text-embedding-3-small)
Databases configured: yes (warehouse)
Context sources configured: yes (dbt_main, looker_main)
ktx context built: yes
Agent integration ready: yes (codex:project)Build Context (Ingestion)
构建上下文(数据导入)
Ingest from all configured sources:
bash
ktx ingestIngest specific connection:
bash
ktx ingest --connection warehouseIngest specific context source:
bash
ktx ingest --source dbt_mainForce re-ingestion:
bash
ktx ingest --force从所有已配置的源导入数据:
bash
ktx ingest从指定连接导入数据:
bash
ktx ingest --connection warehouse从指定上下文源导入数据:
bash
ktx ingest --source dbt_main强制重新导入:
bash
ktx ingest --forceSearch Semantic Layer
搜索语义层
bash
undefinedbash
undefinedSearch for metrics and dimensions
搜索指标和维度
ktx sl "revenue"
ktx sl "customer lifetime value"
ktx sl "revenue"
ktx sl "customer lifetime value"
JSON output for scripting
输出JSON格式用于脚本开发
ktx sl "churn rate" --json
undefinedktx sl "churn rate" --json
undefinedSearch Wiki
搜索知识库
bash
undefinedbash
undefinedSearch business knowledge
搜索业务知识
ktx wiki "refund policy"
ktx wiki "how to calculate mrr"
ktx wiki "refund policy"
ktx wiki "how to calculate mrr"
JSON output
输出JSON格式
ktx wiki "data retention" --json
undefinedktx wiki "data retention" --json
undefinedStart MCP Server
启动MCP服务器
bash
undefinedbash
undefinedAuto-detect project and start MCP server
自动检测项目并启动MCP服务器
ktx mcp start
ktx mcp start
Specify project directory
指定项目目录
ktx mcp start --project-dir /path/to/project
ktx mcp start --project-dir /path/to/project
Custom port
自定义端口
ktx mcp start --port 3000
undefinedktx mcp start --port 3000
undefinedLLM Configuration
LLM配置
Anthropic API
Anthropic API
yaml
llm:
provider: anthropic
model: claude-sonnet-4-6
api_key: ${ANTHROPIC_API_KEY} # In .ktx/secrets.yamlyaml
llm:
provider: anthropic
model: claude-sonnet-4-6
api_key: ${ANTHROPIC_API_KEY} # 配置在 .ktx/secrets.yamlGoogle Vertex AI
Google Vertex AI
yaml
llm:
provider: vertex
model: claude-sonnet-4-6@20250514
project_id: my-gcp-project
region: us-central1
# Uses Application Default Credentialsyaml
llm:
provider: vertex
model: claude-sonnet-4-6@20250514
project_id: my-gcp-project
region: us-central1
# 使用应用默认凭证AI Gateway
AI Gateway
yaml
llm:
provider: ai-gateway
model: claude-sonnet-4-6
base_url: https://gateway.example.com/v1
api_key: ${GATEWAY_API_KEY}yaml
llm:
provider: ai-gateway
model: claude-sonnet-4-6
base_url: https://gateway.example.com/v1
api_key: ${GATEWAY_API_KEY}Claude Code Session (Local)
Claude Code Session(本地)
yaml
llm:
provider: claude-code
# Uses local Claude Code session via SDKyaml
llm:
provider: claude-code
# 通过SDK使用本地Claude Code会话Database Connections
数据库连接
PostgreSQL
PostgreSQL
yaml
databases:
analytics:
type: postgres
host: localhost
port: 5432
database: analytics
schema: public
user: ${PG_USER}
password: ${PG_PASSWORD}yaml
databases:
analytics:
type: postgres
host: localhost
port: 5432
database: analytics
schema: public
user: ${PG_USER}
password: ${PG_PASSWORD}Snowflake
Snowflake
yaml
databases:
warehouse:
type: snowflake
account: xy12345.us-east-1
warehouse: COMPUTE_WH
database: ANALYTICS
schema: PUBLIC
user: ${SNOWFLAKE_USER}
password: ${SNOWFLAKE_PASSWORD}
role: ANALYSTyaml
databases:
warehouse:
type: snowflake
account: xy12345.us-east-1
warehouse: COMPUTE_WH
database: ANALYTICS
schema: PUBLIC
user: ${SNOWFLAKE_USER}
password: ${SNOWFLAKE_PASSWORD}
role: ANALYSTBigQuery
BigQuery
yaml
databases:
bigquery:
type: bigquery
project_id: my-gcp-project
dataset: analytics
credentials_path: ${GOOGLE_APPLICATION_CREDENTIALS}yaml
databases:
bigquery:
type: bigquery
project_id: my-gcp-project
dataset: analytics
credentials_path: ${GOOGLE_APPLICATION_CREDENTIALS}ClickHouse
ClickHouse
yaml
databases:
events:
type: clickhouse
host: localhost
port: 8123
database: analytics
user: ${CLICKHOUSE_USER}
password: ${CLICKHOUSE_PASSWORD}yaml
databases:
events:
type: clickhouse
host: localhost
port: 8123
database: analytics
user: ${CLICKHOUSE_USER}
password: ${CLICKHOUSE_PASSWORD}Context Sources
上下文源
dbt
dbt
yaml
context_sources:
dbt_prod:
type: dbt
project_dir: ../dbt-project
profiles_dir: ~/.dbt
target: prod
include_tests: true
include_docs: trueyaml
context_sources:
dbt_prod:
type: dbt
project_dir: ../dbt-project
profiles_dir: ~/.dbt
target: prod
include_tests: true
include_docs: trueLooker
Looker
yaml
context_sources:
looker_main:
type: looker
base_url: https://company.looker.com
client_id: ${LOOKER_CLIENT_ID}
client_secret: ${LOOKER_CLIENT_SECRET}
project: analyticsyaml
context_sources:
looker_main:
type: looker
base_url: https://company.looker.com
client_id: ${LOOKER_CLIENT_ID}
client_secret: ${LOOKER_CLIENT_SECRET}
project: analyticsMetabase
Metabase
yaml
context_sources:
metabase:
type: metabase
base_url: https://metabase.company.com
username: ${METABASE_USER}
password: ${METABASE_PASSWORD}
database_id: 1yaml
context_sources:
metabase:
type: metabase
base_url: https://metabase.company.com
username: ${METABASE_USER}
password: ${METABASE_PASSWORD}
database_id: 1Notion
Notion
yaml
context_sources:
notion_wiki:
type: notion
api_key: ${NOTION_API_KEY}
database_ids:
- abc123def456
- ghi789jkl012yaml
context_sources:
notion_wiki:
type: notion
api_key: ${NOTION_API_KEY}
database_ids:
- abc123def456
- ghi789jkl012Semantic Layer Usage
语义层使用
Search for Metrics
搜索指标
typescript
// Via CLI
$ ktx sl "monthly recurring revenue"Output structure:
json
{
"results": [
{
"type": "metric",
"name": "mrr",
"description": "Monthly Recurring Revenue",
"calculation": "SUM(subscription_amount)",
"dimensions": ["customer_id", "plan_type"],
"filters": ["status = 'active'"],
"source": "dbt_prod",
"score": 0.95
}
]
}typescript
// 通过CLI执行
$ ktx sl "monthly recurring revenue"输出结构:
json
{
"results": [
{
"type": "metric",
"name": "mrr",
"description": "Monthly Recurring Revenue",
"calculation": "SUM(subscription_amount)",
"dimensions": ["customer_id", "plan_type"],
"filters": ["status = 'active'"],
"source": "dbt_prod",
"score": 0.95
}
]
}Query Planning (Python API)
查询规划(Python API)
python
undefinedpython
undefinedInstall ktx semantic layer engine
安装ktx语义层引擎
uv add ktx-sl
uv add ktx-sl
from ktx_sl import SemanticLayer, Query
from ktx_sl import SemanticLayer, Query
Load semantic layer from ktx project
从ktx项目加载语义层
sl = SemanticLayer.from_project("/path/to/ktx-project")
sl = SemanticLayer.from_project("/path/to/ktx-project")
Define query
定义查询
query = Query(
metrics=["mrr", "customer_count"],
dimensions=["plan_type", "region"],
filters=[{"field": "signup_date", "op": ">=", "value": "2024-01-01"}],
order_by=[{"field": "mrr", "direction": "desc"}]
)
query = Query(
metrics=["mrr", "customer_count"],
dimensions=["plan_type", "region"],
filters=[{"field": "signup_date", "op": ">=", "value": "2024-01-01"}],
order_by=[{"field": "mrr", "direction": "desc"}]
)
Generate SQL
生成SQL语句
sql = sl.plan(query)
print(sql)
undefinedsql = sl.plan(query)
print(sql)
undefinedSemantic Layer YAML Format
语义层YAML格式
Example :
semantic-layer/warehouse/subscriptions.yamlyaml
version: 1
source: warehouse
table: public.subscriptions
metrics:
- name: mrr
description: Monthly Recurring Revenue
type: sum
sql: amount
filters:
- status = 'active'
- billing_period = 'monthly'
- name: customer_count
description: Distinct active customers
type: count_distinct
sql: customer_id
filters:
- status = 'active'
dimensions:
- name: customer_id
type: string
sql: customer_id
- name: plan_type
type: string
sql: plan_type
- name: region
type: string
sql: customer_region
joins:
- to: customers
type: left
on: subscriptions.customer_id = customers.id示例 :
semantic-layer/warehouse/subscriptions.yamlyaml
version: 1
source: warehouse
table: public.subscriptions
metrics:
- name: mrr
description: Monthly Recurring Revenue
type: sum
sql: amount
filters:
- status = 'active'
- billing_period = 'monthly'
- name: customer_count
description: Distinct active customers
type: count_distinct
sql: customer_id
filters:
- status = 'active'
dimensions:
- name: customer_id
type: string
sql: customer_id
- name: plan_type
type: string
sql: plan_type
- name: region
type: string
sql: customer_region
joins:
- to: customers
type: left
on: subscriptions.customer_id = customers.idWiki Management
知识库管理
Add Wiki Content Manually
手动添加知识库内容
Create :
wiki/global/refund-policy.mdmarkdown
---
title: Refund Policy
tags: [finance, customer-success]
---创建 :
wiki/global/refund-policy.mdmarkdown
---
title: Refund Policy
tags: [finance, customer-success]
---Refund Policy
退款政策
Customers can request refunds within 30 days of purchase.
客户可在购买后30天内申请退款。
Calculation Rules
计算规则
- Full refund: within 7 days
- Prorated refund: 8-30 days
- No refund: after 30 days
- 全额退款:购买后7天内
- 按比例退款:购买后8-30天
- 不予退款:购买30天后
Impact on Metrics
对指标的影响
Refunds reduce but not .
net_revenuegross_revenueundefined退款会减少 但不会影响 。
net_revenuegross_revenueundefinedSearch Wiki from CLI
从CLI搜索知识库
bash
ktx wiki "refund policy"
ktx wiki "how to calculate churn"bash
ktx wiki "refund policy"
ktx wiki "how to calculate churn"Wiki Structure
知识库结构
wiki/
├── global/ # Shared team knowledge
│ ├── metrics-glossary.md
│ ├── refund-policy.md
│ └── data-quality-rules.md
└── user/alice/ # User-scoped notes
└── analysis-notes.mdwiki/
├── global/ # 团队共享知识
│ ├── metrics-glossary.md
│ ├── refund-policy.md
│ └── data-quality-rules.md
└── user/alice/ # 用户专属笔记
└── analysis-notes.mdMCP Integration
MCP集成
Start MCP Server
启动MCP服务器
bash
undefinedbash
undefinedAuto-detect project
自动检测项目
ktx mcp start
ktx mcp start
Explicit project path
指定项目路径
ktx mcp start --project-dir /path/to/analytics
undefinedktx mcp start --project-dir /path/to/analytics
undefinedClaude Desktop Configuration
Claude Desktop配置
Add to :
~/Library/Application Support/Claude/claude_desktop_config.jsonjson
{
"mcpServers": {
"ktx-analytics": {
"command": "ktx",
"args": [
"mcp",
"start",
"--project-dir",
"/path/to/analytics"
]
}
}
}添加配置到 :
~/Library/Application Support/Claude/claude_desktop_config.jsonjson
{
"mcpServers": {
"ktx-analytics": {
"command": "ktx",
"args": [
"mcp",
"start",
"--project-dir",
"/path/to/analytics"
]
}
}
}Codex Integration
Codex集成
bash
undefinedbash
undefinedInstall ktx skill in Codex project
在Codex项目中安装ktx skill
npx skills add Kaelio/ktx --skill ktx
npx skills add Kaelio/ktx --skill ktx
Or via natural language
或通过自然语言指令
"Run npx skills add Kaelio/ktx --skill ktx and configure ktx for this project"
"Run npx skills add Kaelio/ktx --skill ktx and configure ktx for this project"
undefinedundefinedAvailable MCP Tools
可用的MCP工具
Once MCP server is running, agents can use:
- — Search metrics and dimensions
ktx_search_semantic_layer - — Search business knowledge
ktx_search_wiki - — Get detailed metric spec
ktx_get_metric_definition - — List configured warehouses
ktx_list_connections - — Validate SQL against semantic layer
ktx_validate_query
MCP服务器启动后,Agent可使用以下工具:
- — 搜索指标和维度
ktx_search_semantic_layer - — 搜索业务知识
ktx_search_wiki - — 获取详细指标规范
ktx_get_metric_definition - — 列出已配置的数据仓库
ktx_list_connections - — 根据语义层验证SQL语句
ktx_validate_query
Common Patterns
常用模式
Full Setup Workflow
完整设置流程
bash
undefinedbash
undefined1. Install ktx
1. 安装ktx
npm install -g @kaelio/ktx
npm install -g @kaelio/ktx
2. Create project
2. 创建项目
cd /path/to/analytics
ktx setup
cd /path/to/analytics
ktx setup
3. Configure via interactive prompts:
3. 通过交互式提示完成配置:
- LLM: Anthropic API with claude-sonnet-4-6
- LLM:使用claude-sonnet-4-6的Anthropic API
- Embeddings: OpenAI text-embedding-3-small
- 嵌入:OpenAI text-embedding-3-small
- Database: PostgreSQL connection
- 数据库:PostgreSQL连接
- Sources: dbt project
- 源:dbt项目
4. Verify setup
4. 验证设置
ktx status
ktx status
5. Build context
5. 构建上下文
ktx ingest
ktx ingest
6. Test search
6. 测试搜索功能
ktx sl "revenue"
ktx wiki "metric definitions"
ktx sl "revenue"
ktx wiki "metric definitions"
7. Start MCP for agents
7. 启动MCP供Agent使用
ktx mcp start
undefinedktx mcp start
undefinedIncremental Updates
增量更新
bash
undefinedbash
undefinedAdd new dbt models
添加新的dbt模型
cd ../dbt-project
dbt run --models new_model
cd ../dbt-project
dbt run --models new_model
Ingest updates into ktx
将更新导入ktx
cd ../analytics
ktx ingest --source dbt_prod
cd ../analytics
ktx ingest --source dbt_prod
Verify new metrics are available
验证新指标是否可用
ktx sl "new_metric"
undefinedktx sl "new_metric"
undefinedMulti-Warehouse Setup
多数据仓库设置
yaml
databases:
production:
type: postgres
host: prod.db.company.com
database: analytics
schema: public
user: ${PROD_DB_USER}
password: ${PROD_DB_PASSWORD}
events:
type: clickhouse
host: events.company.com
database: analytics
user: ${CLICKHOUSE_USER}
password: ${CLICKHOUSE_PASSWORD}
context_sources:
dbt_prod:
type: dbt
project_dir: ../dbt-project
target: prod
connection: production
events_raw:
type: raw
connection: eventsyaml
databases:
production:
type: postgres
host: prod.db.company.com
database: analytics
schema: public
user: ${PROD_DB_USER}
password: ${PROD_DB_PASSWORD}
events:
type: clickhouse
host: events.company.com
database: analytics
user: ${CLICKHOUSE_USER}
password: ${CLICKHOUSE_PASSWORD}
context_sources:
dbt_prod:
type: dbt
project_dir: ../dbt-project
target: prod
connection: production
events_raw:
type: raw
connection: eventsCI/CD Integration
CI/CD集成
yaml
undefinedyaml
undefined.github/workflows/ktx-ingest.yml
.github/workflows/ktx-ingest.yml
name: Update ktx Context
on:
push:
branches: [main]
paths:
- 'dbt-project/'
- 'wiki/'
jobs:
ingest:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: 20
- name: Install ktx
run: npm install -g @kaelio/ktx
- name: Configure secrets
run: |
mkdir -p .ktx
echo "databases:" > .ktx/secrets.yaml
echo " warehouse:" >> .ktx/secrets.yaml
echo " user: ${{ secrets.DB_USER }}" >> .ktx/secrets.yaml
echo " password: ${{ secrets.DB_PASSWORD }}" >> .ktx/secrets.yaml
- name: Ingest context
run: ktx ingest --project-dir .
env:
ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
- name: Commit updates
run: |
git config user.name "ktx-bot"
git config user.email "bot@company.com"
git add semantic-layer/ raw-sources/
git commit -m "Update ktx context" || exit 0
git pushundefinedname: Update ktx Context
on:
push:
branches: [main]
paths:
- 'dbt-project/'
- 'wiki/'
jobs:
ingest:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: 20
- name: Install ktx
run: npm install -g @kaelio/ktx
- name: Configure secrets
run: |
mkdir -p .ktx
echo "databases:" > .ktx/secrets.yaml
echo " warehouse:" >> .ktx/secrets.yaml
echo " user: ${{ secrets.DB_USER }}" >> .ktx/secrets.yaml
echo " password: ${{ secrets.DB_PASSWORD }}" >> .ktx/secrets.yaml
- name: Ingest context
run: ktx ingest --project-dir .
env:
ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
- name: Commit updates
run: |
git config user.name "ktx-bot"
git config user.email "bot@company.com"
git add semantic-layer/ raw-sources/
git commit -m "Update ktx context" || exit 0
git pushundefinedTroubleshooting
故障排除
"No LLM provider configured"
"No LLM provider configured"
Solution: Run to configure LLM provider, or manually add to :
ktx setupktx.yamlyaml
llm:
provider: anthropic
model: claude-sonnet-4-6Add API key to :
.ktx/secrets.yamlyaml
llm:
anthropic_api_key: ${ANTHROPIC_API_KEY}解决方案: 运行 配置LLM提供商,或手动添加到 :
ktx setupktx.yamlyaml
llm:
provider: anthropic
model: claude-sonnet-4-6将API密钥添加到 :
.ktx/secrets.yamlyaml
llm:
anthropic_api_key: ${ANTHROPIC_API_KEY}"Database connection failed"
"Database connection failed"
Check:
- Credentials in are correct
.ktx/secrets.yaml - Network access to database host
- Read-only user has SELECT permissions
Test connection:
bash
undefined检查项:
- 中的凭证是否正确
.ktx/secrets.yaml - 是否能够网络访问数据库主机
- 只读用户是否拥有SELECT权限
测试连接:
bash
undefinedPostgreSQL example
PostgreSQL示例
psql -h localhost -U ${POSTGRES_USER} -d analytics -c "SELECT 1"
undefinedpsql -h localhost -U ${POSTGRES_USER} -d analytics -c "SELECT 1"
undefined"ktx context built: no"
"ktx context built: no"
Solution: Run ingestion:
bash
ktx ingest --forceCheck for errors in output. Common issues:
- Missing database permissions
- Invalid dbt profiles.yml
- Unreachable context source URLs
解决方案: 运行导入命令:
bash
ktx ingest --force检查输出中的错误信息。常见问题:
- 缺少数据库权限
- dbt profiles.yml无效
- 上下文源URL无法访问
"No results found" when searching
"No results found" 搜索无结果
Check:
- Ingestion completed successfully:
ktx status - Semantic layer YAML files exist:
ls semantic-layer/ - Wiki content exists:
ls wiki/global/
Re-run ingestion:
bash
ktx ingest --source dbt_prod --force检查项:
- 导入是否成功完成:
ktx status - 语义层YAML文件是否存在:
ls semantic-layer/ - 知识库内容是否存在:
ls wiki/global/
重新运行导入:
bash
ktx ingest --source dbt_prod --forceMCP Server Not Starting
MCP服务器无法启动
Check:
- No other process on port 3000:
lsof -i :3000 - Project directory exists and has
ktx.yaml - Agent client configuration points to correct project path
Specify explicit project:
bash
ktx mcp start --project-dir /full/path/to/project检查项:
- 端口3000是否被其他进程占用:
lsof -i :3000 - 项目目录是否存在且包含
ktx.yaml - Agent客户端配置是否指向正确的项目路径
指定明确的项目路径:
bash
ktx mcp start --project-dir /full/path/to/projectdbt Ingestion Fails
dbt导入失败
Common issues:
- profiles.yml not found: Set in
profiles_dirktx.yaml - Target not found: Verify name matches profiles.yml
target - Compilation errors: Run manually to verify dbt project
dbt compile
Debug:
bash
undefined常见问题:
- profiles.yml未找到: 在 中设置
ktx.yamlprofiles_dir - Target未找到: 验证 名称与profiles.yml中的一致
target - 编译错误: 手动运行 验证dbt项目
dbt compile
调试:
bash
undefinedVerify dbt configuration
验证dbt配置
cd ../dbt-project
dbt debug --profiles-dir ~/.dbt --target prod
cd ../dbt-project
dbt debug --profiles-dir ~/.dbt --target prod
Check ktx can read dbt
检查ktx是否能读取dbt
ktx ingest --source dbt_prod --log-level debug
undefinedktx ingest --source dbt_prod --log-level debug
undefinedEmbedding Search Returns Poor Results
嵌入搜索结果质量不佳
Tune embedding quality:
- Use higher-quality embedding model:
yaml
embeddings:
provider: voyage
model: voyage-3- Add more descriptive metadata to semantic layer YAML:
yaml
metrics:
- name: mrr
description: Monthly Recurring Revenue from active subscriptions
tags: [revenue, subscription, saas]
business_context: |
MRR is the primary metric for tracking subscription health.
Includes only active, monthly-billed customers.优化嵌入质量:
- 使用更高质量的嵌入模型:
yaml
embeddings:
provider: voyage
model: voyage-3- 为语义层YAML添加更多描述性元数据:
yaml
metrics:
- name: mrr
description: Monthly Recurring Revenue from active subscriptions
tags: [revenue, subscription, saas]
business_context: |
MRR是跟踪订阅业务健康状况的核心指标。
仅包含活跃的月付客户。Large Projects Performance
大型项目性能优化
For projects with 100+ tables:
- Limit ingestion scope:
yaml
context_sources:
dbt_prod:
type: dbt
include_patterns:
- "marts/**"
- "metrics/**"
exclude_patterns:
- "staging/**"
- "tests/**"- Increase sampling limits:
yaml
databases:
warehouse:
type: postgres
sampling:
max_tables: 500
rows_per_table: 10000- Run incremental ingestion:
bash
undefined对于包含100+表的项目:
- 限制导入范围:
yaml
context_sources:
dbt_prod:
type: dbt
include_patterns:
- "marts/**"
- "metrics/**"
exclude_patterns:
- "staging/**"
- "tests/**"- 提高采样限制:
yaml
databases:
warehouse:
type: postgres
sampling:
max_tables: 500
rows_per_table: 10000- 运行增量导入:
bash
undefinedOnly ingest changed sources
仅导入有变化的源
ktx ingest --incremental
undefinedktx ingest --incremental
undefinedAdvanced Configuration
高级配置
Custom Sampling Strategy
自定义采样策略
yaml
databases:
warehouse:
type: postgres
sampling:
enabled: true
max_tables: 200
rows_per_table: 5000
include_patterns:
- "public.*"
- "analytics.*"
exclude_patterns:
- "*_tmp"
- "*_staging"yaml
databases:
warehouse:
type: postgres
sampling:
enabled: true
max_tables: 200
rows_per_table: 5000
include_patterns:
- "public.*"
- "analytics.*"
exclude_patterns:
- "*_tmp"
- "*_staging"Join Detection Tuning
关联检测调优
yaml
project:
join_detection:
enabled: true
min_confidence: 0.85
max_cardinality_ratio: 0.1
require_foreign_keys: falseyaml
project:
join_detection:
enabled: true
min_confidence: 0.85
max_cardinality_ratio: 0.1
require_foreign_keys: falseWiki Ingestion from Git
从Git导入知识库
yaml
context_sources:
team_wiki:
type: git
repo_url: https://github.com/company/analytics-wiki
branch: main
path: docs/
auth_token: ${GITHUB_TOKEN}yaml
context_sources:
team_wiki:
type: git
repo_url: https://github.com/company/analytics-wiki
branch: main
path: docs/
auth_token: ${GITHUB_TOKEN}Project Resolution
项目解析规则
ktx resolves project directory in this order:
- flag
--project-dir - environment variable
KTX_PROJECT_DIR - Nearest in current or parent directories
ktx.yaml - Current working directory
For scripting, always use explicit path:
bash
ktx ingest --project-dir /path/to/projectktx按以下顺序解析项目目录:
- 参数
--project-dir - 环境变量
KTX_PROJECT_DIR - 当前目录或父目录中最近的 文件
ktx.yaml - 当前工作目录
脚本开发时,请始终使用明确路径:
bash
ktx ingest --project-dir /path/to/projectEnvironment Variables
环境变量
| Variable | Purpose |
|---|---|
| Default project directory |
| Anthropic API key |
| OpenAI API key |
| PostgreSQL username |
| PostgreSQL password |
| Snowflake username |
| Snowflake password |
| GCP service account key path |
| GitHub personal access token |
Store in (git-ignored) or .
.env.ktx/secrets.yaml| 变量名 | 用途 |
|---|---|
| 默认项目目录 |
| Anthropic API密钥 |
| OpenAI API密钥 |
| PostgreSQL用户名 |
| PostgreSQL密码 |
| Snowflake用户名 |
| Snowflake密码 |
| GCP服务账号密钥路径 |
| GitHub个人访问令牌 |
可存储在 (需忽略git提交)或 中。
.env.ktx/secrets.yamlCLI Reference Summary
CLI参考摘要
| Command | Purpose |
|---|---|
| Create/resume project, configure providers |
| Check project readiness |
| Build context from configured sources |
| Search semantic layer |
| Search wiki knowledge |
| Start MCP server for agents |
| Show current configuration |
| Update configuration |
| List database connections |
| List context sources |
Full reference: https://docs.kaelio.com/ktx/docs/cli-reference/ktx
| 命令 | 用途 |
|---|---|
| 创建/恢复项目,配置提供商 |
| 检查项目就绪状态 |
| 从已配置源构建上下文 |
| 搜索语义层 |
| 搜索知识库 |
| 启动MCP服务器供Agent使用 |
| 显示当前配置 |
| 更新配置 |
| 列出数据库连接 |
| 列出上下文源 |