ktx-ai-data-agents
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
Chinesektx AI Data Agents Skill
ktx AI Data Agents Skill
Overview
概述
ktx is an executable context layer that teaches AI agents how to query data warehouses accurately. It automatically builds and maintains:
- Semantic layer with approved metric definitions, join graphs, and fan/chasm trap resolution
- Wiki knowledge from Notion, dbt docs, BI tools, and team documentation
- Warehouse metadata including table schemas, joinable columns, and usage patterns
- MCP integration for Claude Code, Codex, Cursor, and other AI agents
Unlike general-purpose agents that reinvent SQL logic on every query, ktx provides agents with canonical definitions and business context through a searchable interface.
ktx 是一个可执行的上下文层,用于教授AI代理如何准确查询数据仓库。它会自动构建并维护:
- 语义层:包含已核准的指标定义、关联图以及扇出/裂隙陷阱解决方案
- 知识库:来自Notion、dbt docs、BI工具和团队文档的内容
- 仓库元数据:包括表结构、可关联列和使用模式
- MCP集成:支持Claude Code、Codex、Cursor等AI代理
不同于每次查询都重新构建SQL逻辑的通用代理,ktx通过可搜索界面为代理提供标准定义和业务上下文。
Installation
安装
bash
undefinedbash
undefinedInstall globally
Install globally
npm install -g @kaelio/ktx
npm install -g @kaelio/ktx
Or use npx
Or use npx
npx @kaelio/ktx setup
**Requirements:**
- Node.js 18+
- Access to a SQL warehouse (PostgreSQL, Snowflake, BigQuery, ClickHouse, MySQL, SQL Server, or SQLite)
- LLM API key (Anthropic, Google Vertex AI) or Claude Pro/Max subscriptionnpx @kaelio/ktx setup
**要求:**
- Node.js 18+
- 可访问SQL数据仓库(PostgreSQL、Snowflake、BigQuery、ClickHouse、MySQL、SQL Server或SQLite)
- LLM API密钥(Anthropic、Google Vertex AI)或Claude Pro/Max订阅Quick Start
快速开始
Initial Setup
初始设置
bash
undefinedbash
undefinedCreate or resume a ktx project in current directory
在当前目录创建或恢复ktx项目
ktx setup
ktx setup
Check project status
检查项目状态
ktx status
The `ktx setup` wizard will:
1. Create `ktx.yaml` configuration
2. Configure LLM and embedding providers
3. Set up database connections
4. Configure context sources (dbt, Looker, Metabase, Notion)
5. Build initial context
6. Install agent integration (MCP)ktx status
`ktx setup`向导会完成以下操作:
1. 创建`ktx.yaml`配置文件
2. 配置LLM和嵌入模型提供商
3. 设置数据库连接
4. 配置上下文来源(dbt、Looker、Metabase、Notion)
5. 构建初始上下文
6. 安装代理集成(MCP)Project Structure
项目结构
my-project/
├── ktx.yaml # Project configuration
├── semantic-layer/<connection-id>/ # YAML metric/dimension definitions
├── wiki/global/ # Shared business knowledge
├── wiki/user/<user-id>/ # User-scoped notes
├── raw-sources/<connection-id>/ # Ingest artifacts and reports
└── .ktx/ # Local state (git-ignored)Commit: , ,
Ignore:
ktx.yamlsemantic-layer/wiki/Ignore:
.ktx/my-project/
├── ktx.yaml # 项目配置
├── semantic-layer/<connection-id>/ # YAML指标/维度定义
├── wiki/global/ # 共享业务知识
├── wiki/user/<user-id>/ # 用户专属笔记
├── raw-sources/<connection-id>/ # 摄入的工件和报告
└── .ktx/ # 本地状态(Git忽略)需提交版本控制: , ,
需忽略:
ktx.yamlsemantic-layer/wiki/需忽略:
.ktx/Configuration
配置
ktx.yaml Structure
ktx.yaml结构
yaml
version: 1.0
name: my-analytics-project
llm:
provider: anthropic
model: claude-sonnet-4-6
embeddings:
provider: openai
model: text-embedding-3-small
connections:
- id: warehouse
type: postgres
host: ${DATABASE_HOST}
port: 5432
database: analytics
user: ${DATABASE_USER}
password: ${DATABASE_PASSWORD}
ssl: true
context_sources:
- id: dbt_main
type: dbt
connection_id: warehouse
manifest_path: ./target/manifest.json
catalog_path: ./target/catalog.json
- id: notion_docs
type: notion
token: ${NOTION_TOKEN}
page_ids:
- 3fa85f64-5717-4562-b3fc-2c963f66afa6yaml
version: 1.0
name: my-analytics-project
llm:
provider: anthropic
model: claude-sonnet-4-6
embeddings:
provider: openai
model: text-embedding-3-small
connections:
- id: warehouse
type: postgres
host: ${DATABASE_HOST}
port: 5432
database: analytics
user: ${DATABASE_USER}
password: ${DATABASE_PASSWORD}
ssl: true
context_sources:
- id: dbt_main
type: dbt
connection_id: warehouse
manifest_path: ./target/manifest.json
catalog_path: ./target/catalog.json
- id: notion_docs
type: notion
token: ${NOTION_TOKEN}
page_ids:
- 3fa85f64-5717-4562-b3fc-2c963f66afa6Environment Variables
环境变量
bash
undefinedbash
undefinedLLM providers
LLM提供商
export ANTHROPIC_API_KEY=your-key-here
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
export ANTHROPIC_API_KEY=your-key-here
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
Embeddings
嵌入模型
export OPENAI_API_KEY=your-key-here
export OPENAI_API_KEY=your-key-here
Database credentials
数据库凭证
export DATABASE_HOST=warehouse.example.com
export DATABASE_USER=readonly_user
export DATABASE_PASSWORD=secure-password
export DATABASE_HOST=warehouse.example.com
export DATABASE_USER=readonly_user
export DATABASE_PASSWORD=secure-password
Context sources
上下文来源
export NOTION_TOKEN=secret_notion_token
export LOOKER_API_TOKEN=looker-token
undefinedexport NOTION_TOKEN=secret_notion_token
export LOOKER_API_TOKEN=looker-token
undefinedSupported Databases
支持的数据库
PostgreSQL:
yaml
connections:
- id: postgres_warehouse
type: postgres
host: ${PG_HOST}
port: 5432
database: analytics
user: ${PG_USER}
password: ${PG_PASSWORD}
ssl: trueSnowflake:
yaml
connections:
- id: snowflake_warehouse
type: snowflake
account: ${SNOWFLAKE_ACCOUNT}
user: ${SNOWFLAKE_USER}
password: ${SNOWFLAKE_PASSWORD}
warehouse: COMPUTE_WH
database: ANALYTICS
schema: PUBLICBigQuery:
yaml
connections:
- id: bigquery_warehouse
type: bigquery
project_id: ${GCP_PROJECT_ID}
dataset: analytics
credentials_path: ${GOOGLE_APPLICATION_CREDENTIALS}PostgreSQL:
yaml
connections:
- id: postgres_warehouse
type: postgres
host: ${PG_HOST}
port: 5432
database: analytics
user: ${PG_USER}
password: ${PG_PASSWORD}
ssl: trueSnowflake:
yaml
connections:
- id: snowflake_warehouse
type: snowflake
account: ${SNOWFLAKE_ACCOUNT}
user: ${SNOWFLAKE_USER}
password: ${SNOWFLAKE_PASSWORD}
warehouse: COMPUTE_WH
database: ANALYTICS
schema: PUBLICBigQuery:
yaml
connections:
- id: bigquery_warehouse
type: bigquery
project_id: ${GCP_PROJECT_ID}
dataset: analytics
credentials_path: ${GOOGLE_APPLICATION_CREDENTIALS}Core Commands
核心命令
Context Management
上下文管理
bash
undefinedbash
undefinedBuild context from all configured sources
从所有配置的来源构建上下文
ktx ingest
ktx ingest
Build context for specific connection
为特定连接构建上下文
ktx ingest --connection warehouse
ktx ingest --connection warehouse
Force rebuild ignoring cache
强制重建,忽略缓存
ktx ingest --force
ktx ingest --force
Dry run to preview changes
试运行以预览变更
ktx ingest --dry-run
undefinedktx ingest --dry-run
undefinedSearch and Query
搜索与查询
bash
undefinedbash
undefinedSearch semantic layer (metrics, dimensions)
搜索语义层(指标、维度)
ktx sl "revenue"
ktx sl "customer churn rate"
ktx sl "revenue"
ktx sl "customer churn rate"
Search wiki knowledge
搜索知识库
ktx wiki "refund policy"
ktx wiki "data retention rules"
ktx wiki "refund policy"
ktx wiki "data retention rules"
Get detailed entity information
获取实体详细信息
ktx describe metric monthly_recurring_revenue
ktx describe dimension customer_segment
undefinedktx describe metric monthly_recurring_revenue
ktx describe dimension customer_segment
undefinedMCP Server
MCP服务器
bash
undefinedbash
undefinedStart MCP server for agent clients
启动供代理客户端使用的MCP服务器
ktx mcp start
ktx mcp start
Start with specific project directory
使用指定项目目录启动
ktx mcp start --project-dir /path/to/project
ktx mcp start --project-dir /path/to/project
Check MCP server status
检查MCP服务器状态
ktx mcp status
ktx mcp status
Stop MCP server
停止MCP服务器
ktx mcp stop
undefinedktx mcp stop
undefinedProject Management
项目管理
bash
undefinedbash
undefinedValidate configuration
验证配置
ktx validate
ktx validate
Show project status
显示项目状态
ktx status
ktx status
List all configured connections
列出所有已配置的连接
ktx connections list
ktx connections list
Test connection
测试连接
ktx connections test warehouse
undefinedktx connections test warehouse
undefinedSemantic Layer Usage
语义层使用
Defining Metrics
定义指标
Create :
semantic-layer/warehouse/metrics.yamlyaml
metrics:
- name: monthly_recurring_revenue
label: Monthly Recurring Revenue
description: Sum of all active subscription values normalized to monthly
type: sum
sql: |
CASE
WHEN billing_period = 'monthly' THEN amount
WHEN billing_period = 'annual' THEN amount / 12
END
table: subscriptions
filters:
- column: status
operator: equals
value: 'active'
dimensions:
- customer_segment
- plan_type
timestamp_column: created_at
- name: customer_count
label: Active Customers
description: Count of distinct active customer IDs
type: count_distinct
sql: customer_id
table: subscriptions
filters:
- column: status
operator: equals
value: 'active'创建:
semantic-layer/warehouse/metrics.yamlyaml
metrics:
- name: monthly_recurring_revenue
label: Monthly Recurring Revenue
description: Sum of all active subscription values normalized to monthly
type: sum
sql: |
CASE
WHEN billing_period = 'monthly' THEN amount
WHEN billing_period = 'annual' THEN amount / 12
END
table: subscriptions
filters:
- column: status
operator: equals
value: 'active'
dimensions:
- customer_segment
- plan_type
timestamp_column: created_at
- name: customer_count
label: Active Customers
description: Count of distinct active customer IDs
type: count_distinct
sql: customer_id
table: subscriptions
filters:
- column: status
operator: equals
value: 'active'Defining Dimensions
定义维度
Create :
semantic-layer/warehouse/dimensions.yamlyaml
dimensions:
- name: customer_segment
label: Customer Segment
description: Business vs. enterprise customer classification
type: categorical
sql: |
CASE
WHEN annual_revenue > 100000 THEN 'Enterprise'
WHEN annual_revenue > 10000 THEN 'Business'
ELSE 'Startup'
END
table: customers
- name: signup_date
label: Signup Date
description: Date customer first signed up
type: time
sql: DATE(created_at)
table: customers
granularities:
- day
- week
- month
- quarter
- year创建:
semantic-layer/warehouse/dimensions.yamlyaml
dimensions:
- name: customer_segment
label: Customer Segment
description: Business vs. enterprise customer classification
type: categorical
sql: |
CASE
WHEN annual_revenue > 100000 THEN 'Enterprise'
WHEN annual_revenue > 10000 THEN 'Business'
ELSE 'Startup'
END
table: customers
- name: signup_date
label: Signup Date
description: Date customer first signed up
type: time
sql: DATE(created_at)
table: customers
granularities:
- day
- week
- month
- quarter
- yearJoin Configuration
关联配置
Define table relationships in :
semantic-layer/warehouse/joins.yamlyaml
joins:
- left_table: subscriptions
right_table: customers
type: left
conditions:
- left_column: customer_id
right_column: id
- left_table: subscriptions
right_table: plans
type: left
conditions:
- left_column: plan_id
right_column: id在中定义表关系:
semantic-layer/warehouse/joins.yamlyaml
joins:
- left_table: subscriptions
right_table: customers
type: left
conditions:
- left_column: customer_id
right_column: id
- left_table: subscriptions
right_table: plans
type: left
conditions:
- left_column: plan_id
right_column: idWiki Management
知识库管理
Creating Wiki Pages
创建知识库页面
bash
undefinedbash
undefinedCreate global wiki page
创建全局知识库页面
cat > wiki/global/refund-policy.md << 'EOF'
cat > wiki/global/refund-policy.md << 'EOF'
Refund Policy
Refund Policy
Overview
Overview
Customers can request refunds within 30 days of purchase.
Customers can request refunds within 30 days of purchase.
Rules
Rules
- Full refund: < 7 days
- Prorated refund: 7-30 days
- No refund: > 30 days
- Full refund: < 7 days
- Prorated refund: 7-30 days
- No refund: > 30 days
Database Impact
Database Impact
Refunds update to 'refunded' and create
negative entries in .
EOF
transactions.statusrevenue_eventsRefunds update to 'refunded' and create
negative entries in .
EOF
transactions.statusrevenue_eventsCreate user-scoped note
创建用户专属笔记
mkdir -p wiki/user/$(whoami)
cat > wiki/user/$(whoami)/analysis-notes.md << 'EOF'
mkdir -p wiki/user/$(whoami)
cat > wiki/user/$(whoami)/analysis-notes.md << 'EOF'
Analysis Notes
Analysis Notes
2025-05 Revenue Analysis
2025-05 Revenue Analysis
Found discrepancy in EMEA revenue - missing Stripe events.
Tracked in JIRA-1234.
EOF
undefinedFound discrepancy in EMEA revenue - missing Stripe events.
Tracked in JIRA-1234.
EOF
undefinedIngesting from Notion
从Notion摄入内容
yaml
context_sources:
- id: product_docs
type: notion
token: ${NOTION_TOKEN}
page_ids:
- 3fa85f64-5717-4562-b3fc-2c963f66afa6 # Product Roadmap
- 7c9e6679-7425-40de-944b-e07fc1f90ae7 # Data Dictionary
recursive: true # Include child pagesRun to sync Notion content into .
ktx ingestwiki/global/yaml
context_sources:
- id: product_docs
type: notion
token: ${NOTION_TOKEN}
page_ids:
- 3fa85f64-5717-4562-b3fc-2c963f66afa6 # Product Roadmap
- 7c9e6679-7425-40de-944b-e07fc1f90ae7 # Data Dictionary
recursive: true # 包含子页面运行将Notion内容同步到。
ktx ingestwiki/global/Agent Integration (MCP)
代理集成(MCP)
Starting MCP Server
启动MCP服务器
bash
undefinedbash
undefinedIn your ktx project directory
在你的ktx项目目录中
ktx mcp start
ktx mcp start
Or specify project location
或者指定项目位置
ktx mcp start --project-dir ~/my-analytics
undefinedktx mcp start --project-dir ~/my-analytics
undefinedClaude Desktop Configuration
Claude Desktop配置
Add to :
~/Library/Application Support/Claude/claude_desktop_config.jsonjson
{
"mcpServers": {
"ktx": {
"command": "ktx",
"args": ["mcp", "start", "--project-dir", "/absolute/path/to/project"]
}
}
}Restart Claude Desktop to load the MCP server.
将以下内容添加到:
~/Library/Application Support/Claude/claude_desktop_config.jsonjson
{
"mcpServers": {
"ktx": {
"command": "ktx",
"args": ["mcp", "start", "--project-dir", "/absolute/path/to/project"]
}
}
}重启Claude Desktop以加载MCP服务器。
Using ktx from Claude
在Claude中使用ktx
Once configured, you can prompt Claude:
What was our MRR last month by customer segment?Claude will use ktx MCP tools to:
- Search semantic layer for metric
monthly_recurring_revenue - Find relevant dimensions ()
customer_segment - Retrieve approved SQL definitions
- Execute query using canonical metric logic
配置完成后,你可以向Claude发送提示:
What was our MRR last month by customer segment?Claude会使用ktx MCP工具完成以下操作:
- 在语义层中搜索指标
monthly_recurring_revenue - 找到相关维度()
customer_segment - 获取已核准的SQL定义
- 使用标准指标逻辑执行查询
Available MCP Tools
可用的MCP工具
- - Search metrics, dimensions, tables
ktx_search_semantic_layer - - Search business knowledge and documentation
ktx_search_wiki - - Get detailed entity information
ktx_describe_entity - - List available database connections
ktx_list_connections - - Get project context overview
ktx_get_context_summary
- - 搜索指标、维度、表
ktx_search_semantic_layer - - 搜索业务知识和文档
ktx_search_wiki - - 获取实体详细信息
ktx_describe_entity - - 列出可用数据库连接
ktx_list_connections - - 获取项目上下文概述
ktx_get_context_summary
Common Patterns
常见模式
Initial Project Setup Workflow
初始项目设置流程
bash
undefinedbash
undefined1. Navigate to analytics project
1. 导航到分析项目
cd ~/projects/analytics
cd ~/projects/analytics
2. Run setup wizard
2. 运行设置向导
ktx setup
ktx setup
Select: Anthropic Claude, OpenAI embeddings, configure Postgres connection
选择:Anthropic Claude、OpenAI嵌入模型、配置Postgres连接
3. Add dbt context source
3. 添加dbt上下文来源
Edit ktx.yaml to add dbt manifest/catalog paths
编辑ktx.yaml添加dbt manifest/catalog路径
4. Build context
4. 构建上下文
ktx ingest
ktx ingest
5. Verify
5. 验证
ktx status
ktx sl "revenue"
ktx status
ktx sl "revenue"
6. Start MCP for agents
6. 为代理启动MCP
ktx mcp start
undefinedktx mcp start
undefinedIncremental Context Updates
增量上下文更新
bash
undefinedbash
undefinedAfter dbt run or schema changes
在dbt运行或 schema变更后
dbt run
dbt docs generate
ktx ingest --connection warehouse
dbt run
dbt docs generate
ktx ingest --connection warehouse
After updating wiki pages
在更新知识库页面后
ktx ingest --source notion_docs
ktx ingest --source notion_docs
Check for conflicts or issues
检查冲突或问题
ktx validate
undefinedktx validate
undefinedSearching Before Agent Queries
在代理查询前进行搜索
bash
undefinedbash
undefinedFind available metrics
查找可用指标
ktx sl "churn"
ktx sl "churn"
Output:
输出:
Metrics:
Metrics:
- customer_churn_rate (Monthly customer churn percentage)
- customer_churn_rate (Monthly customer churn percentage)
- mrr_churn (Monthly recurring revenue lost to churn)
- mrr_churn (Monthly recurring revenue lost to churn)
Dimensions:
Dimensions:
- churn_reason (Categorical reason for cancellation)
- churn_reason (Categorical reason for cancellation)
Get metric details
获取指标详情
ktx describe metric customer_churn_rate
ktx describe metric customer_churn_rate
Now prompt agent with context:
现在向代理发送带上下文的提示:
"Calculate customer_churn_rate for Q1 2025 by churn_reason"
"Calculate customer_churn_rate for Q1 2025 by churn_reason"
undefinedundefinedMulti-Warehouse Setup
多仓库设置
yaml
connections:
- id: production
type: snowflake
account: ${SNOWFLAKE_PROD_ACCOUNT}
# ... prod credentials
- id: staging
type: snowflake
account: ${SNOWFLAKE_STAGING_ACCOUNT}
# ... staging credentials
context_sources:
- id: dbt_prod
type: dbt
connection_id: production
manifest_path: ./prod/target/manifest.json
- id: dbt_staging
type: dbt
connection_id: staging
manifest_path: ./staging/target/manifest.jsonbash
undefinedyaml
connections:
- id: production
type: snowflake
account: ${SNOWFLAKE_PROD_ACCOUNT}
# ... 生产环境凭证
- id: staging
type: snowflake
account: ${SNOWFLAKE_STAGING_ACCOUNT}
# ... 预发布环境凭证
context_sources:
- id: dbt_prod
type: dbt
connection_id: production
manifest_path: ./prod/target/manifest.json
- id: dbt_staging
type: dbt
connection_id: staging
manifest_path: ./staging/target/manifest.jsonbash
undefinedIngest specific warehouse
摄入特定仓库
ktx ingest --connection production
ktx ingest --connection production
Search scoped to connection
按连接范围搜索
ktx sl "revenue" --connection production
undefinedktx sl "revenue" --connection production
undefinedTroubleshooting
故障排除
MCP Server Not Starting
MCP服务器无法启动
Check project directory:
bash
ktx status检查项目目录:
bash
ktx statusIf shows "ktx mcp start --project-dir ...", copy and run that command
如果显示 "ktx mcp start --project-dir ...",复制并运行该命令
**Verify ktx.yaml exists:**
```bash
ls ktx.yaml
**验证ktx.yaml是否存在:**
```bash
ls ktx.yamlIf missing, run: ktx setup
如果缺失,运行:ktx setup
**Check Claude Desktop config:**
```bash
cat ~/Library/Application\ Support/Claude/claude_desktop_config.json
**检查Claude Desktop配置:**
```bash
cat ~/Library/Application\ Support/Claude/claude_desktop_config.jsonEnsure path is absolute, not relative
确保路径是绝对路径,而非相对路径
undefinedundefinedDatabase Connection Issues
数据库连接问题
bash
undefinedbash
undefinedTest connection directly
直接测试连接
ktx connections test warehouse
ktx connections test warehouse
Common fixes:
常见修复方法:
- Verify environment variables are set
- 验证环境变量已设置
- Check firewall/VPN for warehouse access
- 检查防火墙/VPN是否允许访问仓库
- Ensure user has SELECT permissions
- 确保用户拥有SELECT权限
- For Snowflake, verify warehouse is running
- 对于Snowflake,验证仓库正在运行
undefinedundefinedContext Ingestion Failures
上下文摄入失败
bash
undefinedbash
undefinedEnable verbose logging
启用详细日志
ktx ingest --verbose
ktx ingest --verbose
Check specific source
检查特定来源
ktx ingest --source dbt_main --verbose
ktx ingest --source dbt_main --verbose
Validate configuration
验证配置
ktx validate
ktx validate
Common issues:
常见问题:
- dbt manifest/catalog paths incorrect
- dbt manifest/catalog路径不正确
- Missing environment variables
- 缺失环境变量
- LLM API rate limits (retry with backoff)
- LLM API速率限制(重试并使用退避策略)
undefinedundefinedSemantic Layer Errors
语义层错误
Undefined metric:
bash
undefined未定义指标:
bash
undefinedList all metrics
列出所有指标
ktx sl "*" --type metric
ktx sl "*" --type metric
Ensure YAML is valid
确保YAML格式有效
cat semantic-layer/warehouse/metrics.yaml
ktx validate
**Join graph errors:**
```bashcat semantic-layer/warehouse/metrics.yaml
ktx validate
**关联图错误:**
```bashCheck join definitions
检查关联定义
cat semantic-layer/warehouse/joins.yaml
cat semantic-layer/warehouse/joins.yaml
Common issues:
常见问题:
- Missing join between tables
- 表之间缺失关联
- Ambiguous join paths (fan trap)
- 关联路径不明确(扇出陷阱)
- Incorrect column names
- 列名不正确
undefinedundefinedAgent Not Finding Context
代理无法找到上下文
Restart MCP server:
bash
ktx mcp stop
ktx mcp startRebuild context:
bash
ktx ingest --forceCheck search results:
bash
ktx sl "your search term"
ktx wiki "your search term"重启MCP服务器:
bash
ktx mcp stop
ktx mcp start重建上下文:
bash
ktx ingest --force检查搜索结果:
bash
ktx sl "your search term"
ktx wiki "your search term"If empty, context may not have been ingested
如果为空,可能上下文未被摄入
undefinedundefinedTypeScript API Usage
TypeScript API使用
For programmatic usage in Node.js:
typescript
import { KtxProject } from '@kaelio/ktx';
// Load project
const project = await KtxProject.load('/path/to/project');
// Search semantic layer
const metrics = await project.searchSemanticLayer('revenue', {
type: 'metric',
limit: 10
});
// Search wiki
const wikiPages = await project.searchWiki('refund policy', {
scope: 'global',
limit: 5
});
// Get entity details
const metric = await project.describeEntity('metric', 'monthly_recurring_revenue');
// Execute ingestion
await project.ingest({
connectionId: 'warehouse',
force: false
});在Node.js中以编程方式使用:
typescript
import { KtxProject } from '@kaelio/ktx';
// 加载项目
const project = await KtxProject.load('/path/to/project');
// 搜索语义层
const metrics = await project.searchSemanticLayer('revenue', {
type: 'metric',
limit: 10
});
// 搜索知识库
const wikiPages = await project.searchWiki('refund policy', {
scope: 'global',
limit: 5
});
// 获取实体详情
const metric = await project.describeEntity('metric', 'monthly_recurring_revenue');
// 执行摄入
await project.ingest({
connectionId: 'warehouse',
force: false
});Best Practices
最佳实践
- Version control semantic layer: Commit and
semantic-layer/to track metric definitionswiki/global/ - Use environment variables: Never hardcode credentials in
ktx.yaml - Regular ingestion: Run after dbt runs or schema changes
ktx ingest - Descriptive names: Use clear metric/dimension names that match business language
- Document assumptions: Add descriptions to all semantic layer entities
- Test connections: Use before ingestion
ktx connections test - User wiki for ephemeral notes: Keep temporary analysis in
wiki/user/ - Review ingest logs: Check for warnings about duplicate or conflicting definitions
- 版本控制语义层: 将和
semantic-layer/提交到版本控制,以跟踪指标定义wiki/global/ - 使用环境变量: 切勿在中硬编码凭证
ktx.yaml - 定期摄入: 在dbt运行或schema变更后运行
ktx ingest - 使用描述性名称: 使用与业务语言匹配的清晰指标/维度名称
- 记录假设: 为所有语义层实体添加描述
- 测试连接: 在摄入前使用
ktx connections test - 用户知识库用于临时笔记: 将临时分析内容保存在中
wiki/user/ - 查看摄入日志: 检查关于重复或冲突定义的警告
Advanced Configuration
高级配置
Custom LLM Configuration
自定义LLM配置
yaml
llm:
provider: vertex
project_id: ${GCP_PROJECT_ID}
location: us-central1
model: claude-3-5-sonnet@20241022
max_tokens: 8192
temperature: 0.0yaml
llm:
provider: vertex
project_id: ${GCP_PROJECT_ID}
location: us-central1
model: claude-3-5-sonnet@20241022
max_tokens: 8192
temperature: 0.0Embedding Configuration
嵌入模型配置
yaml
embeddings:
provider: openai
model: text-embedding-3-large
dimensions: 1536
batch_size: 100yaml
embeddings:
provider: openai
model: text-embedding-3-large
dimensions: 1536
batch_size: 100Ingest Scheduling
摄入调度
bash
undefinedbash
undefinedCron example: Daily at 2 AM
Cron示例:每天凌晨2点
0 2 * * * cd /path/to/project && ktx ingest --connection warehouse >> /var/log/ktx.log 2>&1
undefined0 2 * * * cd /path/to/project && ktx ingest --connection warehouse >> /var/log/ktx.log 2>&1
undefinedCustom Python Queries
自定义Python查询
ktx includes a Python semantic layer query planner:
python
from ktx_sl import SemanticLayerktx包含一个Python语义层查询规划器:
python
from ktx_sl import SemanticLayerLoad semantic layer
加载语义层
sl = SemanticLayer.from_project("/path/to/project")
sl = SemanticLayer.from_project("/path/to/project")
Build query
构建查询
query = sl.query(
metrics=["monthly_recurring_revenue"],
dimensions=["customer_segment", "signup_month"],
filters=[
{"dimension": "signup_date", "operator": ">=", "value": "2025-01-01"}
],
order_by=[{"metric": "monthly_recurring_revenue", "desc": True}]
)
query = sl.query(
metrics=["monthly_recurring_revenue"],
dimensions=["customer_segment", "signup_month"],
filters=[
{"dimension": "signup_date", "operator": ">=", "value": "2025-01-01"}
],
order_by=[{"metric": "monthly_recurring_revenue", "desc": True}]
)
Get SQL
获取SQL
sql = query.to_sql()
print(sql)
undefinedsql = query.to_sql()
print(sql)
undefinedResources
资源
- Documentation: https://docs.kaelio.com/ktx
- CLI Reference: https://docs.kaelio.com/ktx/docs/cli-reference/ktx
- Agent Setup: https://docs.kaelio.com/ktx/docs/ai-resources/agent-quickstart
- Slack Community: https://join.slack.com/t/ktxcommunity/shared_invite/zt-3y9b44m1x-LVyNNJD5nwaZHq4XS29LMQ
- GitHub Issues: https://github.com/Kaelio/ktx/issues
- 文档: https://docs.kaelio.com/ktx
- CLI参考: https://docs.kaelio.com/ktx/docs/cli-reference/ktx
- 代理设置: https://docs.kaelio.com/ktx/docs/ai-resources/agent-quickstart
- Slack社区: https://join.slack.com/t/ktxcommunity/shared_invite/zt-3y9b44m1x-LVyNNJD5nwaZHq4XS29LMQ
- GitHub Issues: https://github.com/Kaelio/ktx/issues