ktx-ai-data-agents

Compare original and translation side by side

🇺🇸

Original

English
🇨🇳

Translation

Chinese

ktx AI Data Agents Skill

ktx AI Data Agents Skill

Skill by ara.so — AI Agent Skills collection.
ara.so提供的Skill —— AI Agent Skill合集。

Overview

概述

ktx is an executable context layer that teaches AI agents how to query data warehouses accurately. It automatically builds and maintains:
  • Semantic layer with approved metric definitions, join graphs, and fan/chasm trap resolution
  • Wiki knowledge from Notion, dbt docs, BI tools, and team documentation
  • Warehouse metadata including table schemas, joinable columns, and usage patterns
  • MCP integration for Claude Code, Codex, Cursor, and other AI agents
Unlike general-purpose agents that reinvent SQL logic on every query, ktx provides agents with canonical definitions and business context through a searchable interface.
ktx 是一个可执行的上下文层,用于教授AI代理如何准确查询数据仓库。它会自动构建并维护:
  • 语义层:包含已核准的指标定义、关联图以及扇出/裂隙陷阱解决方案
  • 知识库:来自Notion、dbt docs、BI工具和团队文档的内容
  • 仓库元数据:包括表结构、可关联列和使用模式
  • MCP集成:支持Claude Code、Codex、Cursor等AI代理
不同于每次查询都重新构建SQL逻辑的通用代理,ktx通过可搜索界面为代理提供标准定义和业务上下文。

Installation

安装

bash
undefined
bash
undefined

Install globally

Install globally

npm install -g @kaelio/ktx
npm install -g @kaelio/ktx

Or use npx

Or use npx

npx @kaelio/ktx setup

**Requirements:**
- Node.js 18+
- Access to a SQL warehouse (PostgreSQL, Snowflake, BigQuery, ClickHouse, MySQL, SQL Server, or SQLite)
- LLM API key (Anthropic, Google Vertex AI) or Claude Pro/Max subscription
npx @kaelio/ktx setup

**要求:**
- Node.js 18+
- 可访问SQL数据仓库(PostgreSQL、Snowflake、BigQuery、ClickHouse、MySQL、SQL Server或SQLite)
- LLM API密钥(Anthropic、Google Vertex AI)或Claude Pro/Max订阅

Quick Start

快速开始

Initial Setup

初始设置

bash
undefined
bash
undefined

Create or resume a ktx project in current directory

在当前目录创建或恢复ktx项目

ktx setup
ktx setup

Check project status

检查项目状态

ktx status

The `ktx setup` wizard will:
1. Create `ktx.yaml` configuration
2. Configure LLM and embedding providers
3. Set up database connections
4. Configure context sources (dbt, Looker, Metabase, Notion)
5. Build initial context
6. Install agent integration (MCP)
ktx status

`ktx setup`向导会完成以下操作:
1. 创建`ktx.yaml`配置文件
2. 配置LLM和嵌入模型提供商
3. 设置数据库连接
4. 配置上下文来源(dbt、Looker、Metabase、Notion)
5. 构建初始上下文
6. 安装代理集成(MCP)

Project Structure

项目结构

my-project/
├── ktx.yaml                         # Project configuration
├── semantic-layer/<connection-id>/  # YAML metric/dimension definitions
├── wiki/global/                     # Shared business knowledge
├── wiki/user/<user-id>/             # User-scoped notes
├── raw-sources/<connection-id>/     # Ingest artifacts and reports
└── .ktx/                            # Local state (git-ignored)
Commit:
ktx.yaml
,
semantic-layer/
,
wiki/

Ignore:
.ktx/
my-project/
├── ktx.yaml                         # 项目配置
├── semantic-layer/<connection-id>/  # YAML指标/维度定义
├── wiki/global/                     # 共享业务知识
├── wiki/user/<user-id>/             # 用户专属笔记
├── raw-sources/<connection-id>/     # 摄入的工件和报告
└── .ktx/                            # 本地状态(Git忽略)
需提交版本控制:
ktx.yaml
,
semantic-layer/
,
wiki/

需忽略:
.ktx/

Configuration

配置

ktx.yaml Structure

ktx.yaml结构

yaml
version: 1.0
name: my-analytics-project
llm:
  provider: anthropic
  model: claude-sonnet-4-6
embeddings:
  provider: openai
  model: text-embedding-3-small
connections:
  - id: warehouse
    type: postgres
    host: ${DATABASE_HOST}
    port: 5432
    database: analytics
    user: ${DATABASE_USER}
    password: ${DATABASE_PASSWORD}
    ssl: true
context_sources:
  - id: dbt_main
    type: dbt
    connection_id: warehouse
    manifest_path: ./target/manifest.json
    catalog_path: ./target/catalog.json
  - id: notion_docs
    type: notion
    token: ${NOTION_TOKEN}
    page_ids:
      - 3fa85f64-5717-4562-b3fc-2c963f66afa6
yaml
version: 1.0
name: my-analytics-project
llm:
  provider: anthropic
  model: claude-sonnet-4-6
embeddings:
  provider: openai
  model: text-embedding-3-small
connections:
  - id: warehouse
    type: postgres
    host: ${DATABASE_HOST}
    port: 5432
    database: analytics
    user: ${DATABASE_USER}
    password: ${DATABASE_PASSWORD}
    ssl: true
context_sources:
  - id: dbt_main
    type: dbt
    connection_id: warehouse
    manifest_path: ./target/manifest.json
    catalog_path: ./target/catalog.json
  - id: notion_docs
    type: notion
    token: ${NOTION_TOKEN}
    page_ids:
      - 3fa85f64-5717-4562-b3fc-2c963f66afa6

Environment Variables

环境变量

bash
undefined
bash
undefined

LLM providers

LLM提供商

export ANTHROPIC_API_KEY=your-key-here export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
export ANTHROPIC_API_KEY=your-key-here export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json

Embeddings

嵌入模型

export OPENAI_API_KEY=your-key-here
export OPENAI_API_KEY=your-key-here

Database credentials

数据库凭证

export DATABASE_HOST=warehouse.example.com export DATABASE_USER=readonly_user export DATABASE_PASSWORD=secure-password
export DATABASE_HOST=warehouse.example.com export DATABASE_USER=readonly_user export DATABASE_PASSWORD=secure-password

Context sources

上下文来源

export NOTION_TOKEN=secret_notion_token export LOOKER_API_TOKEN=looker-token
undefined
export NOTION_TOKEN=secret_notion_token export LOOKER_API_TOKEN=looker-token
undefined

Supported Databases

支持的数据库

PostgreSQL:
yaml
connections:
  - id: postgres_warehouse
    type: postgres
    host: ${PG_HOST}
    port: 5432
    database: analytics
    user: ${PG_USER}
    password: ${PG_PASSWORD}
    ssl: true
Snowflake:
yaml
connections:
  - id: snowflake_warehouse
    type: snowflake
    account: ${SNOWFLAKE_ACCOUNT}
    user: ${SNOWFLAKE_USER}
    password: ${SNOWFLAKE_PASSWORD}
    warehouse: COMPUTE_WH
    database: ANALYTICS
    schema: PUBLIC
BigQuery:
yaml
connections:
  - id: bigquery_warehouse
    type: bigquery
    project_id: ${GCP_PROJECT_ID}
    dataset: analytics
    credentials_path: ${GOOGLE_APPLICATION_CREDENTIALS}
PostgreSQL:
yaml
connections:
  - id: postgres_warehouse
    type: postgres
    host: ${PG_HOST}
    port: 5432
    database: analytics
    user: ${PG_USER}
    password: ${PG_PASSWORD}
    ssl: true
Snowflake:
yaml
connections:
  - id: snowflake_warehouse
    type: snowflake
    account: ${SNOWFLAKE_ACCOUNT}
    user: ${SNOWFLAKE_USER}
    password: ${SNOWFLAKE_PASSWORD}
    warehouse: COMPUTE_WH
    database: ANALYTICS
    schema: PUBLIC
BigQuery:
yaml
connections:
  - id: bigquery_warehouse
    type: bigquery
    project_id: ${GCP_PROJECT_ID}
    dataset: analytics
    credentials_path: ${GOOGLE_APPLICATION_CREDENTIALS}

Core Commands

核心命令

Context Management

上下文管理

bash
undefined
bash
undefined

Build context from all configured sources

从所有配置的来源构建上下文

ktx ingest
ktx ingest

Build context for specific connection

为特定连接构建上下文

ktx ingest --connection warehouse
ktx ingest --connection warehouse

Force rebuild ignoring cache

强制重建,忽略缓存

ktx ingest --force
ktx ingest --force

Dry run to preview changes

试运行以预览变更

ktx ingest --dry-run
undefined
ktx ingest --dry-run
undefined

Search and Query

搜索与查询

bash
undefined
bash
undefined

Search semantic layer (metrics, dimensions)

搜索语义层(指标、维度)

ktx sl "revenue" ktx sl "customer churn rate"
ktx sl "revenue" ktx sl "customer churn rate"

Search wiki knowledge

搜索知识库

ktx wiki "refund policy" ktx wiki "data retention rules"
ktx wiki "refund policy" ktx wiki "data retention rules"

Get detailed entity information

获取实体详细信息

ktx describe metric monthly_recurring_revenue ktx describe dimension customer_segment
undefined
ktx describe metric monthly_recurring_revenue ktx describe dimension customer_segment
undefined

MCP Server

MCP服务器

bash
undefined
bash
undefined

Start MCP server for agent clients

启动供代理客户端使用的MCP服务器

ktx mcp start
ktx mcp start

Start with specific project directory

使用指定项目目录启动

ktx mcp start --project-dir /path/to/project
ktx mcp start --project-dir /path/to/project

Check MCP server status

检查MCP服务器状态

ktx mcp status
ktx mcp status

Stop MCP server

停止MCP服务器

ktx mcp stop
undefined
ktx mcp stop
undefined

Project Management

项目管理

bash
undefined
bash
undefined

Validate configuration

验证配置

ktx validate
ktx validate

Show project status

显示项目状态

ktx status
ktx status

List all configured connections

列出所有已配置的连接

ktx connections list
ktx connections list

Test connection

测试连接

ktx connections test warehouse
undefined
ktx connections test warehouse
undefined

Semantic Layer Usage

语义层使用

Defining Metrics

定义指标

Create
semantic-layer/warehouse/metrics.yaml
:
yaml
metrics:
  - name: monthly_recurring_revenue
    label: Monthly Recurring Revenue
    description: Sum of all active subscription values normalized to monthly
    type: sum
    sql: |
      CASE 
        WHEN billing_period = 'monthly' THEN amount
        WHEN billing_period = 'annual' THEN amount / 12
      END
    table: subscriptions
    filters:
      - column: status
        operator: equals
        value: 'active'
    dimensions:
      - customer_segment
      - plan_type
    timestamp_column: created_at
    
  - name: customer_count
    label: Active Customers
    description: Count of distinct active customer IDs
    type: count_distinct
    sql: customer_id
    table: subscriptions
    filters:
      - column: status
        operator: equals
        value: 'active'
创建
semantic-layer/warehouse/metrics.yaml
yaml
metrics:
  - name: monthly_recurring_revenue
    label: Monthly Recurring Revenue
    description: Sum of all active subscription values normalized to monthly
    type: sum
    sql: |
      CASE 
        WHEN billing_period = 'monthly' THEN amount
        WHEN billing_period = 'annual' THEN amount / 12
      END
    table: subscriptions
    filters:
      - column: status
        operator: equals
        value: 'active'
    dimensions:
      - customer_segment
      - plan_type
    timestamp_column: created_at
    
  - name: customer_count
    label: Active Customers
    description: Count of distinct active customer IDs
    type: count_distinct
    sql: customer_id
    table: subscriptions
    filters:
      - column: status
        operator: equals
        value: 'active'

Defining Dimensions

定义维度

Create
semantic-layer/warehouse/dimensions.yaml
:
yaml
dimensions:
  - name: customer_segment
    label: Customer Segment
    description: Business vs. enterprise customer classification
    type: categorical
    sql: |
      CASE 
        WHEN annual_revenue > 100000 THEN 'Enterprise'
        WHEN annual_revenue > 10000 THEN 'Business'
        ELSE 'Startup'
      END
    table: customers
    
  - name: signup_date
    label: Signup Date
    description: Date customer first signed up
    type: time
    sql: DATE(created_at)
    table: customers
    granularities:
      - day
      - week
      - month
      - quarter
      - year
创建
semantic-layer/warehouse/dimensions.yaml
yaml
dimensions:
  - name: customer_segment
    label: Customer Segment
    description: Business vs. enterprise customer classification
    type: categorical
    sql: |
      CASE 
        WHEN annual_revenue > 100000 THEN 'Enterprise'
        WHEN annual_revenue > 10000 THEN 'Business'
        ELSE 'Startup'
      END
    table: customers
    
  - name: signup_date
    label: Signup Date
    description: Date customer first signed up
    type: time
    sql: DATE(created_at)
    table: customers
    granularities:
      - day
      - week
      - month
      - quarter
      - year

Join Configuration

关联配置

Define table relationships in
semantic-layer/warehouse/joins.yaml
:
yaml
joins:
  - left_table: subscriptions
    right_table: customers
    type: left
    conditions:
      - left_column: customer_id
        right_column: id
    
  - left_table: subscriptions
    right_table: plans
    type: left
    conditions:
      - left_column: plan_id
        right_column: id
semantic-layer/warehouse/joins.yaml
中定义表关系:
yaml
joins:
  - left_table: subscriptions
    right_table: customers
    type: left
    conditions:
      - left_column: customer_id
        right_column: id
    
  - left_table: subscriptions
    right_table: plans
    type: left
    conditions:
      - left_column: plan_id
        right_column: id

Wiki Management

知识库管理

Creating Wiki Pages

创建知识库页面

bash
undefined
bash
undefined

Create global wiki page

创建全局知识库页面

cat > wiki/global/refund-policy.md << 'EOF'
cat > wiki/global/refund-policy.md << 'EOF'

Refund Policy

Refund Policy

Overview

Overview

Customers can request refunds within 30 days of purchase.
Customers can request refunds within 30 days of purchase.

Rules

Rules

  • Full refund: < 7 days
  • Prorated refund: 7-30 days
  • No refund: > 30 days
  • Full refund: < 7 days
  • Prorated refund: 7-30 days
  • No refund: > 30 days

Database Impact

Database Impact

Refunds update
transactions.status
to 'refunded' and create negative entries in
revenue_events
. EOF
Refunds update
transactions.status
to 'refunded' and create negative entries in
revenue_events
. EOF

Create user-scoped note

创建用户专属笔记

mkdir -p wiki/user/$(whoami) cat > wiki/user/$(whoami)/analysis-notes.md << 'EOF'
mkdir -p wiki/user/$(whoami) cat > wiki/user/$(whoami)/analysis-notes.md << 'EOF'

Analysis Notes

Analysis Notes

2025-05 Revenue Analysis

2025-05 Revenue Analysis

Found discrepancy in EMEA revenue - missing Stripe events. Tracked in JIRA-1234. EOF
undefined
Found discrepancy in EMEA revenue - missing Stripe events. Tracked in JIRA-1234. EOF
undefined

Ingesting from Notion

从Notion摄入内容

yaml
context_sources:
  - id: product_docs
    type: notion
    token: ${NOTION_TOKEN}
    page_ids:
      - 3fa85f64-5717-4562-b3fc-2c963f66afa6  # Product Roadmap
      - 7c9e6679-7425-40de-944b-e07fc1f90ae7  # Data Dictionary
    recursive: true  # Include child pages
Run
ktx ingest
to sync Notion content into
wiki/global/
.
yaml
context_sources:
  - id: product_docs
    type: notion
    token: ${NOTION_TOKEN}
    page_ids:
      - 3fa85f64-5717-4562-b3fc-2c963f66afa6  # Product Roadmap
      - 7c9e6679-7425-40de-944b-e07fc1f90ae7  # Data Dictionary
    recursive: true  # 包含子页面
运行
ktx ingest
将Notion内容同步到
wiki/global/

Agent Integration (MCP)

代理集成(MCP)

Starting MCP Server

启动MCP服务器

bash
undefined
bash
undefined

In your ktx project directory

在你的ktx项目目录中

ktx mcp start
ktx mcp start

Or specify project location

或者指定项目位置

ktx mcp start --project-dir ~/my-analytics
undefined
ktx mcp start --project-dir ~/my-analytics
undefined

Claude Desktop Configuration

Claude Desktop配置

Add to
~/Library/Application Support/Claude/claude_desktop_config.json
:
json
{
  "mcpServers": {
    "ktx": {
      "command": "ktx",
      "args": ["mcp", "start", "--project-dir", "/absolute/path/to/project"]
    }
  }
}
Restart Claude Desktop to load the MCP server.
将以下内容添加到
~/Library/Application Support/Claude/claude_desktop_config.json
json
{
  "mcpServers": {
    "ktx": {
      "command": "ktx",
      "args": ["mcp", "start", "--project-dir", "/absolute/path/to/project"]
    }
  }
}
重启Claude Desktop以加载MCP服务器。

Using ktx from Claude

在Claude中使用ktx

Once configured, you can prompt Claude:
What was our MRR last month by customer segment?
Claude will use ktx MCP tools to:
  1. Search semantic layer for
    monthly_recurring_revenue
    metric
  2. Find relevant dimensions (
    customer_segment
    )
  3. Retrieve approved SQL definitions
  4. Execute query using canonical metric logic
配置完成后,你可以向Claude发送提示:
What was our MRR last month by customer segment?
Claude会使用ktx MCP工具完成以下操作:
  1. 在语义层中搜索
    monthly_recurring_revenue
    指标
  2. 找到相关维度(
    customer_segment
  3. 获取已核准的SQL定义
  4. 使用标准指标逻辑执行查询

Available MCP Tools

可用的MCP工具

  • ktx_search_semantic_layer
    - Search metrics, dimensions, tables
  • ktx_search_wiki
    - Search business knowledge and documentation
  • ktx_describe_entity
    - Get detailed entity information
  • ktx_list_connections
    - List available database connections
  • ktx_get_context_summary
    - Get project context overview
  • ktx_search_semantic_layer
    - 搜索指标、维度、表
  • ktx_search_wiki
    - 搜索业务知识和文档
  • ktx_describe_entity
    - 获取实体详细信息
  • ktx_list_connections
    - 列出可用数据库连接
  • ktx_get_context_summary
    - 获取项目上下文概述

Common Patterns

常见模式

Initial Project Setup Workflow

初始项目设置流程

bash
undefined
bash
undefined

1. Navigate to analytics project

1. 导航到分析项目

cd ~/projects/analytics
cd ~/projects/analytics

2. Run setup wizard

2. 运行设置向导

ktx setup
ktx setup

Select: Anthropic Claude, OpenAI embeddings, configure Postgres connection

选择:Anthropic Claude、OpenAI嵌入模型、配置Postgres连接

3. Add dbt context source

3. 添加dbt上下文来源

Edit ktx.yaml to add dbt manifest/catalog paths

编辑ktx.yaml添加dbt manifest/catalog路径

4. Build context

4. 构建上下文

ktx ingest
ktx ingest

5. Verify

5. 验证

ktx status ktx sl "revenue"
ktx status ktx sl "revenue"

6. Start MCP for agents

6. 为代理启动MCP

ktx mcp start
undefined
ktx mcp start
undefined

Incremental Context Updates

增量上下文更新

bash
undefined
bash
undefined

After dbt run or schema changes

在dbt运行或 schema变更后

dbt run dbt docs generate ktx ingest --connection warehouse
dbt run dbt docs generate ktx ingest --connection warehouse

After updating wiki pages

在更新知识库页面后

ktx ingest --source notion_docs
ktx ingest --source notion_docs

Check for conflicts or issues

检查冲突或问题

ktx validate
undefined
ktx validate
undefined

Searching Before Agent Queries

在代理查询前进行搜索

bash
undefined
bash
undefined

Find available metrics

查找可用指标

ktx sl "churn"
ktx sl "churn"

Output:

输出:

Metrics:

Metrics:

- customer_churn_rate (Monthly customer churn percentage)

- customer_churn_rate (Monthly customer churn percentage)

- mrr_churn (Monthly recurring revenue lost to churn)

- mrr_churn (Monthly recurring revenue lost to churn)

Dimensions:

Dimensions:

- churn_reason (Categorical reason for cancellation)

- churn_reason (Categorical reason for cancellation)

Get metric details

获取指标详情

ktx describe metric customer_churn_rate
ktx describe metric customer_churn_rate

Now prompt agent with context:

现在向代理发送带上下文的提示:

"Calculate customer_churn_rate for Q1 2025 by churn_reason"

"Calculate customer_churn_rate for Q1 2025 by churn_reason"

undefined
undefined

Multi-Warehouse Setup

多仓库设置

yaml
connections:
  - id: production
    type: snowflake
    account: ${SNOWFLAKE_PROD_ACCOUNT}
    # ... prod credentials
    
  - id: staging
    type: snowflake
    account: ${SNOWFLAKE_STAGING_ACCOUNT}
    # ... staging credentials

context_sources:
  - id: dbt_prod
    type: dbt
    connection_id: production
    manifest_path: ./prod/target/manifest.json
    
  - id: dbt_staging
    type: dbt
    connection_id: staging
    manifest_path: ./staging/target/manifest.json
bash
undefined
yaml
connections:
  - id: production
    type: snowflake
    account: ${SNOWFLAKE_PROD_ACCOUNT}
    # ... 生产环境凭证
    
  - id: staging
    type: snowflake
    account: ${SNOWFLAKE_STAGING_ACCOUNT}
    # ... 预发布环境凭证

context_sources:
  - id: dbt_prod
    type: dbt
    connection_id: production
    manifest_path: ./prod/target/manifest.json
    
  - id: dbt_staging
    type: dbt
    connection_id: staging
    manifest_path: ./staging/target/manifest.json
bash
undefined

Ingest specific warehouse

摄入特定仓库

ktx ingest --connection production
ktx ingest --connection production

Search scoped to connection

按连接范围搜索

ktx sl "revenue" --connection production
undefined
ktx sl "revenue" --connection production
undefined

Troubleshooting

故障排除

MCP Server Not Starting

MCP服务器无法启动

Check project directory:
bash
ktx status
检查项目目录:
bash
ktx status

If shows "ktx mcp start --project-dir ...", copy and run that command

如果显示 "ktx mcp start --project-dir ...",复制并运行该命令


**Verify ktx.yaml exists:**
```bash
ls ktx.yaml

**验证ktx.yaml是否存在:**
```bash
ls ktx.yaml

If missing, run: ktx setup

如果缺失,运行:ktx setup


**Check Claude Desktop config:**
```bash
cat ~/Library/Application\ Support/Claude/claude_desktop_config.json

**检查Claude Desktop配置:**
```bash
cat ~/Library/Application\ Support/Claude/claude_desktop_config.json

Ensure path is absolute, not relative

确保路径是绝对路径,而非相对路径

undefined
undefined

Database Connection Issues

数据库连接问题

bash
undefined
bash
undefined

Test connection directly

直接测试连接

ktx connections test warehouse
ktx connections test warehouse

Common fixes:

常见修复方法:

- Verify environment variables are set

- 验证环境变量已设置

- Check firewall/VPN for warehouse access

- 检查防火墙/VPN是否允许访问仓库

- Ensure user has SELECT permissions

- 确保用户拥有SELECT权限

- For Snowflake, verify warehouse is running

- 对于Snowflake,验证仓库正在运行

undefined
undefined

Context Ingestion Failures

上下文摄入失败

bash
undefined
bash
undefined

Enable verbose logging

启用详细日志

ktx ingest --verbose
ktx ingest --verbose

Check specific source

检查特定来源

ktx ingest --source dbt_main --verbose
ktx ingest --source dbt_main --verbose

Validate configuration

验证配置

ktx validate
ktx validate

Common issues:

常见问题:

- dbt manifest/catalog paths incorrect

- dbt manifest/catalog路径不正确

- Missing environment variables

- 缺失环境变量

- LLM API rate limits (retry with backoff)

- LLM API速率限制(重试并使用退避策略)

undefined
undefined

Semantic Layer Errors

语义层错误

Undefined metric:
bash
undefined
未定义指标:
bash
undefined

List all metrics

列出所有指标

ktx sl "*" --type metric
ktx sl "*" --type metric

Ensure YAML is valid

确保YAML格式有效

cat semantic-layer/warehouse/metrics.yaml ktx validate

**Join graph errors:**
```bash
cat semantic-layer/warehouse/metrics.yaml ktx validate

**关联图错误:**
```bash

Check join definitions

检查关联定义

cat semantic-layer/warehouse/joins.yaml
cat semantic-layer/warehouse/joins.yaml

Common issues:

常见问题:

- Missing join between tables

- 表之间缺失关联

- Ambiguous join paths (fan trap)

- 关联路径不明确(扇出陷阱)

- Incorrect column names

- 列名不正确

undefined
undefined

Agent Not Finding Context

代理无法找到上下文

Restart MCP server:
bash
ktx mcp stop
ktx mcp start
Rebuild context:
bash
ktx ingest --force
Check search results:
bash
ktx sl "your search term"
ktx wiki "your search term"
重启MCP服务器:
bash
ktx mcp stop
ktx mcp start
重建上下文:
bash
ktx ingest --force
检查搜索结果:
bash
ktx sl "your search term"
ktx wiki "your search term"

If empty, context may not have been ingested

如果为空,可能上下文未被摄入

undefined
undefined

TypeScript API Usage

TypeScript API使用

For programmatic usage in Node.js:
typescript
import { KtxProject } from '@kaelio/ktx';

// Load project
const project = await KtxProject.load('/path/to/project');

// Search semantic layer
const metrics = await project.searchSemanticLayer('revenue', {
  type: 'metric',
  limit: 10
});

// Search wiki
const wikiPages = await project.searchWiki('refund policy', {
  scope: 'global',
  limit: 5
});

// Get entity details
const metric = await project.describeEntity('metric', 'monthly_recurring_revenue');

// Execute ingestion
await project.ingest({
  connectionId: 'warehouse',
  force: false
});
在Node.js中以编程方式使用:
typescript
import { KtxProject } from '@kaelio/ktx';

// 加载项目
const project = await KtxProject.load('/path/to/project');

// 搜索语义层
const metrics = await project.searchSemanticLayer('revenue', {
  type: 'metric',
  limit: 10
});

// 搜索知识库
const wikiPages = await project.searchWiki('refund policy', {
  scope: 'global',
  limit: 5
});

// 获取实体详情
const metric = await project.describeEntity('metric', 'monthly_recurring_revenue');

// 执行摄入
await project.ingest({
  connectionId: 'warehouse',
  force: false
});

Best Practices

最佳实践

  1. Version control semantic layer: Commit
    semantic-layer/
    and
    wiki/global/
    to track metric definitions
  2. Use environment variables: Never hardcode credentials in
    ktx.yaml
  3. Regular ingestion: Run
    ktx ingest
    after dbt runs or schema changes
  4. Descriptive names: Use clear metric/dimension names that match business language
  5. Document assumptions: Add descriptions to all semantic layer entities
  6. Test connections: Use
    ktx connections test
    before ingestion
  7. User wiki for ephemeral notes: Keep temporary analysis in
    wiki/user/
  8. Review ingest logs: Check for warnings about duplicate or conflicting definitions
  1. 版本控制语义层:
    semantic-layer/
    wiki/global/
    提交到版本控制,以跟踪指标定义
  2. 使用环境变量: 切勿在
    ktx.yaml
    中硬编码凭证
  3. 定期摄入: 在dbt运行或schema变更后运行
    ktx ingest
  4. 使用描述性名称: 使用与业务语言匹配的清晰指标/维度名称
  5. 记录假设: 为所有语义层实体添加描述
  6. 测试连接: 在摄入前使用
    ktx connections test
  7. 用户知识库用于临时笔记: 将临时分析内容保存在
    wiki/user/
  8. 查看摄入日志: 检查关于重复或冲突定义的警告

Advanced Configuration

高级配置

Custom LLM Configuration

自定义LLM配置

yaml
llm:
  provider: vertex
  project_id: ${GCP_PROJECT_ID}
  location: us-central1
  model: claude-3-5-sonnet@20241022
  max_tokens: 8192
  temperature: 0.0
yaml
llm:
  provider: vertex
  project_id: ${GCP_PROJECT_ID}
  location: us-central1
  model: claude-3-5-sonnet@20241022
  max_tokens: 8192
  temperature: 0.0

Embedding Configuration

嵌入模型配置

yaml
embeddings:
  provider: openai
  model: text-embedding-3-large
  dimensions: 1536
  batch_size: 100
yaml
embeddings:
  provider: openai
  model: text-embedding-3-large
  dimensions: 1536
  batch_size: 100

Ingest Scheduling

摄入调度

bash
undefined
bash
undefined

Cron example: Daily at 2 AM

Cron示例:每天凌晨2点

0 2 * * * cd /path/to/project && ktx ingest --connection warehouse >> /var/log/ktx.log 2>&1
undefined
0 2 * * * cd /path/to/project && ktx ingest --connection warehouse >> /var/log/ktx.log 2>&1
undefined

Custom Python Queries

自定义Python查询

ktx includes a Python semantic layer query planner:
python
from ktx_sl import SemanticLayer
ktx包含一个Python语义层查询规划器:
python
from ktx_sl import SemanticLayer

Load semantic layer

加载语义层

sl = SemanticLayer.from_project("/path/to/project")
sl = SemanticLayer.from_project("/path/to/project")

Build query

构建查询

query = sl.query( metrics=["monthly_recurring_revenue"], dimensions=["customer_segment", "signup_month"], filters=[ {"dimension": "signup_date", "operator": ">=", "value": "2025-01-01"} ], order_by=[{"metric": "monthly_recurring_revenue", "desc": True}] )
query = sl.query( metrics=["monthly_recurring_revenue"], dimensions=["customer_segment", "signup_month"], filters=[ {"dimension": "signup_date", "operator": ">=", "value": "2025-01-01"} ], order_by=[{"metric": "monthly_recurring_revenue", "desc": True}] )

Get SQL

获取SQL

sql = query.to_sql() print(sql)
undefined
sql = query.to_sql() print(sql)
undefined

Resources

资源