cdm-query
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
ChineseCDM Query - Unified Database Query Interface
CDM Query - 统一数据库查询接口
Intelligent query interface that automatically selects the optimal approach for querying the KBase CDM DuckDB database.
用于KBase CDM DuckDB数据库的智能查询接口,可自动选择最优查询方式。
When to Use This Skill
何时使用该Skill
Use this skill for any CDM database query needs:
- Natural language questions about the data
- Schema exploration and documentation
- Simple counts and statistics
- Complex joins and relationships
- Data analysis and exploration
This is the recommended default skill for all CDM database interactions.
该Skill适用于所有CDM数据库查询需求:
- 关于数据的自然语言提问
- Schema探索与文档查看
- 简单计数与统计
- 复杂关联查询与关系分析
- 数据分析与探索
这是所有CDM数据库交互的推荐默认Skill。
How It Works
工作原理
The skill automatically chooses the best approach:
该Skill会自动选择最优查询方式:
Fast Path (Simple Queries)
快速路径(简单查询)
For straightforward queries, uses direct SQL translation:
- Single table queries
- Basic counts and statistics
- Simple filters
- Quick lookups
对于简单查询,使用直接SQL翻译:
- 单表查询
- 基础计数与统计
- 简单过滤
- 快速查找
Schema-Aware Path (Complex Queries)
基于Schema的路径(复杂查询)
For sophisticated queries, uses LinkML schema context:
- Multi-table joins
- Relationship navigation
- Schema exploration
- Complex aggregations
- Queries requiring data model understanding
对于复杂查询,使用LinkML Schema上下文:
- 多表关联查询
- 关系导航
- Schema探索
- 复杂聚合
- 需要理解数据模型的查询
Auto-Detection
自动检测
The skill analyzes your question and automatically picks the optimal approach. You don't need to think about it!
该Skill会分析你的问题并自动选择最优方式,无需用户手动选择!
Prerequisites
前置条件
Option 1: With API Key (Full Features)
- Database loaded: or similar
cdm_store_sample.db - API Key: environment variable set
ANTHROPIC_API_KEY - Enables natural language query translation
Option 2: Without API Key (Claude Code Only)
- Database loaded: or similar
cdm_store_sample.db - Use this skill in Claude Code conversations
- Claude directly translates and executes queries
选项1:使用API密钥(完整功能)
- 已加载数据库:或类似数据库
cdm_store_sample.db - API密钥:已设置环境变量
ANTHROPIC_API_KEY - 支持自然语言查询翻译
选项2:不使用API密钥(仅Claude Code)
- 已加载数据库:或类似数据库
cdm_store_sample.db - 在Claude Code对话中使用该Skill
- Claude直接翻译并执行查询
Usage
使用方法
In Claude Code (Recommended)
在Claude Code中使用(推荐)
Simply invoke the skill:
/cdm-queryThen ask your question:
- "How many samples are there?"
- "Show me samples with their location information"
- "What fields does the Sample table have?"
- "Find assemblies with read data"
直接调用该Skill:
/cdm-query然后提出你的问题:
- "有多少个样本?"
- "展示样本及其位置信息"
- "Sample表有哪些字段?"
- "查找包含读取数据的组装记录"
Using Just Commands
使用命令行
bash
undefinedbash
undefinedSimple query (fast path)
简单查询(快速路径)
just cdm-query "How many samples are there?"
just cdm-query "有多少个样本?"
Complex query (schema-aware path)
复杂查询(基于Schema的路径)
just cdm-query "Find samples with their locations and read counts"
just cdm-query "查找样本及其位置和读取计数"
Schema exploration
Schema探索
just cdm-query-schema Sample
just cdm-query-info
just cdm-query-schema Sample
just cdm-query-info
JSON output
JSON格式输出
just cdm-query-json "Count samples by location"
undefinedjust cdm-query-json "按位置统计样本数量"
undefinedDirect Python
直接使用Python
bash
uv run python scripts/cdm_analysis/cdm_unified_query.py \
--db cdm_store_sample.db \
"your question"bash
uv run python scripts/cdm_analysis/cdm_unified_query.py \
--db cdm_store_sample.db \
"你的问题"Query Examples
查询示例
Simple Queries (Fast Path Used)
简单查询(使用快速路径)
Counts and Statistics:
- "How many samples are in the database?"
- "Count all reads"
- "Show total assemblies"
Single Table Filters:
- "Find samples with depth > 100"
- "List reads with read_count > 50000"
- "Show assemblies from 2021"
Basic Lookups:
- "Get sample Sample0000001"
- "Show location Location0000001"
- "Display the first 10 samples"
计数与统计:
- "数据库中有多少个样本?"
- "统计所有读取记录"
- "展示总组装数"
单表过滤:
- "查找深度大于100的样本"
- "列出读取计数大于50000的读取记录"
- "展示2021年的组装记录"
基础查找:
- "获取样本Sample0000001"
- "展示位置Location0000001"
- "显示前10个样本"
Complex Queries (Schema-Aware Path Used)
复杂查询(使用基于Schema的路径)
Joins and Relationships:
- "Find samples WITH their location information"
- "Show reads WITH their corresponding samples"
- "List assemblies WITH their read data"
Multi-Hop Relationships:
- "Trace the pipeline: sample → reads → assembly"
- "Find locations with their samples and read counts"
- "Show genomes with their source samples"
Aggregations Across Tables:
- "Count samples per location with location names"
- "Show average read count per sample material"
- "List assemblies grouped by source location"
关联查询与关系分析:
- "查找样本及其位置信息"
- "展示读取记录及其对应的样本"
- "列出组装记录及其读取数据"
多跳关系查询:
- "追踪流程:样本→读取记录→组装"
- "查找包含样本和读取计数的位置"
- "展示基因组及其来源样本"
跨表聚合:
- "按位置统计样本数量并显示位置名称"
- "展示每个样本材料的平均读取计数"
- "按来源位置分组列出组装记录"
Schema Exploration
Schema探索
Class Information:
- "What is the Sample class?"
- "Explain the Reads entity"
- "Show me all classes in the schema"
Relationship Discovery:
- "How is Sample related to Location?"
- "What tables link to Reads?"
- "Show the provenance chain for assemblies"
Query Suggestions:
- "What can I query?"
- "Give me interesting query ideas"
- "What relationships exist?"
类信息:
- "Sample类是什么?"
- "解释Reads实体"
- "展示Schema中的所有类"
关系发现:
- "Sample与Location有什么关系?"
- "哪些表与Reads关联?"
- "展示组装记录的溯源链"
查询建议:
- "我可以查询什么内容?"
- "给我一些有趣的查询思路"
- "存在哪些关系?"
Command Reference
命令参考
Query Commands
查询命令
bash
undefinedbash
undefinedBasic query (auto-detects complexity)
基础查询(自动检测复杂度)
just cdm-query "your question"
just cdm-query "你的问题"
Force fast path
强制使用快速路径
just cdm-query-fast "simple question"
just cdm-query-fast "简单问题"
Force schema-aware path
强制使用基于Schema的路径
just cdm-query-schema-aware "complex question"
just cdm-query-schema-aware "复杂问题"
JSON output
JSON格式输出
just cdm-query-json "your question"
just cdm-query-json "你的问题"
Verbose (show SQL and strategy)
详细模式(显示SQL和策略)
just cdm-query-verbose "your question"
undefinedjust cdm-query-verbose "你的问题"
undefinedSchema Exploration
Schema探索
bash
undefinedbash
undefinedShow schema overview
展示Schema概览
just cdm-query-info
just cdm-query-info
Explore specific class
探索特定类
just cdm-query-explore Sample
just cdm-query-explore Sample
Get query suggestions
获取查询建议
just cdm-query-suggest
just cdm-query-suggest
Show relationships
展示关系
just cdm-query-relationships
undefinedjust cdm-query-relationships
undefinedSmart Auto-Detection
智能自动检测
The skill analyzes your query for:
Indicators of Simple Query (Fast Path):
- Keywords: "count", "how many", "total"
- Single entity mentioned
- No relationship words ("with", "and their")
- Basic filters (>, <, =)
- No JOINs implied
Indicators of Complex Query (Schema-Aware Path):
- Multiple entities mentioned
- Relationship words: "with", "and their", "related to"
- Cross-table aggregations
- Provenance terms: "pipeline", "lineage", "trace"
- Schema questions: "what is", "explain", "how is...related"
Example Auto-Detection:
| Query | Path Used | Why |
|---|---|---|
| "How many samples?" | Fast | Single table count |
| "Find samples with depth > 100" | Fast | Single table filter |
| "Samples WITH locations" | Schema-aware | JOIN implied |
| "What is Sample class?" | Schema-aware | Schema exploration |
| "Trace sample → assembly" | Schema-aware | Multi-hop relationship |
该Skill会分析你的查询,识别以下特征:
简单查询(快速路径)的特征:
- 关键词:"count"、"多少"、"总数"
- 仅提及单个实体
- 无关系类词汇("及其"、"和")
- 基础过滤条件(>, <, =)
- 无隐含关联查询
复杂查询(基于Schema的路径)的特征:
- 提及多个实体
- 关系类词汇:"及其"、"和"、"关联"
- 跨表聚合
- 溯源类术语:"流程"、"谱系"、"追踪"
- Schema相关问题:"是什么"、"解释"、"如何关联"
自动检测示例:
| 查询语句 | 使用路径 | 原因 |
|---|---|---|
| "有多少个样本?" | 快速路径 | 单表计数 |
| "查找深度大于100的样本" | 快速路径 | 单表过滤 |
| "样本及其位置信息" | 基于Schema的路径 | 隐含关联查询 |
| "Sample类是什么?" | 基于Schema的路径 | Schema探索 |
| "追踪样本→组装的流程" | 基于Schema的路径 | 多跳关系查询 |
Implementation Strategy
实现策略
When you invoke this skill, Claude will:
-
Analyze the Question
- Identify entities mentioned
- Detect relationship keywords
- Check for schema questions
-
Choose Strategy
IF simple_query: Use fast SQL translation Execute with minimal context ELSE IF complex_query: Load LinkML schema Use schema-aware translation Generate intelligent JOINs ELSE IF schema_question: Use schema exploration tools Return documentation -
Execute Query
- Generate appropriate SQL
- Run against DuckDB
- Format results
-
Return Results
- Display data in clean format
- Show SQL if verbose mode
- Suggest related queries
当你调用该Skill时,Claude会:
-
分析问题
- 识别提及的实体
- 检测关系关键词
- 检查是否为Schema相关问题
-
选择策略
IF 简单查询: 使用快速SQL翻译 以最小上下文执行 ELSE IF 复杂查询: 加载LinkML Schema 使用基于Schema的翻译 生成智能关联查询 ELSE IF Schema问题: 使用Schema探索工具 返回文档内容 -
执行查询
- 生成对应的SQL语句
- 在DuckDB中运行
- 格式化结果
-
返回结果
- 以清晰格式展示数据
- 详细模式下显示SQL语句
- 建议相关查询
Error Handling
错误处理
If Fast Path Fails:
- Automatically retry with schema-aware path
- Schema context may resolve ambiguities
If Query is Ambiguous:
- Ask clarifying questions
- Suggest similar valid queries
If Results are Empty:
- Verify data exists
- Suggest alternative queries
- Check for typos in entity names
如果快速路径执行失败:
- 自动重试基于Schema的路径
- Schema上下文可能解决歧义
如果查询存在歧义:
- 提出澄清问题
- 建议类似的有效查询
如果结果为空:
- 验证数据是否存在
- 建议替代查询
- 检查实体名称是否有拼写错误
Performance Notes
性能说明
Fast Path:
- ⚡ ~2-3 seconds
- Minimal context (~1-2KB)
- Best for simple queries
Schema-Aware Path:
- 🐢 ~3-5 seconds
- Rich context (~10-12KB)
- Best for complex queries
The skill optimizes for speed by using the fast path when possible, but switches to schema-aware when needed for accuracy.
快速路径:
- ⚡ 约2-3秒
- 最小上下文(约1-2KB)
- 最适合简单查询
基于Schema的路径:
- 🐢 约3-5秒
- 丰富上下文(约10-12KB)
- 最适合复杂查询
该Skill优先优化速度,尽可能使用快速路径,必要时切换到基于Schema的路径以保证准确性。
Comparison with Individual Skills
与单一功能Skill的对比
| Feature | /cdm-query (unified) | /nl-sql-query | /schema-query |
|---|---|---|---|
| Auto-optimization | ✓ Yes | ✗ No | ✗ No |
| Fast simple queries | ✓ Yes | ✓ Yes | ✗ Slower |
| Complex joins | ✓ Yes | ~ Basic | ✓ Yes |
| Schema awareness | ✓ Auto | ✗ No | ✓ Yes |
| Schema exploration | ✓ Yes | ✗ No | ✓ Yes |
| User choice needed | ✗ No | ✓ Yes | ✓ Yes |
Recommendation: Use as your default. It combines the best of both approaches.
/cdm-query| 功能 | /cdm-query(统一接口) | /nl-sql-query | /schema-query |
|---|---|---|---|
| 自动优化 | ✓ 是 | ✗ 否 | ✗ 否 |
| 快速简单查询 | ✓ 是 | ✓ 是 | ✗ 较慢 |
| 复杂关联查询 | ✓ 是 | ~ 基础支持 | ✓ 是 |
| Schema感知 | ✓ 自动 | ✗ 否 | ✓ 是 |
| Schema探索 | ✓ 是 | ✗ 否 | ✓ 是 |
| 需要用户选择 | ✗ 否 | ✓ 是 | ✓ 是 |
建议:将作为默认工具,它结合了两种路径的优势。
/cdm-queryTips for Best Results
使用技巧
-
Be Specific
- Good: "Find samples with depth > 100"
- Better: "Find samples where depth_meter > 100"
-
Use Relationship Words for Joins
- Use "WITH" for joins: "samples WITH their locations"
- Use "AND" for multiple tables: "samples AND their reads"
-
Ask for Schema Help
- "What fields does Sample have?"
- "How do I query locations?"
- "Show me the data model"
-
Start Simple, Then Refine
- First: "How many samples?"
- Then: "Show me those samples"
- Finally: "Add their location information"
-
尽量具体
- 一般:"查找深度大于100的样本"
- 更好:"查找depth_meter大于100的样本"
-
使用关系词汇触发关联查询
- 使用"及其"表示关联:"样本及其位置信息"
- 使用"和"表示多表:"样本和其读取记录"
-
询问Schema相关帮助
- "Sample有哪些字段?"
- "如何查询位置信息?"
- "展示数据模型"
-
从简单到复杂逐步细化
- 第一步:"有多少个样本?"
- 第二步:"展示这些样本"
- 第三步:"添加它们的位置信息"
Advanced Features
高级功能
Query History
查询历史
Maintains context of previous queries in conversation for follow-up questions.
在对话中保留之前的查询上下文,支持跟进提问。
Smart Suggestions
智能建议
After each query, suggests related queries you might want to run.
每次查询后,推荐相关的查询内容。
Automatic Optimization
自动优化
Learns from failed queries and adjusts strategy automatically.
从失败的查询中学习,自动调整策略。
Multi-Format Output
多格式输出
- Text tables (default)
- JSON (--json flag)
- CSV (--csv flag)
- Markdown (--markdown flag)
- 文本表格(默认)
- JSON(--json参数)
- CSV(--csv参数)
- Markdown(--markdown参数)
Troubleshooting
故障排除
Issue: "Table not found"
- Run: to see available tables
just cdm-store-stats - Check table naming: sdt_sample not Sample
Issue: Query too slow
- The skill will automatically use fast path for simple queries
- For complex queries, consider adding filters to limit results
Issue: Unexpected results
- Use to see generated SQL
--verbose - Check if the right strategy was chosen
- Try rephrasing your question
问题:"表未找到"
- 运行:查看可用表
just cdm-store-stats - 检查表命名:是sdt_sample而非Sample
问题:查询速度过慢
- 该Skill会自动对简单查询使用快速路径
- 对于复杂查询,考虑添加过滤条件限制结果数量
问题:结果不符合预期
- 使用查看生成的SQL语句
--verbose - 检查是否选择了正确的策略
- 尝试重新表述问题
Related Commands
相关命令
bash
undefinedbash
undefinedDatabase management
数据库管理
just cdm-store-stats # Show what's in database
just load-cdm-store-sample # Load sample database
just cdm-store-stats # 展示数据库内容
just load-cdm-store-sample # 加载样本数据库
Direct query interfaces (if you want manual control)
直接查询接口(手动控制)
just cdm-nl-query "question" # Force fast path
just cdm-schema-query "question" # Force schema-aware path
just cdm-nl-query "问题" # 强制使用快速路径
just cdm-schema-query "问题" # 强制使用基于Schema的路径
Data exploration
数据探索
just cdm-find-samples Location0000001
just cdm-search-oterm "soil"
just cdm-lineage Assembly Assembly0000001
undefinedjust cdm-find-samples Location0000001
just cdm-search-oterm "soil"
just cdm-lineage Assembly Assembly0000001
undefinedTechnical Details
技术细节
Backend:
- Python DuckDB 1.4.1
- LinkML Runtime 1.9.4+
- Anthropic Claude API (optional)
Database:
- DuckDB format
- CDM naming convention (sdt_, sys_, ddt_*)
- Full LinkML schema awareness
Strategies:
- Fast SQL: Direct translation, minimal context
- Schema-aware: LinkML schema context, intelligent JOINs
- Hybrid: Uses both as needed
Scripts:
- (fast path)
scripts/cdm_analysis/nl_sql_query.py - (schema-aware path)
scripts/cdm_analysis/schema_aware_query.py - (unified interface)
scripts/cdm_analysis/cdm_unified_query.py
后端:
- Python DuckDB 1.4.1
- LinkML Runtime 1.9.4+
- Anthropic Claude API(可选)
数据库:
- DuckDB格式
- CDM命名规范(sdt_, sys_, ddt_*)
- 完整LinkML Schema感知
策略:
- 快速SQL:直接翻译,最小上下文
- 基于Schema:LinkML Schema上下文,智能关联查询
- 混合:根据需要结合两种方式
脚本:
- (快速路径)
scripts/cdm_analysis/nl_sql_query.py - (基于Schema的路径)
scripts/cdm_analysis/schema_aware_query.py - (统一接口)
scripts/cdm_analysis/cdm_unified_query.py
Future Enhancements
未来规划
Planned features:
- Query result caching
- Automatic query optimization suggestions
- Visual query builder integration
- Export to multiple formats
- Saved query templates
- Query performance analysis
计划新增功能:
- 查询结果缓存
- 自动查询优化建议
- 可视化查询构建器集成
- 多格式导出
- 保存查询模板
- 查询性能分析