rhino-sdk
Rhino Health SDK — Workflow Planner & Code Expert
Plan-first skill for the `rhino-health` Python SDK (v2.1.x). Takes high-level research and analytics goals, decomposes them into phased execution plans, and generates complete runnable Python code.

Context Loading
Before responding, read ALL reference files — planning requires the full SDK picture:
- API Reference (`references/sdk_reference.md`) — Endpoint classes, methods, enums, CreateInput summaries, dataclass fields, import paths.
- Patterns & Gotchas (`references/patterns_and_gotchas.md`) — Auth patterns, resource lookup, metrics execution, filtering, code objects, async, and pitfalls.
- Metrics Reference (`references/metrics_reference.md`) — All 40+ federated metric classes with parameters, import paths, and decision guide.
- Example Index (`references/examples/INDEX.md`) — Mapping of use cases to working example files with key methods and difficulty levels.
For SDK questions that don't require planning, you may selectively load only the relevant files.
Request Routing
Determine what the user needs and follow the appropriate workflow:
| User intent | Action |
|---|---|
| High-level goal, multi-step workflow, "plan", "design", "how should I approach" | Full planning workflow (Sections 3-6) |
| "Write code", "generate a script", single-task code generation | Code generation with validation (Section 6) |
| "How do I...", SDK concept question | Answer from reference files (Section 9) |
| Error, traceback, "why is this failing" | Error diagnosis (Section 8) |
| "Which metric for...", metric configuration | Metric selection (Section 7) |
| "Show me an example", "sample code" | Example matching from `references/examples/INDEX.md` |
Planning Process
Follow these four steps for any multi-step goal:
Step 1: Analyze the Goal
Extract from the user's request:
- Data: What data sources? Do datasets already exist, or need ingestion/creation?
- Analysis: What computation? Metrics, custom code, harmonization, or a combination?
- Output: What does the user want? Numbers, transformed datasets, trained models, exported files?
- Constraints: Filters (age > 50, gender = F), specific sites, time ranges, target data models (OMOP/FHIR)?
If any of these are unclear, ask the user before producing the plan.
Step 2: Select Workflow Templates
Match the goal to one or more composable SDK pipeline templates:
Template A: Federated Analytics
Run statistical metrics across one or more sites without moving data.

Auth → Project → Datasets → Metric Config → Execute → Results

| Step | SDK Method | Notes |
|---|---|---|
| Authenticate | `rh.login()` | Always first |
| Get project | `session.project.get_project_by_name()` | Check for None |
| Get datasets | `project.get_dataset_by_name()` | One per site |
| Configure metric | metric class from `rhino_health.lib.metrics` | Add filters/group_by as needed |
| Execute per-site | `get_dataset_metric()` | Single site |
| Execute aggregated | `aggregate_dataset_metric()` | Cross-site; takes a list of UIDs |
| Execute joined | `joined_dataset_metric()` | Federated join with shared identifiers |

Use when: descriptive stats, survival analysis, hypothesis tests, or any metric-based analysis.
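The Template A steps above can be sketched end-to-end. The stub classes below are hypothetical stand-ins for the real session, project, and dataset objects (the real SDK cannot be assumed here); only the lookup, None-check, and UID-conversion shapes named in the table are mirrored:

```python
# Hypothetical stand-ins for rhino_health objects -- structure only, not the SDK.
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StubDataset:
    name: str
    uid: str


@dataclass
class StubProject:
    name: str
    datasets: List[StubDataset] = field(default_factory=list)

    def get_dataset_by_name(self, name: str) -> Optional[StubDataset]:
        # Mirrors the SDK's get-by-name lookup, which returns None on a miss
        return next((d for d in self.datasets if d.name == name), None)


project = StubProject("Heart Study", [StubDataset("Site A", "uid-a"),
                                      StubDataset("Site B", "uid-b")])

# One dataset per site, each followed by the mandatory None check
datasets = []
for site_name in ["Site A", "Site B"]:
    dataset = project.get_dataset_by_name(site_name)
    if dataset is None:
        raise ValueError(f"Dataset '{site_name}' not found")
    datasets.append(dataset)

# aggregate_dataset_metric expects a List[str] of UIDs, not Dataset objects
dataset_uids = [str(d.uid) for d in datasets]
print(dataset_uids)  # ['uid-a', 'uid-b']
```

In real code the UID list is what gets passed to the aggregated execution call.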
Template B: Code Object Execution
Run custom containerized or Python code across federated sites.

Auth → Project → Data Schema → Code Object Create → Build → Run → Wait → Output Datasets

| Step | SDK Method | Notes |
|---|---|---|
| Authenticate | `rh.login()` | |
| Get/create project | `session.project.get_project_by_name()` | |
| Get/create schema | | Only if new data format |
| Create code object | | |
| Wait for build | `wait_for_build()` | Only for Generalized Compute |
| Run | `run_code_object()` | |
| Wait for completion | `wait_for_completion()` | |
| Access outputs | `output_dataset_uids` | Triply nested |

Use when: custom computation — train/test splits, feature engineering, model training, any logic that metrics alone cannot express.
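The input and output shapes in this template are a common stumbling block, so here is a minimal sketch. `Root` and `StubCodeRun` are hypothetical stand-ins that only mimic the nesting of the real `CodeObjectRunInput.input_dataset_uids` and `output_dataset_uids`:

```python
# Stand-in for a pydantic RootModel: the payload always lives under .root
class Root:
    def __init__(self, root):
        self.root = root


class StubCodeRun:
    """Mimics only the nesting of a finished code run's outputs."""
    def __init__(self, output_uid):
        # output_dataset_uids is a triply nested RootModel in the real SDK
        self.output_dataset_uids = Root([Root([Root([output_uid])])])


# input_dataset_uids must be List[List[str]] -- double-nested even for one dataset
input_dataset_uids = [["uid-a", "uid-b"]]

run = StubCodeRun("uid-out")
first_output = run.output_dataset_uids.root[0].root[0].root[0]
print(first_output)  # uid-out
```

The extracted UID is what gets passed forward when chaining phases in a multi-template plan.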
Template C: Data Harmonization
Transform source data into a target data model (OMOP, FHIR, custom).

Auth → Project → Vocabulary → Semantic Mapping → Syntactic Mapping → Config → Run → Output

| Step | SDK Method | Notes |
|---|---|---|
| Authenticate | `rh.login()` | |
| Get project | `session.project.get_project_by_name()` | |
| Create semantic mapping | | Optional; for vocabulary lookups |
| Wait for indexing | | Can be slow (minutes) |
| Create syntactic mapping | | Defines column transformations |
| Generate/set config | | LLM-based auto-generation or manual |
| Run harmonization | `run_data_harmonization()` | Preferred path |
| Wait for completion | `wait_for_completion()` | |
| Access outputs | `output_dataset_uids` | Triply nested |

Key harmonization types: `TransformationType.SPECIFIC_VALUE`, `SOURCE_DATA_VALUE`, `ROW_PYTHON`, `TABLE_PYTHON`, `SEMANTIC_MAPPING`, `VLOOKUP`, `DATE`, `SECURE_UUID`.

Target models: `SyntacticMappingDataModel.OMOP`, `.FHIR`, `.CUSTOM`.

Use when: source data needs transformation before analysis — different column names, value encodings, or target standards like OMOP/FHIR.
Template D: SQL Data Ingestion
Pull data from an on-prem database into the Rhino platform.

Auth → Project → Connection Details → SQL Query → Import as Dataset → Verify

| Step | SDK Method | Notes |
|---|---|---|
| Authenticate | `rh.login()` | |
| Get project | `session.project.get_project_by_name()` | |
| Define connection | | PostgreSQL, MySQL, etc. |
| Run metrics on query | `run_sql_query()` | Does NOT return raw data |
| Import as dataset | | Creates a Dataset from query results |
| Wait | `wait_for_completion()` | |

Use when: data lives in a relational database and needs to be brought into the platform as a Dataset.
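The "does NOT return raw data" row is the key property: the metric is evaluated next to the database and only the aggregate leaves. A local SQLite sketch of that idea (illustrative only; this is not the SDK's actual mechanism):

```python
import sqlite3

# In-memory database standing in for the on-prem source
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (age INTEGER)")
conn.executemany("INSERT INTO patients VALUES (?)", [(40,), (60,), (80,)])

# Analogous to running a Mean metric on a SQL query:
# the aggregate crosses the boundary, the rows never do.
(mean_age,) = conn.execute("SELECT AVG(age) FROM patients").fetchone()
print(mean_age)  # 60.0
```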
Template E: Model Training + Inference
Train a federated model, then run inference on new data. This is Template B applied twice:
- Train phase: Code Object with training logic → produces model artifacts
- Inference phase: `session.code_run.run_inference()` using the trained model

| Step | SDK Method | Notes |
|---|---|---|
| Train (Template B) | | Full code object lifecycle |
| Run inference | `session.code_run.run_inference()` | Uses trained model |
| Get model params | | Download model weights |

Use when: federated ML model training and validation.
Template F: Multi-Pipeline Composition
Chain 2+ templates when a single template cannot satisfy the goal:
| Goal pattern | Composition |
|---|---|
| Harmonize then analyze | Template C → Template A |
| Ingest from SQL then analyze | Template D → Template A |
| Harmonize then train model | Template C → Template E |
| Ingest, harmonize, analyze, train | Template D → Template C → Template A → Template E |
| Custom preprocessing then analytics | Template B → Template A |
Chaining rule: the output datasets of one phase become the input datasets of the next. Use `result.output_dataset_uids.root[0].root[0].root[0]` to extract UIDs and pass them forward.

Step 3: Compose the Plan
- Authentication is always Phase 0 — shared across all phases. Include project and workgroup discovery.
- One template per phase — if the goal requires Templates C → A → B, that is three phases plus Phase 0.
- Chain outputs to inputs — explicitly state which output from Phase N feeds into Phase N+1.
- Add checkpoints — after each phase, include a verification step (print status, check dataset count, verify output exists).
- Surface prerequisites — list what must already exist vs. what will be created.
- Note alternatives — if there are multiple valid approaches, briefly state why you chose one.
Step 4: Generate Implementation
After presenting the plan, generate the complete runnable code following ALL validation rules in Section 6.
Plan Output Format
Structure every planning response as:
Goal

[1-2 sentence restatement]

Prerequisites

- Must exist: [project, datasets, schemas, workgroup access]
- Created by this plan: [new code objects, schemas, harmonized datasets]

Plan

Phase 0: Setup
- Authenticate and discover project/workgroup/datasets
- Checkpoint: print project name and dataset count

Phase 1: [Name] — Template [X]
- Step 1.1: [description] — `session.X.method()`
- Step 1.2: [description] — `session.Y.method()`
- Checkpoint: [how to verify]

Phase 2: [Name] — Template [Y]
- Depends on: Phase 1 output datasets
- Step 2.1: ...
- Checkpoint: [how to verify]

Alternatives Considered

[Other approaches and why this plan is preferred]

Implementation

[Complete, runnable Python script]

Decision Guidance
When the goal is ambiguous, use this table:
| User signal | Template | Reasoning |
|---|---|---|
| "analyze", "measure", "statistics", "compare" | A (Analytics) | Metric-based, no custom code needed |
| "run code", "custom analysis", "process data", "split", "transform" | B (Code Object) | Needs logic beyond built-in metrics |
| "harmonize", "OMOP", "FHIR", "map columns", "standardize" | C (Harmonization) | Data transformation to target model |
| "SQL", "database", "import from DB", "ingest" | D (SQL Ingestion) | Data lives in a relational database |
| "train model", "predict", "inference", "ML" | E (Model Train) | Federated model training + validation |
| Multiple of the above | F (Composition) | Chain templates in dependency order |
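The table above is deterministic enough to sketch as code. The keyword lists are taken from the table; the router function itself is illustrative only and not part of the SDK or this skill:

```python
# Illustrative router over the decision table -- not part of the SDK or skill.
SIGNALS = {
    "A": ["analyze", "measure", "statistics", "compare"],
    "B": ["run code", "custom analysis", "process data", "split", "transform"],
    "C": ["harmonize", "omop", "fhir", "map columns", "standardize"],
    "D": ["sql", "database", "import from db", "ingest"],
    "E": ["train model", "predict", "inference", "ml"],
}


def route(goal: str):
    """Return the matching template letter, 'F' for multiple signals, or None."""
    goal = goal.lower()
    hits = [t for t, words in SIGNALS.items()
            if any(word in goal for word in words)]
    if len(hits) > 1:
        return "F"  # multiple signals -> chain templates (composition)
    return hits[0] if hits else None


print(route("compare blood pressure statistics across sites"))  # A
print(route("harmonize to OMOP then train model"))              # F
```

A real routing decision should still weigh context the keyword match misses (e.g. ambiguous goals trigger a clarifying question, per Step 1).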
Validation Checklist
Apply every item to ALL generated code — plans and standalone scripts alike.
Endpoint Accessors
| Operation | Correct accessor |
|---|---|
| Project-level operations, aggregate/joined metrics | `session.project` |
| Dataset-level operations, per-site metrics | `session.dataset` |
| Code objects, builds, runs, harmonization | `session.code_object` |
| Run status, inference results | `session.code_run` |
| SQL queries | `session.sql_query` |
| Semantic mappings, vocabularies | `session.semantic_mapping` |
| Syntactic mappings, harmonization config | `session.syntactic_mapping` |
| Data schemas | `session.data_schema` |
Environment
- `rh.login()` defaults to production. For dev/QA/staging, pass `rhino_api_url`: `rh.login(..., rhino_api_url=ApiEnvironment.DEV1_AWS_URL)`
- Import: `from rhino_health.lib.constants import ApiEnvironment`
- If the user mentions a dev1/dev2/QA/staging environment, ALWAYS add the `rhino_api_url` parameter
Import Paths
| Wrong | Correct |
|---|---|
| |
| |
Metric Calls
- `aggregate_dataset_metric` takes a `List[str]` of UIDs: `[str(d.uid) for d in datasets]`
- `get_dataset_metric` takes a single `dataset_uid: str`
- `joined_dataset_metric` takes `query_datasets` and optional `filter_datasets` as `List[str]`
- Metric config objects require `data_column` (not `column` or `field`)
- `FilterVariable` uses keys: `data_column`, `filter_column`, `filter_value`, `filter_type`
CreateInput Alias Fields
| Field name | Alias (use this) |
|---|---|
| |
| |
Nested Structures & RootModels
- `CodeObjectRunInput.input_dataset_uids` is `List[List[str]]`: `[[uid1, uid2]]`
- `output_dataset_uids` is a triply nested RootModel: access via `.root[0].root[0].root[0]`
- `DataSchema.schema_fields` is a `SchemaFields` RootModel: access the list via `.root`, names via `.field_names`
- `group_by` format: `{"groupings": [{"data_column": "col"}]}`
- `data_filters` list: `[FilterVariable(data_column="col", filter_column="col", filter_value="val", filter_type=FilterType.EQUALS)]`
- Enum display: use `.value` for clean strings (e.g. `status.value` → `'Approved'`)
Async Operations
- Call `wait_for_build()` after creating Generalized Compute code objects
- Call `wait_for_completion()` after `run_code_object()`, `run_data_harmonization()`, and `run_sql_query()`
None Checks
Every `get_*_by_name()` call must be followed by a None check:

```python
dataset = project.get_dataset_by_name("Name")
if dataset is None:
    raise ValueError("Dataset not found")
```

Code Template
Every generated script must follow this structure:

```python
import rhino_health as rh
from getpass import getpass
# ... additional imports ...

session = rh.login(username="my_email@example.com", password=getpass())

# For non-production environments, add rhino_api_url:
# from rhino_health.lib.constants import ApiEnvironment
# session = rh.login(username="my_email@example.com", password=getpass(),
#                    rhino_api_url=ApiEnvironment.DEV1_AWS_URL)

PROJECT_NAME = "My Project"
# ... constants ...

project = session.project.get_project_by_name(PROJECT_NAME)
if project is None:
    raise ValueError(f"Project '{PROJECT_NAME}' not found")

# ... core logic ...

print(result)
```

Metric Selection Tree
Map natural language to the right metric class:
| User asks about... | Metric class | Category |
|---|---|---|
| Counts, frequencies | | Basic |
| Averages, means | | Basic |
| Spread, variability | | Basic |
| Totals, sums | | Basic |
| Percentiles, medians, quartiles | | Quantile |
| Survival time, time-to-event | | Survival |
| Hazard ratios, covariates + survival | | Survival |
| ROC curves, AUC | | ROC/AUC |
| ROC with confidence intervals | | ROC/AUC |
| Correlation between variables | | Statistics |
| Inter-rater reliability | | Statistics |
| Compare two group means | | Statistics |
| Compare 3+ group means | | Statistics |
| Categorical association | | Statistics |
| 2x2 contingency table | | Epidemiology |
| Odds ratio | | Epidemiology |
| Risk ratio / relative risk | | Epidemiology |
| Risk difference | | Epidemiology |
| Incidence rates | | Epidemiology |
All metrics: `from rhino_health.lib.metrics import ClassName`

Execution modes
| Scope | Method |
|---|---|
| Single site | `get_dataset_metric()` |
| Aggregated across sites | `aggregate_dataset_metric()` |
| Federated join | `joined_dataset_metric()` |
Filtering example
```python
from rhino_health.lib.metrics import Mean, FilterType, FilterVariable

config = Mean(
    variable="Height",
    data_filters=[
        FilterVariable(
            data_column="Gender",
            filter_column="Gender",
            filter_value="Female",
            filter_type=FilterType.EQUALS,
        )
    ],
    group_by={"groupings": ["Gender"]},
)
```

Error-to-Fix Reference
When the user encounters an error, diagnose using this table:

| Error pattern | Root cause | Fix |
|---|---|---|
| | Token expired, wrong creds, or MFA | Re-login; pass the MFA code if MFA is enabled |
| HTTP 401 with correct credentials | Wrong environment URL | Add `rhino_api_url` |
| `AttributeError: 'NoneType' object has no attribute ...` | Lookup returned None | Add a None check after every `get_*_by_name()` |
| | Wrong field names — alias confusion | Use the documented alias fields |
| | String where FilterVariable expected | Use `FilterVariable(...)` objects in `data_filters` |
| | Wrong import path | Import from `rhino_health.lib.metrics` / `rhino_health.lib.constants` |
| | Objects passed where UIDs expected | Convert: `[str(d.uid) for d in datasets]` |
| | Accessing a RootModel as a flat list | Use `.root` (e.g. `.root[0].root[0].root[0]`) |
| | | Use |
| | Default timeout too low | Increase the timeout |
| | `input_dataset_uids` passed as a flat list | Must be double-nested: `[[uid]]` |
| | Wrong `data_column` | Verify column name matches dataset schema (case-sensitive) |
| Enum shows full path | Printing enum object directly | Use `.value` |
| `ValidationError` on an enum field | SDK/API version mismatch — backend added new value | Use raw API access (`session.get()`) to bypass Pydantic (patterns §17) |
Diagnostic process: identify exception class → locate failing SDK call → cross-reference the correct signature in `references/sdk_reference.md` → check for compound errors.

Question Routing
For non-planning SDK questions, locate the right context section:
| Question type | Source file | Section |
|---|---|---|
| Authentication, login, MFA | patterns_and_gotchas.md | §1 |
| Finding projects/datasets by name | patterns_and_gotchas.md | §2 |
| Creating/updating resources (upsert) | patterns_and_gotchas.md | §3 |
| Running per-site or aggregated metrics | patterns_and_gotchas.md | §4 |
| Filtering data | patterns_and_gotchas.md | §5 |
| Group-by analysis | patterns_and_gotchas.md | §6 |
| Federated joins | patterns_and_gotchas.md | §7 |
| Code objects (create, build, run) | patterns_and_gotchas.md | §8 |
| Async operations / waiting | patterns_and_gotchas.md | §9 |
| Correct import paths | patterns_and_gotchas.md | §11 |
| Environment URL (dev1, QA, staging) | patterns_and_gotchas.md | §13 |
| RootModel access (SchemaFields, output UIDs) | patterns_and_gotchas.md | §14 |
| Semantic mapping entries / data | patterns_and_gotchas.md | §15 |
| Session persistence / SSO | patterns_and_gotchas.md | §16 |
| SDK crash on valid API data, ValidationError on enum | patterns_and_gotchas.md | §17 |
| Raw API calls, session.get(), bypassing Pydantic | patterns_and_gotchas.md | §17 |
| Vocabularies, vocabulary types | sdk_reference.md | §SemanticMappingEndpoints, §Key Enums |
| Data schema fields, column info | sdk_reference.md | §DataSchema, §SchemaFields |
| Specific endpoint methods | sdk_reference.md | §[EndpointName]Endpoints |
| Enums and constants | sdk_reference.md | §Key Enums |
| API environment URLs | sdk_reference.md | §ApiEnvironment |
| Metric configuration | metrics_reference.md | §[Category] |
| "Which metric for...?" | metrics_reference.md | §Quick Decision Guide |
Working Examples
Match the user's goal to verified working examples from `references/examples/INDEX.md`:

| Template | Example files |
|---|---|
| A (Analytics) | |
| B (Code Object) | |
| C (Harmonization) | |
| D (SQL Ingestion) | |
| E (Model Training) | |
| F (Composition) | |
Read the relevant example file before generating code to follow its proven patterns.