method-transfer-engine
Method Transfer Engine
Rigorous framework for adapting statistical methods across domains and settings
Use this skill when: adapting a method from one field to another, extending a method to a new setting, formalizing an intuitive connection between methods, or verifying that a transferred method retains its properties.
The Transfer Framework
What is Method Transfer?
Taking a technique that works in Setting A and adapting it to work in Setting B, while:
- Preserving desirable theoretical properties
- Identifying what changes are needed
- Understanding what can and cannot transfer
Transfer Quality Spectrum
Direct Application → Minor Adaptation → Major Modification → Inspired-By
        │                   │                   │                 │
   Same theory          Adjust for        Rewrite theory      New method,
   applies              new setting       for new setting     similar spirit
Transfer Success Criteria
A successful transfer must:
- Solve the target problem: the method actually helps in the new setting
- Preserve key properties: consistency, efficiency, and robustness carry over
- Have clear assumptions: what the new setting requires is stated explicitly
- Be verifiable: a proof or simulation can show that it works
- Add value: it improves on existing approaches
The 6-Phase Protocol
This protocol provides a systematic approach to method transfer, covering all critical steps from source extraction through validation.
Source Extraction
Goal: Extract the core mathematical and algorithmic essence of the source method
    # Template for source method extraction
    extract_source_method <- function(method_name, reference) {
      list(
        name = method_name,
        reference = reference,  # key citation for the source method
        estimand = "formal expression of what is estimated",
        estimator = "formula for the estimator",
        assumptions = c("A1: condition", "A2: condition"),
        properties = c("consistency", "asymptotic normality"),
        algorithm = c("Step 1: ...", "Step 2: ..."),
        complexity = "O(n^2) or similar"
      )
    }

    # Example: Extract Lasso from signal processing
    lasso_extraction <- list(
      name = "Lasso/Basis Pursuit",
      field = "Signal Processing / Compressed Sensing",
      estimand = "argmin ||y - Xb||_2^2 + lambda * ||b||_1",
      key_insight = "L1 penalty induces sparsity via soft thresholding",
      assumptions = c("RIP condition", "Incoherence"),
      properties = c("Sparse solution", "Variable selection consistency")
    )
Abstraction
Goal: Identify the abstract mathematical structure that enables the method
    # Abstract structure identification
    identify_abstraction <- function(source_method) {
      # Placeholder entries: replace each with the analysis of source_method
      list(
        mathematical_structure = "e.g., M-estimation, U-statistics, kernels",
        core_operation = "e.g., reweighting, regularization, projection",
        information_used = "e.g., first moments, covariance, distributional",
        key_invariance = "what property makes it work",
        generalization_path = "how to extend beyond original setting"
      )
    }

    # Example: Abstraction of propensity score methods
    propensity_abstraction <- list(
      mathematical_structure = "Reweighting to balance distributions",
      core_operation = "Inverse probability weighting",
      key_invariance = "Balances covariate distribution across groups",
      generalization_path = "Any selection mechanism with known probabilities"
    )
---
Phase 1: Source Method Analysis
Goal: Deeply understand what you're transferring
Source Method Profile
Basic Information
- Name: [Method name]
- Source field: [Domain/area]
- Key reference: [Citation]
- What it does: [One sentence]
Problem Solved
- Input: [What data/information goes in]
- Output: [What estimate/inference comes out]
- Setting: [When it applies]
Mathematical Structure
- Estimand: [What it estimates, formally]
- Estimator: [How it estimates, formula]
- Loss/objective: [What it optimizes]
Assumptions Required
- [Assumption 1]: [Mathematical statement]
  - Why needed: [Role in proof/method]
  - When violated: [Failure mode]
Theoretical Properties
- Consistency: [When/how proved]
- Rate: [Convergence rate]
- Asymptotic distribution: [If known]
- Efficiency: [Relative to what]
- Robustness: [To what violations]
Computational Aspects
- Algorithm: [How implemented]
- Complexity: [Time/space]
- Software: [Available implementations]
Phase 2: Target Problem Analysis
Goal: Understand where you want to apply it
Target Problem Profile
Basic Information
- Problem name: [Description]
- Target field: [Domain/area]
- Motivation: [Why solve this]
Problem Structure
- Data available: [What's observed]
- Estimand: [What you want to estimate]
- Challenges: [Why existing methods are inadequate]
Current Approaches
- Method 1: [Name, limitations]
- Method 2: [Name, limitations]
- Gap: [What's missing]
Constraints
- Assumptions willing to make: [List]
- Assumptions NOT willing to make: [List]
- Computational constraints: [If any]
Target Mapping
Goal: Map source concepts to their target domain counterparts
    # Target mapping framework
    create_target_mapping <- function(source, target) {
      # Illustrative entries (causal inference -> mediation analysis);
      # replace them with the correspondences for `source` and `target`
      mapping <- list(
        objects = data.frame(
          source = c("treatment", "outcome", "confounder"),
          target = c("mediator", "effect", "moderator"),
          relationship = c("direct", "indirect", "modifies")
        ),
        assumptions = data.frame(
          source_assumption = c("SUTVA", "Ignorability"),
          target_version = c("Consistency", "Sequential ignorability"),
          status = c("transfers", "needs modification")
        )
      )
      mapping
    }

    # Example: IV to Mendelian randomization mapping
    # (names are source concepts, values are their target counterparts)
    iv_to_mr <- list(
      price_instrument = "genetic_variant",
      demand = "biomarker_exposure",
      endogeneity = "unmeasured_confounding",
      exclusion_restriction = "no_horizontal_pleiotropy",
      key_difference = "biological vs economic mechanisms"
    )
Phase 3: Structure Mapping
Goal: Identify correspondences between source and target
Structure Map
Object Correspondence
| Source | Target | Notes |
|---|---|---|
| [Source object 1] | [Target object 1] | [How they relate] |
| [Source object 2] | [Target object 2] | [How they relate] |
| ... | ... | ... |
Assumption Correspondence
| Source Assumption | Target Version | Status |
|---|---|---|
| [Source A1] | [Target A1'] | ✓ Transfers / ✗ Fails / ? Modify |
| [Source A2] | [Target A2'] | ... |
| ... | ... | ... |
What Transfers Directly
- [Property 1]: Because [reason]
- [Property 2]: Because [reason]
What Needs Modification
- [Element 1]: From [source version] to [target version]
  - Why: [Reason for change]
  - How: [Specific modification]
What Doesn't Transfer
- [Element 1]: Because [reason]
- Impact: [What we lose]
- Alternative: [How to address]
Gap Analysis
Goal: Identify what doesn't transfer and what modifications are needed
    # Gap analysis framework
    analyze_transfer_gaps <- function(source, target, mapping) {
      gaps <- list(
        assumption_gaps = list(
          violated = c("iid assumption in clustered data"),
          modified = c("independence -> conditional independence"),
          new_required = c("mediator positivity")
        ),
        property_gaps = list(
          lost = c("efficiency under misspecification"),
          weakened = c("convergence rate n^{-1/2} -> n^{-1/4}"),
          preserved = c("consistency", "asymptotic normality")
        ),
        computational_gaps = list(
          new_challenges = c("non-convex optimization"),
          workarounds = c("ADMM algorithm", "approximate methods")
        ),
        bridging_strategies = c(
          "Add regularization for new setting",
          "Derive modified variance estimator",
          "Implement robustness check"
        )
      )
      gaps
    }
Phase 4: Adaptation Design
Goal: Design the transferred method
Adapted Method Design
Overview
[One paragraph describing the adapted method]
Formal Definition
Estimand:
$$\psi = [target estimand formula]$$
Estimator:
$$\hat{\psi}_n = [adapted estimator formula]$$
Algorithm:
- [Step 1]
- [Step 2]
- ...
Modified Assumptions
- [Assumption A1']: [New statement for target setting]
- Analogous to: [Source assumption]
- Modified because: [Reason]
Expected Properties
- Consistency: [Conjecture/claim]
- Rate: [Expected]
- Efficiency: [Expected]
Key Differences from Source
Validation
Goal: Systematically verify the transferred method works correctly
    # Comprehensive validation framework for method transfer.
    # The helpers called below (run_bias_simulation(), etc.) are placeholders to be
    # supplied by the user; each should return a list with a logical `passed` element.
    validate_transfer <- function(adapted_method, n_sims = 1000) {
      results <- list()
      # 1. Bias check: Is estimator unbiased at truth?
      results$bias <- run_bias_simulation(adapted_method, n_sims)
      # 2. Coverage check: Do CIs achieve nominal coverage?
      results$coverage <- run_coverage_simulation(adapted_method, n_sims)
      # 3. Efficiency check: Compare to alternatives
      results$efficiency <- compare_to_alternatives(adapted_method)
      # 4. Robustness check: Behavior under violations
      results$robustness <- test_assumption_violations(adapted_method)
      # 5. Edge cases: Extreme scenarios
      results$edge_cases <- test_edge_cases(adapted_method)
      # Validation report
      list(
        passed = all(vapply(results, function(x) isTRUE(x$passed), logical(1))),
        details = results,
        recommendations = generate_recommendations(results)
      )
    }

    # Simulation template for validation.
    # generate_dgp(n) simulates one dataset, adapted_method(data) returns a list with
    # `point` and `se`, and true_value is the estimand's value under the DGP.
    run_transfer_validation <- function(adapted_method, generate_dgp, true_value,
                                        n = 500, n_sims = 1000) {
      estimates <- replicate(n_sims, {
        # Generate data under true model
        data <- generate_dgp(n)
        # Apply transferred method
        est <- adapted_method(data)
        c(estimate = est$point, se = est$se)
      })
      list(
        bias = mean(estimates["estimate", ]) - true_value,
        rmse = sqrt(mean((estimates["estimate", ] - true_value)^2)),
        coverage = mean(abs(estimates["estimate", ] - true_value) <
                          qnorm(0.975) * estimates["se", ])
      )
    }
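The simulation template leaves the data-generating process and the estimator unspecified. A minimal sketch of how the pieces could be filled in, assuming a toy problem (estimating a normal mean with the sample mean); the DGP, the estimator, and the values of `n` and `n_sims` are illustrative assumptions, not part of the framework:

```r
# Toy instance of the bias / coverage simulation (all choices are illustrative)
set.seed(8)
true_value <- 1

# Hypothetical DGP: normal data with known mean
generate_dgp <- function(n) rnorm(n, mean = true_value, sd = 2)

# Hypothetical "adapted method": sample mean with its usual standard error
adapted_method <- function(data) {
  list(point = mean(data), se = sd(data) / sqrt(length(data)))
}

# Each replicate returns a named vector, so `estimates` is a 2 x n_sims matrix
estimates <- replicate(2000, {
  est <- adapted_method(generate_dgp(200))
  c(estimate = est$point, se = est$se)
})

bias <- mean(estimates["estimate", ]) - true_value
rmse <- sqrt(mean((estimates["estimate", ] - true_value)^2))
coverage <- mean(abs(estimates["estimate", ] - true_value) <
                   qnorm(0.975) * estimates["se", ])
```

In this well-specified case the bias should be close to zero and the coverage close to the nominal 95%; a transferred method that misses either target under its own assumptions needs another look before moving on to stress tests.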
Phase 5: Verification
Goal: Prove/demonstrate the transfer works
Verification Plan
Theoretical Verification
- Consistency proof
  - Approach: [Proof strategy]
  - Key lemma: [What needs to be shown]
- Asymptotic normality
  - Approach: [Proof strategy]
  - Influence function: [If applicable]
- Efficiency (if claiming)
  - Approach: [Efficiency bound derivation]
Simulation Verification
- Scenario 1: [Description]
  - DGP: [Data generating process]
  - Expected result: [What should happen]
- Scenario 2: Comparison to oracle
  - Purpose: [Verify optimality]
- Scenario 3: Stress test
  - Purpose: [Find failure modes]
Empirical Verification
- Benchmark dataset: [If available]
- Real application: [Domain]
Phase 6: Documentation
Goal: Document for publication
Transfer Documentation
Contribution Statement
"We adapt [source method] from [source field] to [target setting] by
[key modification]. Our adapted method [key property]. Unlike [alternative],
our approach [advantage]."
Theoretical Contribution
- New result 1: [Theorem statement]
- New result 2: [If applicable]
Methodological Contribution
- Adaptation insight: [What's novel about the transfer]
- Practical guidance: [When to use]
What We Learned
- About source method: [New understanding]
- About target problem: [New understanding]
- General principle: [Broader insight]
---
Common Transfer Patterns
Pattern 1: Estimator Family Transfer
Template: Estimator type from one setting to another
Example: IPW from survey sampling → causal inference
Source: Horvitz-Thompson estimator
E[Y] ≈ (1/N) Σ_{i ∈ S} Yᵢ/πᵢ where S is the sample, N the population size, and πᵢ = P(unit i selected)
Target: IPW for ATE (treated arm shown)
E[Y(1)] ≈ (1/n) Σᵢ Yᵢ·Aᵢ/e(Xᵢ) where e(x) = P(A=1|X=x)
Mapping:
- Selection indicator → Treatment indicator
- Selection probability → Propensity score
- Survey weights → Inverse propensity weights
Key insight: Both correct for selection bias via reweighting
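A small simulation can make the reweighting correspondence concrete. This is a sketch under stated assumptions: the data-generating process is invented for illustration, and the propensity score is treated as known (as selection probabilities are in a designed survey) rather than estimated.

```r
# Illustrative simulation; the DGP and all variable names are assumptions
set.seed(1)
n  <- 20000
x  <- rnorm(n)
e  <- plogis(0.5 * x)        # propensity score e(x) = P(A = 1 | X = x), taken as known
a  <- rbinom(n, 1, e)        # treatment indicator (role of the selection indicator)
y1 <- 2 + x + rnorm(n)       # potential outcome Y(1); E[Y(1)] = 2 by construction

# Naive mean among the treated: biased because treatment depends on x
naive <- mean(y1[a == 1])

# Horvitz-Thompson-style IPW estimate of E[Y(1)];
# multiplying by `a` means only observed (treated) outcomes contribute
ipw <- mean(a * y1 / e)
```

Here `naive` overshoots the target because units with larger `x` are more likely to be treated, while `ipw` recovers a value near 2 by weighting each treated unit by the inverse of its selection probability.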
Pattern 2: Robustness Property Transfer
Template: Robustness technique from one method to another
Example: Double robustness from missing data → causal inference
Source: Augmented IPW for missing data
DR = IPW + Imputation - (IPW × Imputation)
Target: AIPW for causal effects
Same structure but for counterfactual outcomes
Mapping:
- Missing indicator → Treatment indicator
- Missingness model → Propensity model
- Imputation model → Outcome model
Key insight: Product-form bias enables robustness to one misspecification
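The double robustness claim can be checked numerically. The sketch below assumes a simulated dataset and uses the standard AIPW form for a single arm, mean(A·Y/e − (A − e)/e · m); the "wrong" models are deliberately crude stand-ins, and the "right" ones are taken as known rather than fitted, purely to isolate the robustness property.

```r
# Illustrative check of double robustness; the DGP and model choices are assumptions
set.seed(2)
n      <- 20000
x      <- rnorm(n)
e_true <- plogis(0.5 * x)            # true propensity score
a      <- rbinom(n, 1, e_true)       # treatment indicator
y      <- 1 + 2 * x + rnorm(n)       # stands in for Y(1); E[Y(1)] = 1 by construction

# AIPW estimator of E[Y(1)]; y enters only through a * y, i.e. for treated units
aipw <- function(e_hat, m_hat) {
  mean(a * y / e_hat - (a - e_hat) / e_hat * m_hat)
}

e_wrong <- rep(mean(a), n)           # misspecified propensity model: ignores x
m_right <- 1 + 2 * x                 # correct outcome model (taken as known)
m_wrong <- rep(0, n)                 # misspecified outcome model

dr_outcome_ok    <- aipw(e_wrong, m_right)  # propensity wrong, outcome right
dr_propensity_ok <- aipw(e_true,  m_wrong)  # propensity right, outcome wrong
dr_both_wrong    <- aipw(e_wrong, m_wrong)  # neither right: no guarantee
```

The first two estimates should land near 1, while the third reduces to the confounded mean among the treated and drifts away from it.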
Pattern 3: Asymptotic Result Transfer
Template: Asymptotic theory from simpler to complex setting
Example: Influence function theory → semiparametric mediation
Source: IF for smooth functional of CDF
√n(T(Fₙ) - T(F)) → N(0, E[φ²])
Target: IF for mediation effect functional
Requires: mediation-specific tangent space
Mapping:
- General functional → Mediation estimand
- CDF → Joint distribution (Y,M,A,X)
- Generic IF → Mediation-specific IF
Key insight: EIF theory applies to any pathwise differentiable functional
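As a simple instance of the source side of this pattern, consider the variance functional T(F) = Var(X), whose influence function is φ(x) = (x − μ)² − σ². The sketch below, on simulated Exp(1) data chosen purely for illustration, uses the estimated influence function to get a standard error and a Wald interval.

```r
# Influence-function-based inference for the variance functional (illustrative)
set.seed(7)
x <- rexp(5000)                        # Exp(1) data, so Var(X) = 1

mu_hat <- mean(x)
t_hat  <- mean((x - mu_hat)^2)         # plug-in estimate T(F_n)

# Estimated influence function values: phi(x) = (x - mu)^2 - sigma^2
phi <- (x - mu_hat)^2 - t_hat

# Asymptotic variance is E[phi^2]; the standard error follows by dividing by n
se_if <- sqrt(mean(phi^2) / length(x))
ci    <- t_hat + c(-1, 1) * qnorm(0.975) * se_if
```

Transferring this recipe to a mediation functional keeps the last two lines unchanged; what has to be re-derived is `phi`.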
Pattern 4: Identification Strategy Transfer
Template: Identification approach from one causal setting to another
Example: IV from economics → Mendelian randomization
Source: Instrumental variables for demand estimation
Z → A → Y, Z ⫫ U
Target: MR for causal effects of exposures
Gene → Biomarker → Outcome
Mapping:
- Price instrument → Genetic variant
- Demand → Exposure level
- Endogeneity → Confounding
Key insight: Exogenous variation strategy is general
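The shared identification logic can be sketched with a single-instrument ratio (Wald) estimator. The simulation below is an assumption-laden toy: the instrument is an allele count, the unmeasured confounder `u` affects both exposure and outcome, and the true causal effect is set to 1.

```r
# Toy Mendelian-randomization-style simulation; all quantities are illustrative
set.seed(3)
n <- 20000
u <- rnorm(n)                        # unmeasured confounder
z <- rbinom(n, 2, 0.3)               # instrument (allele count), independent of u
a <- 1.0 * z + u + rnorm(n)          # exposure depends on instrument and confounder
y <- 1.0 * a + u + rnorm(n)          # outcome; true causal effect of a on y is 1

# Ordinary least squares is confounded by u
ols <- unname(coef(lm(y ~ a))["a"])

# Ratio (Wald) IV estimator: effect of z on y divided by effect of z on a
wald <- cov(z, y) / cov(z, a)
```

The same two lines of estimation apply whether `z` is a price shifter or a genetic variant; what changes between fields is the argument for why `z` is independent of `u` and affects `y` only through `a`.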
Pattern 5: Computational Method Transfer
Template: Algorithm from optimization → statistical estimation
Example: SGD from ML → online causal estimation
Source: Stochastic gradient descent for ERM
θₜ₊₁ = θₜ - ηₜ∇L(θₜ; Xₜ)
Target: Online updating for streaming causal data
Sequential estimation as data arrives
Mapping:
- Loss function → Estimating equation
- Gradient → Score contribution
- Learning rate → Weighting scheme
Key insight: Streaming updates possible for M-estimators
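The simplest M-estimator, the mean, shows how the learning rate acts as a weighting scheme. In this sketch (simulated data, squared-error loss chosen for illustration), an SGD update with step size 1/t reproduces the running sample mean exactly.

```r
# Streaming estimation of a mean via SGD (illustrative)
set.seed(4)
x <- rnorm(5000, mean = 3)

theta <- 0
for (t in seq_along(x)) {
  eta   <- 1 / t                       # learning rate = weight given to the new point
  grad  <- theta - x[t]                # gradient of 0.5 * (theta - x_t)^2 in theta
  theta <- theta - eta * grad          # SGD step; equals the mean of x[1:t]
}
```

With other step-size schedules the iterate no longer equals the sample mean, which is where the statistical properties of the streaming estimator have to be re-derived rather than assumed.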
Transfer Verification Checklist
Theoretical Checks
- Identification preserved: Estimand still identified under adapted assumptions
- Consistency maintained: Proof carries over or new proof provided
- Rate preserved: Convergence rate same or characterized
- Variance characterized: Influence function derived if applicable
- Efficiency understood: Know if/when efficient
Practical Checks
- Computable: Can actually implement the adapted method
- Stable: Numerical issues don't prevent use
- Scalable: Works at relevant data sizes
Simulation Checks
- Correct at truth: Estimator unbiased when DGP matches assumptions
- Proper coverage: CIs achieve nominal coverage
- Efficiency comparison: Compared to alternatives
- Robustness: Behavior under assumption violations
Documentation Checks
- Assumptions clear: All requirements stated
- Limitations stated: Known failure modes documented
- Guidance provided: When to use/not use
Common Transfer Pitfalls
Pitfall 1: Hidden Assumption Dependence
Problem: Source method relies on assumption not explicit in exposition
Example: Many ML methods implicitly assume iid data
- Transfer to clustered data fails silently
- Variance underestimated, inference invalid
Prevention:
- Read proofs, not just statements
- Check what each step requires
- Simulate under violations
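"Simulate under violations" can be as short as the following sketch, which assumes a random-intercept clustering structure invented for illustration. It compares the iid standard error of a mean with the spread the estimator actually shows across replications, alongside a cluster-level standard error.

```r
# Silent failure of an iid standard error under clustering (illustrative DGP)
set.seed(5)
n_clusters <- 50
m          <- 20                         # observations per cluster

sim_once <- function() {
  b  <- rnorm(n_clusters)                # shared cluster effect
  id <- rep(seq_len(n_clusters), each = m)
  y  <- b[id] + rnorm(n_clusters * m)
  cluster_means <- tapply(y, id, mean)
  c(est        = mean(y),
    se_iid     = sd(y) / sqrt(length(y)),              # ignores clustering
    se_cluster = sd(cluster_means) / sqrt(n_clusters)) # treats clusters as units
}

sims <- replicate(500, sim_once())       # 3 x 500 matrix with named rows

empirical_sd    <- sd(sims["est", ])     # actual sampling variability
mean_se_iid     <- mean(sims["se_iid", ])
mean_se_cluster <- mean(sims["se_cluster", ])
```

The point estimate is fine in both cases; only the comparison of `mean_se_iid` with `empirical_sd` reveals that the iid formula understates the uncertainty.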
Pitfall 2: Changed Meaning
Problem: Same symbol/concept means different things
Example: "Independence" in different fields
- Statistical independence: P(A,B) = P(A)P(B)
- Causal independence: No causal pathway
- Conditional independence: Given covariates
Prevention:
- Define all terms explicitly
- Verify mathematical equivalence
- Don't assume same word = same concept
Pitfall 3: Lost Efficiency
Problem: Method transfers but loses optimality properties
Example: MLE transferred to semiparametric setting
- Parametric MLE is efficient
- Plugging into semiparametric problem: no longer efficient
- Need to derive new efficient estimator
Prevention:
- Re-derive efficiency in target setting
- Don't assume optimality transfers
- Compare to efficiency bound
Pitfall 4: Computational Invalidity
Problem: Algorithm doesn't work in new setting
Example: Newton-Raphson for optimization
- Works when Hessian well-behaved
- In ill-conditioned problems: numerical disaster
Prevention:
- Test on representative problems
- Check condition numbers, stability
- Have fallback algorithms
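A condition-number check is cheap to run before committing to a Newton-type solver. The sketch below assumes a least-squares problem, where the Hessian is X'X, and compares a nearly collinear design with a well-behaved one; the threshold of 1e8 is an arbitrary illustrative choice, not a recommended default.

```r
# Checking Hessian conditioning before a Newton step (illustrative)
set.seed(6)
x1 <- rnorm(100)
x2 <- x1 + rnorm(100, sd = 1e-6)          # nearly collinear with x1

H_bad  <- crossprod(cbind(1, x1, x2))          # X'X for the ill-conditioned design
H_good <- crossprod(cbind(1, x1, rnorm(100)))  # X'X for an independent regressor

cond_bad  <- kappa(H_bad,  exact = TRUE)  # 2-norm condition number via SVD
cond_good <- kappa(H_good, exact = TRUE)

# Simple guard: only take plain Newton steps when the Hessian is well-conditioned
use_newton <- cond_bad < 1e8
```

When the guard fails, the fallback might be a regularized or first-order method; which one is appropriate depends on the target problem.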
Pitfall 5: False Generalization
Problem: Transfer works for one case, claimed general
Example: Method for binary → continuous
- Test case: continuous Y is approximately binary
- Claim: works for all continuous Y
- Reality: fails for skewed/heavy-tailed
Prevention:
- Test diverse scenarios
- Characterize where it works
- State limitations clearly
Transfer Feasibility Assessment
Quick Assessment Questions
| Question | If No | If Yes |
|---|---|---|
| Same mathematical structure? | Major adaptation needed | Direct transfer possible |
| All assumptions translatable? | Some properties lost | Full transfer possible |
| Same data requirements? | Additional modeling needed | Straightforward application |
| Existing theory applicable? | New proofs required | Theory transfers |
| Similar computational structure? | Algorithm redesign | Code adaptation |
Feasibility Score
For each dimension, score 1-5:
| Dimension | Score | Interpretation |
|---|---|---|
| Structural similarity | __ /5 | 5 = identical structure |
| Assumption compatibility | __ /5 | 5 = all assumptions transfer |
| Theoretical portability | __ /5 | 5 = proofs carry over |
| Computational similarity | __ /5 | 5 = same algorithm works |
| Value added | __ /5 | 5 = major improvement |
Total: __/25
- 20-25: Strong transfer candidate
- 15-19: Feasible with moderate effort
- 10-14: Significant adaptation required
- <10: May need different approach
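The scoring rubric is easy to encode. A minimal sketch, where the function name `score_feasibility` and its argument names are hypothetical and the cutoffs are the ones listed above:

```r
# Hypothetical helper that totals the five dimension scores and applies the cutoffs
score_feasibility <- function(structural, assumptions, theory, computation, value) {
  scores <- c(structural, assumptions, theory, computation, value)
  stopifnot(all(scores %in% 1:5))        # each dimension is scored 1-5

  total <- sum(scores)
  verdict <- if (total >= 20) {
    "Strong transfer candidate"
  } else if (total >= 15) {
    "Feasible with moderate effort"
  } else if (total >= 10) {
    "Significant adaptation required"
  } else {
    "May need different approach"
  }
  list(total = total, verdict = verdict)
}

example_score <- score_feasibility(4, 4, 3, 3, 5)
```

For the example call the total is 19, which falls in the "Feasible with moderate effort" band.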
Integration with Other Skills
This skill works with:
- cross-disciplinary-ideation - Find candidate methods to transfer
- literature-gap-finder - Identify where transfer would be valuable
- proof-architect - Verify transferred properties
- identification-theory - Ensure identification in target setting
- asymptotic-theory - Derive properties in target setting
- simulation-architect - Validate the transfer
Key References
On Method Transfer
- Box, G.E.P. (1976). Science and statistics (on borrowing strength)
- Breiman, L. (2001). Statistical modeling: The two cultures
Successful Transfer Examples
- Rosenbaum & Rubin (1983). Central role of propensity score [survey → causal]
- Tibshirani (1996). Regression shrinkage via lasso [signals → regression]
- Robins et al. (1994). Estimation of regression coefficients [missing → causal]
Transfer in Causal Inference
- Pearl, J. (2009). Causality [AI → statistics]
- Hernán & Robins (2020). Causal Inference: What If
Version: 1.0
Created: 2025-12-08
Domain: Method Development, Research Innovation