method-transfer-engine

Method Transfer Engine

Rigorous framework for adapting statistical methods across domains and settings
Use this skill when: adapting a method from one field to another, extending a method to a new setting, formalizing an intuitive connection between methods, or verifying that a transferred method retains its properties.

The Transfer Framework

What is Method Transfer?

Taking a technique that works in Setting A and adapting it to work in Setting B, while:
  • Preserving desirable theoretical properties
  • Identifying what changes are needed
  • Understanding what can and cannot transfer

Transfer Quality Spectrum

Direct Application → Minor Adaptation → Major Modification → Inspired-By
      │                    │                   │                  │
   Same theory         Adjust for          Rewrite theory      New method,
   applies            new setting          for new setting     similar spirit

Transfer Success Criteria

A successful transfer must:
  1. Solve the target problem - Method actually helps in new setting
  2. Preserve key properties - Consistency, efficiency, robustness transfer
  3. Have clear assumptions - Know what's required in new setting
  4. Be verifiable - Can prove/simulate that it works
  5. Add value - Better than existing approaches

The 6-Phase Protocol

This protocol provides a systematic approach to method transfer, covering all critical steps from source extraction through validation.

Source Extraction

Goal: Extract the core mathematical and algorithmic essence of the source method
```r
# Template for source method extraction
extract_source_method <- function(method_name, reference) {
  list(
    name = method_name,
    estimand = "formal expression of what is estimated",
    estimator = "formula for the estimator",
    assumptions = c("A1: condition", "A2: condition"),
    properties = c("consistency", "asymptotic normality"),
    algorithm = c("Step 1: ...", "Step 2: ..."),
    complexity = "O(n^2) or similar"
  )
}

# Example: Extract Lasso from signal processing
lasso_extraction <- list(
  name = "Lasso/Basis Pursuit",
  field = "Signal Processing / Compressed Sensing",
  estimand = "argmin ||y - Xb||_2^2 + lambda * ||b||_1",
  key_insight = "L1 penalty induces sparsity via soft thresholding",
  assumptions = c("RIP condition", "Incoherence"),
  properties = c("Sparse solution", "Variable selection consistency")
)
```

Abstraction

Goal: Identify the abstract mathematical structure that enables the method
```r
# Abstract structure identification
identify_abstraction <- function(source_method) {
  list(
    mathematical_structure = "e.g., M-estimation, U-statistics, kernels",
    core_operation = "e.g., reweighting, regularization, projection",
    information_used = "e.g., first moments, covariance, distributional",
    key_invariance = "what property makes it work",
    generalization_path = "how to extend beyond original setting"
  )
}

# Example: Abstraction of propensity score methods
propensity_abstraction <- list(
  mathematical_structure = "Reweighting to balance distributions",
  core_operation = "Inverse probability weighting",
  invariance = "Balances covariate distribution across groups",
  generalization = "Any selection mechanism with known probabilities"
)
```

---

Phase 1: Source Method Analysis

Goal: Deeply understand what you're transferring

Source Method Profile

Basic Information

  • Name: [Method name]
  • Source field: [Domain/area]
  • Key reference: [Citation]
  • What it does: [One sentence]

Problem Solved

  • Input: [What data/information goes in]
  • Output: [What estimate/inference comes out]
  • Setting: [When it applies]

Mathematical Structure

  • Estimand: [What it estimates, formally]
  • Estimator: [How it estimates, formula]
  • Loss/objective: [What it optimizes]

Assumptions Required

  1. [Assumption 1]: [Mathematical statement]
    • Why needed: [Role in proof/method]
    • When violated: [Failure mode]

Theoretical Properties

  • Consistency: [When/how proved]
  • Rate: [Convergence rate]
  • Asymptotic distribution: [If known]
  • Efficiency: [Relative to what]
  • Robustness: [To what violations]

Computational Aspects

  • Algorithm: [How implemented]
  • Complexity: [Time/space]
  • Software: [Available implementations]

Phase 2: Target Problem Analysis

Goal: Understand where you want to apply it

Target Problem Profile

Basic Information

  • Problem name: [Description]
  • Target field: [Domain/area]
  • Motivation: [Why solve this]

Problem Structure

  • Data available: [What's observed]
  • Estimand: [What you want to estimate]
  • Challenges: [Why existing methods are inadequate]

Current Approaches

  • Method 1: [Name, limitations]
  • Method 2: [Name, limitations]
  • Gap: [What's missing]

Constraints

  • Assumptions willing to make: [List]
  • Assumptions NOT willing to make: [List]
  • Computational constraints: [If any]

Target Mapping

Goal: Map source concepts to their target domain counterparts
```r
# Target mapping framework
create_target_mapping <- function(source, target) {
  mapping <- list(
    objects = data.frame(
      source = c("treatment", "outcome", "confounder"),
      target = c("mediator", "effect", "moderator"),
      relationship = c("direct", "indirect", "modifies")
    ),
    assumptions = data.frame(
      source_assumption = c("SUTVA", "Ignorability"),
      target_version = c("Consistency", "Sequential ignorability"),
      status = c("transfers", "needs modification")
    )
  )
  mapping
}

# Example: IV to Mendelian randomization mapping
iv_to_mr <- list(
  price_instrument = "genetic_variant",
  demand = "biomarker_exposure",
  endogeneity = "unmeasured_confounding",
  # The exclusion restriction corresponds to the absence of horizontal pleiotropy
  exclusion = "no_horizontal_pleiotropy",
  key_difference = "biological vs economic mechanisms"
)
```

Phase 3: Structure Mapping

Goal: Identify correspondences between source and target

Structure Map

Object Correspondence

| Source | Target | Notes |
| --- | --- | --- |
| [Source object 1] | [Target object 1] | [How they relate] |
| [Source object 2] | [Target object 2] | [How they relate] |
| ... | ... | ... |

Assumption Correspondence

| Source Assumption | Target Version | Status |
| --- | --- | --- |
| [Source A1] | [Target A1'] | ✓ Transfers / ✗ Fails / ? Modify |
| [Source A2] | [Target A2'] | ... |
| ... | ... | ... |

What Transfers Directly

  • [Property 1]: Because [reason]
  • [Property 2]: Because [reason]

What Needs Modification

  • [Element 1]: From [source version] to [target version]
    • Why: [Reason for change]
    • How: [Specific modification]

What Doesn't Transfer

  • [Element 1]: Because [reason]
    • Impact: [What we lose]
    • Alternative: [How to address]

Gap Analysis

Goal: Identify what doesn't transfer and what modifications are needed
```r
# Gap analysis framework
analyze_transfer_gaps <- function(source, target, mapping) {
  gaps <- list(
    assumption_gaps = list(
      violated = c("iid assumption in clustered data"),
      modified = c("independence -> conditional independence"),
      new_required = c("mediator positivity")
    ),

    property_gaps = list(
      lost = c("efficiency under misspecification"),
      weakened = c("convergence rate n^{-1/2} -> n^{-1/4}"),
      preserved = c("consistency", "asymptotic normality")
    ),

    computational_gaps = list(
      new_challenges = c("non-convex optimization"),
      workarounds = c("ADMM algorithm", "approximate methods")
    ),

    bridging_strategies = c(
      "Add regularization for new setting",
      "Derive modified variance estimator",
      "Implement robustness check"
    )
  )
  gaps
}
```

Phase 4: Adaptation Design

Goal: Design the transferred method

Adapted Method Design

Overview

[One paragraph describing the adapted method]

Formal Definition

Estimand: $$\psi = [target estimand formula]$$
Estimator: $$\hat{\psi}_n = [adapted estimator formula]$$
Algorithm:
  1. [Step 1]
  2. [Step 2]
  3. ...

Modified Assumptions

  1. [Assumption A1']: [New statement for target setting]
    • Analogous to: [Source assumption]
    • Modified because: [Reason]

Expected Properties

  • Consistency: [Conjecture/claim]
  • Rate: [Expected]
  • Efficiency: [Expected]

Key Differences from Source

Validation

Goal: Systematically verify the transferred method works correctly
```r
# Comprehensive validation framework for method transfer.
# The helpers called here (run_bias_simulation(), run_coverage_simulation(),
# compare_to_alternatives(), test_assumption_violations(), test_edge_cases(),
# generate_recommendations()) are placeholders to be supplied by the user;
# each check should return a list with a logical element `passed`.
validate_transfer <- function(adapted_method, n_sims = 1000) {
  results <- list()

  # 1. Bias check: Is estimator unbiased at truth?
  results$bias <- run_bias_simulation(adapted_method, n_sims)

  # 2. Coverage check: Do CIs achieve nominal coverage?
  results$coverage <- run_coverage_simulation(adapted_method, n_sims)

  # 3. Efficiency check: Compare to alternatives
  results$efficiency <- compare_to_alternatives(adapted_method)

  # 4. Robustness check: Behavior under violations
  results$robustness <- test_assumption_violations(adapted_method)

  # 5. Edge cases: Extreme scenarios
  results$edge_cases <- test_edge_cases(adapted_method)

  # Validation report
  list(
    passed = all(vapply(results, function(x) isTRUE(x$passed), logical(1))),
    details = results,
    recommendations = generate_recommendations(results)
  )
}

# Simulation template for validation.
# adapted_method(data) must return a list with elements `point` and `se`;
# generate_dgp(n) must return one simulated data set; true_value is the
# value of the estimand under that data-generating process.
run_transfer_validation <- function(adapted_method, generate_dgp, true_value,
                                    n = 500, n_sims = 1000) {
  estimates <- replicate(n_sims, {
    data <- generate_dgp(n)       # Generate data under true model
    est <- adapted_method(data)   # Apply transferred method
    c(estimate = est$point, se = est$se)
  })

  list(
    bias = mean(estimates["estimate", ]) - true_value,
    rmse = sqrt(mean((estimates["estimate", ] - true_value)^2)),
    coverage = mean(
      abs(estimates["estimate", ] - true_value) < 1.96 * estimates["se", ]
    )
  )
}
```

Phase 5: Verification

Goal: Prove/demonstrate the transfer works

Verification Plan

Theoretical Verification

  • Consistency proof
    • Approach: [Proof strategy]
    • Key lemma: [What needs to be shown]
  • Asymptotic normality
    • Approach: [Proof strategy]
    • Influence function: [If applicable]
  • Efficiency (if claiming)
    • Approach: [Efficiency bound derivation]

Simulation Verification

  • Scenario 1: [Description]
    • DGP: [Data generating process]
    • Expected result: [What should happen]
  • Scenario 2: Comparison to oracle
    • Purpose: [Verify optimality]
  • Scenario 3: Stress test
    • Purpose: [Find failure modes]

Empirical Verification

  • Benchmark dataset: [If available]
  • Real application: [Domain]

Phase 6: Documentation

Goal: Document for publication

Transfer Documentation

Contribution Statement

"We adapt [source method] from [source field] to [target setting] by [key modification]. Our adapted method [key property]. Unlike [alternative], our approach [advantage]."

Theoretical Contribution

  • New result 1: [Theorem statement]
  • New result 2: [If applicable]

Methodological Contribution

  • Adaptation insight: [What's novel about the transfer]
  • Practical guidance: [When to use]

What We Learned

  • About source method: [New understanding]
  • About target problem: [New understanding]
  • General principle: [Broader insight]

---

Common Transfer Patterns

Pattern 1: Estimator Family Transfer

Template: Estimator type from one setting to another
Example: IPW from survey sampling → causal inference
Source: Horvitz-Thompson estimator
        E[Y] ≈ (1/N) Σᵢ∈S Yᵢ/πᵢ where πᵢ = P(unit i selected), S is the sample, N the population size

Target: IPW for ATE
        E[Y(1)] ≈ (1/n) Σᵢ Yᵢ·Aᵢ/e(Xᵢ) where e(x) = P(A=1|X=x)

Mapping:
- Selection indicator → Treatment indicator
- Selection probability → Propensity score
- Survey weights → Inverse propensity weights

Key insight: Both correct for selection bias via reweighting
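
The shared reweighting step can be sketched in R. This is a minimal illustration on simulated data rather than a reference implementation: the function names `ht_mean` and `ipw_mean_treated`, the data-generating process, and the assumption that the propensity score is known exactly are all hypothetical choices made for the example.

```r
# Horvitz-Thompson-style estimate of a population mean: sampled units are
# weighted by the inverse of their (known) selection probability.
ht_mean <- function(y, selected, pi) {
  sum(y[selected] / pi[selected]) / length(y)
}

# IPW estimate of E[Y(1)]: the treatment indicator replaces the selection
# indicator and the propensity score replaces the selection probability.
ipw_mean_treated <- function(y, a, e) {
  mean(a * y / e)
}

set.seed(1)
n  <- 5000
x  <- rnorm(n)
e  <- plogis(0.5 * x)        # propensity score, assumed known (hypothetical DGP)
a  <- rbinom(n, 1, e)
y1 <- 1 + x + rnorm(n)       # potential outcome under treatment, E[Y(1)] = 1
y  <- ifelse(a == 1, y1, 0)  # Y(1) is only observed for treated units

naive <- mean(y[a == 1])            # biased: treated units tend to have larger x
ipw   <- ipw_mean_treated(y, a, e)  # reweighted estimate of E[Y(1)]
ht    <- ht_mean(y, a == 1, e)      # same quantity in survey-sampling notation
```

Here `ipw` and `ht` are the same number computed under two names, which is the point of the mapping: only the interpretation of the indicator and the probability changes.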

Pattern 2: Robustness Property Transfer

Template: Robustness technique from one method to another
Example: Double robustness from missing data → causal inference
Source: Augmented IPW for missing data
        DR = IPW + Imputation - (IPW × Imputation)

Target: AIPW for causal effects
        Same structure but for counterfactual outcomes

Mapping:
- Missing indicator → Treatment indicator
- Missingness model → Propensity model
- Imputation model → Outcome model

Key insight: Product-form bias enables robustness to one misspecification
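
A small simulation can make the double-robustness claim concrete. The sketch below assumes a hypothetical data-generating process and uses fixed, hand-specified nuisance models (correct or deliberately wrong) rather than fitted ones; `aipw_mean_treated` is an illustrative name, not an existing function.

```r
# AIPW estimate of E[Y(1)]: outcome-model prediction plus an inverse-weighted
# correction based on the residuals of treated units.
aipw_mean_treated <- function(y, a, e_hat, m_hat) {
  mean(m_hat + a * (y - m_hat) / e_hat)
}

set.seed(2)
n <- 5000
x <- rnorm(n)
e <- plogis(0.5 * x)            # true propensity score (hypothetical DGP)
a <- rbinom(n, 1, e)
y <- a * (1 + x + rnorm(n))     # Y(1) for treated units, 0 otherwise; E[Y(1)] = 1

m_true  <- 1 + x                # correct outcome model
m_wrong <- rep(0, n)            # misspecified outcome model
e_wrong <- rep(0.5, n)          # misspecified propensity model

est_outcome_ok    <- aipw_mean_treated(y, a, e_wrong, m_true)
est_propensity_ok <- aipw_mean_treated(y, a, e, m_wrong)
est_both_wrong    <- aipw_mean_treated(y, a, e_wrong, m_wrong)
```

With either nuisance model correct the estimate stays near the true value of 1; with both wrong it drifts away, which is what "robustness to one misspecification" means in practice.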

Pattern 3: Asymptotic Result Transfer

Template: Asymptotic theory from simpler to complex setting
Example: Influence function theory → semiparametric mediation
Source: IF for smooth functional of CDF
        √n(T(Fₙ) - T(F)) → N(0, E[φ²])

Target: IF for mediation effect functional
        Requires: mediation-specific tangent space

Mapping:
- General functional → Mediation estimand
- CDF → Joint distribution (Y,M,A,X)
- Generic IF → Mediation-specific IF

Key insight: EIF theory applies to any pathwise differentiable functional
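
To illustrate the source result on a simple case, the sketch below uses the variance functional T(F) = Var(X), whose influence function is φ(x) = (x − μ)² − σ². The choice of functional and the name `if_variance` are assumptions made for the example; the mediation functional itself would need its own derivation.

```r
# Plug-in estimate of T(F) = Var(X) with an influence-function-based
# standard error, using sqrt(n) (T(F_n) - T(F)) -> N(0, E[phi^2]).
if_variance <- function(x) {
  n      <- length(x)
  mu     <- mean(x)
  sigma2 <- mean((x - mu)^2)      # plug-in estimate T(F_n)
  phi    <- (x - mu)^2 - sigma2   # estimated influence function values
  list(estimate = sigma2, se = sqrt(mean(phi^2) / n))
}

set.seed(3)
x   <- rnorm(5000, mean = 1, sd = 2)   # true variance is 4
fit <- if_variance(x)
ci  <- fit$estimate + c(-1, 1) * qnorm(0.975) * fit$se
```

For normal data the asymptotic variance E[φ²] equals 2σ⁴, so the estimated standard error should be close to √2·σ²/√n.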

Pattern 4: Identification Strategy Transfer

Template: Identification approach from one causal setting to another
Example: IV from economics → Mendelian randomization
Source: Instrumental variables for demand estimation
        Z → A → Y, Z ⫫ U

Target: MR for causal effects of exposures
        Gene → Biomarker → Outcome

Mapping:
- Price instrument → Genetic variant
- Demand → Exposure level
- Endogeneity → Confounding

Key insight: Exogenous variation strategy is general
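
The identification strategy can be sketched with a single-instrument Wald ratio. The simulated variant, confounder, and effect sizes below are hypothetical, and the sketch assumes a valid instrument (relevant, independent of the confounder, and affecting the outcome only through the exposure).

```r
set.seed(4)
n <- 10000
z <- rbinom(n, 2, 0.3)          # instrument, e.g. allele count at one variant
u <- rnorm(n)                   # unmeasured confounder
a <- z + u + rnorm(n)           # exposure
y <- 2 * a + 3 * u + rnorm(n)   # outcome; true causal effect of a is 2

ols_est  <- unname(coef(lm(y ~ a))["a"])  # confounded by u
wald_est <- cov(z, y) / cov(z, a)         # IV (Wald ratio) estimate
```

The regression coefficient absorbs the confounder's contribution, while the ratio of covariances uses only the variation in the exposure that is driven by the instrument.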

Pattern 5: Computational Method Transfer

Template: Algorithm from optimization → statistical estimation
Example: SGD from ML → online causal estimation
Source: Stochastic gradient descent for ERM
        θₜ₊₁ = θₜ - ηₜ∇L(θₜ; Xₜ)

Target: Online updating for streaming causal data
        Sequential estimation as data arrives

Mapping:
- Loss function → Estimating equation
- Gradient → Score contribution
- Learning rate → Weighting scheme

Key insight: Streaming updates possible for M-estimators
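
The simplest instance of this mapping is the sample mean, the M-estimator solving Σ(xₜ − θ) = 0. A minimal sketch, assuming squared-error loss and learning rate ηₜ = 1/t (the function name `online_mean` is made up for the example):

```r
# Streaming estimate of a mean via SGD on the loss 0.5 * (x_t - theta)^2.
online_mean <- function(stream) {
  theta <- 0
  for (t in seq_along(stream)) {
    eta   <- 1 / t               # learning rate eta_t
    grad  <- theta - stream[t]   # gradient of the loss in theta
    theta <- theta - eta * grad  # SGD update
  }
  theta
}

set.seed(5)
x <- rnorm(1000, mean = 3)
theta_stream <- online_mean(x)
```

With this learning rate each update reduces to the running-average recursion, so `theta_stream` matches `mean(x)` up to floating-point error without the full data ever being held in memory.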

Transfer Verification Checklist

Theoretical Checks

  • Identification preserved: Estimand still identified under adapted assumptions
  • Consistency maintained: Proof carries over or new proof provided
  • Rate preserved: Convergence rate same or characterized
  • Variance characterized: Influence function derived if applicable
  • Efficiency understood: Know if/when efficient

Practical Checks

  • Computable: Can actually implement the adapted method
  • Stable: Numerical issues don't prevent use
  • Scalable: Works at relevant data sizes

Simulation Checks

  • Correct at truth: Estimator unbiased when DGP matches assumptions
  • Proper coverage: CIs achieve nominal coverage
  • Efficiency comparison: Compared to alternatives
  • Robustness: Behavior under assumption violations

Documentation Checks

  • Assumptions clear: All requirements stated
  • Limitations stated: Known failure modes documented
  • Guidance provided: When to use/not use

Common Transfer Pitfalls

Pitfall 1: Hidden Assumption Dependence

Problem: Source method relies on assumption not explicit in exposition
Example: Many ML methods implicitly assume iid data
  • Transfer to clustered data fails silently
  • Variance underestimated, inference invalid
Prevention:
  • Read proofs, not just statements
  • Check what each step requires
  • Simulate under violations
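
The "simulate under violations" step can be as small as the sketch below, which assumes a hypothetical random-intercept design with 50 clusters of 20 units and compares a standard error that treats rows as iid with one that treats clusters as the independent units.

```r
set.seed(6)
n_clusters <- 50
m          <- 20                                # units per cluster
cluster    <- rep(seq_len(n_clusters), each = m)
b          <- rnorm(n_clusters)                 # shared cluster-level effect
y          <- b[cluster] + rnorm(n_clusters * m)

# Standard error of the overall mean, treating all 1000 rows as iid
naive_se <- sd(y) / sqrt(length(y))

# Standard error of the same mean, treating the 50 clusters as the units
cluster_se <- sd(tapply(y, cluster, mean)) / sqrt(n_clusters)
```

In this design the naive standard error is several times too small, so nominal confidence intervals built from it would undercover without any visible error.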

Pitfall 2: Changed Meaning

Problem: Same symbol/concept means different things
Example: "Independence" in different fields
  • Statistical independence: P(A,B) = P(A)P(B)
  • Causal independence: No causal pathway
  • Conditional independence: Given covariates
Prevention:
  • Define all terms explicitly
  • Verify mathematical equivalence
  • Don't assume same word = same concept

Pitfall 3: Lost Efficiency

Problem: Method transfers but loses optimality properties
Example: MLE transferred to semiparametric setting
  • Parametric MLE is efficient
  • Plugging into semiparametric problem: no longer efficient
  • Need to derive new efficient estimator
Prevention:
  • Re-derive efficiency in target setting
  • Don't assume optimality transfers
  • Compare to efficiency bound

Pitfall 4: Computational Invalidity

Problem: Algorithm doesn't work in new setting
Example: Newton-Raphson for optimization
  • Works when Hessian well-behaved
  • In ill-conditioned problems: numerical disaster
Prevention:
  • Test on representative problems
  • Check condition numbers, stability
  • Have fallback algorithms
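
One way to combine the condition-number check with a fallback is sketched below. The threshold `max_kappa`, the step size, and the function name `safe_newton_step` are arbitrary illustrative choices, and a plain gradient step is only one of several possible fallbacks.

```r
# Take a Newton step when the Hessian is well conditioned; otherwise fall
# back to a small gradient step instead of solving an unstable linear system.
safe_newton_step <- function(grad, hess, max_kappa = 1e8, step_size = 1e-2) {
  if (kappa(hess, exact = TRUE) > max_kappa) {
    return(list(step = -step_size * grad, method = "gradient"))
  }
  list(step = -solve(hess, grad), method = "newton")
}

grad <- c(1, 1)
well <- safe_newton_step(grad, diag(c(2, 1)))       # condition number 2
ill  <- safe_newton_step(grad, diag(c(1, 1e-12)))   # condition number 1e12
```

Returning the method used alongside the step makes it easy to log how often the fallback fires on representative problems.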

Pitfall 5: False Generalization

Problem: Transfer works for one case, claimed general
Example: Method for binary → continuous
  • Test case: continuous Y is approximately binary
  • Claim: works for all continuous Y
  • Reality: fails for skewed/heavy-tailed
Prevention:
  • Test diverse scenarios
  • Characterize where it works
  • State limitations clearly

Transfer Feasibility Assessment

Quick Assessment Questions

| Question | If No | If Yes |
| --- | --- | --- |
| Same mathematical structure? | Major adaptation needed | Direct transfer possible |
| All assumptions translatable? | Some properties lost | Full transfer possible |
| Same data requirements? | Additional modeling needed | Straightforward application |
| Existing theory applicable? | New proofs required | Theory transfers |
| Similar computational structure? | Algorithm redesign | Code adaptation |

Feasibility Score

For each dimension, score 1-5:

| Dimension | Score | Interpretation |
| --- | --- | --- |
| Structural similarity | __ /5 | 5 = identical structure |
| Assumption compatibility | __ /5 | 5 = all assumptions transfer |
| Theoretical portability | __ /5 | 5 = proofs carry over |
| Computational similarity | __ /5 | 5 = same algorithm works |
| Value added | __ /5 | 5 = major improvement |

Total: __/25
  • 20-25: Strong transfer candidate
  • 15-19: Feasible with moderate effort
  • 10-14: Significant adaptation required
  • <10: May need different approach
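
The scoring rule can be written as a small helper. The function name `feasibility_score` and the example scores are hypothetical; the band cutoffs are the ones listed above.

```r
# Sum five 1-5 dimension scores and map the total to its interpretation band.
feasibility_score <- function(scores) {
  stopifnot(length(scores) == 5, all(scores >= 1 & scores <= 5))
  total <- sum(scores)
  band <- if (total >= 20) {
    "Strong transfer candidate"
  } else if (total >= 15) {
    "Feasible with moderate effort"
  } else if (total >= 10) {
    "Significant adaptation required"
  } else {
    "May need different approach"
  }
  list(total = total, band = band)
}

example <- feasibility_score(c(
  structural    = 4,
  assumptions   = 3,
  theory        = 3,
  computational = 5,
  value_added   = 4
))
```

For these example scores the total is 19, which falls in the "Feasible with moderate effort" band.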

Integration with Other Skills

This skill works with:
  • cross-disciplinary-ideation - Find candidate methods to transfer
  • literature-gap-finder - Identify where transfer would be valuable
  • proof-architect - Verify transferred properties
  • identification-theory - Ensure identification in target setting
  • asymptotic-theory - Derive properties in target setting
  • simulation-architect - Validate the transfer

Key References

On Method Transfer

  • Box, G.E.P. (1976). Science and statistics (on borrowing strength)
  • Breiman, L. (2001). Statistical modeling: The two cultures

Successful Transfer Examples

  • Rosenbaum & Rubin (1983). Central role of propensity score [survey → causal]
  • Tibshirani (1996). Regression shrinkage via lasso [signals → regression]
  • Robins et al. (1994). Estimation of regression coefficients [missing → causal]

Transfer in Causal Inference

  • Pearl, J. (2009). Causality [AI → statistics]
  • Hernán & Robins (2020). Causal Inference: What If

Version: 1.0 Created: 2025-12-08 Domain: Method Development, Research Innovation