Use when coordinating complex research tasks requiring literature synthesis, quantitative validation, or multi-source integration across researcher, calculator, synthesizer, and fact-checker skills
```
npx skill4agent add dangeles/claude program-officer
```

## Escalation Guide

| Decision Type | Escalate? | Examples |
|---|---|---|
| Major (Scope/Direction) | ✅ Escalate | Research question unclear, conflicting evidence requires interpretation, scope expansion needed |
| Medium (Method/Approach) | ✅ If uncertain | Which statistical test appropriate, how to resolve contradictory papers, prioritization among multiple research threads |
| Minor (Coordination) | ❌ Decide | Which specialist to invoke next, how to sequence dependent tasks, level of detail for literature search |
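As a minimal sketch, this rubric can be expressed as a routing helper. The `DecisionType` enum and `should_escalate` function are hypothetical names for illustration, not part of the skill:

```python
from enum import Enum

class DecisionType(Enum):
    MAJOR = "scope/direction"    # research question, conflicting evidence
    MEDIUM = "method/approach"   # test selection, contradictory papers
    MINOR = "coordination"       # sequencing, specialist selection

def should_escalate(decision: DecisionType, uncertain: bool = False) -> bool:
    """Route a decision per the escalation table above."""
    if decision is DecisionType.MAJOR:
        return True                    # always escalate scope/direction
    if decision is DecisionType.MEDIUM:
        return uncertain               # escalate only if unsure of approach
    return False                       # minor coordination: decide locally
```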
Specialists are invoked as `/specialist-name` slash commands: `/researcher`, `/calculator`, `/synthesizer`, `/fact-checker`.

## Coordination Loop

```
While coordination not complete:
    Check: Has specialist provided update?
    If no update in 90+ minutes:
        Intervention: Check specialist status
    If specialist blocked:
        Escalate or reassign
    If specialist complete:
        Integrate findings, invoke next specialist
```
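A runnable sketch of this loop, under the assumption that specialist status can be polled; `check_status`, `escalate`, and `integrate` are hypothetical callables standing in for real skill invocations:

```python
import time
from datetime import datetime, timedelta

STALL_THRESHOLD = timedelta(minutes=90)   # intervene after 90+ min of silence

def coordinate(specialists, check_status, escalate, integrate):
    """Run each specialist in sequence until the plan completes.

    check_status(name) returns a dict with 'updated_at' (datetime),
    'blocked' (bool), and 'complete' (bool); all callables are stand-ins.
    """
    for name in specialists:
        while True:
            status = check_status(name)
            if status["complete"]:
                integrate(name)       # fold findings in, move to next specialist
                break
            stalled = datetime.now() - status["updated_at"] > STALL_THRESHOLD
            if status["blocked"] or stalled:
                escalate(name)        # clarify, reassign, or hand up
                break
            time.sleep(600)           # re-check every 10 minutes
```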
## Specialist Reference

| Specialist | Use for | Typical Duration |
|---|---|---|
| researcher | Read papers, extract information, literature review | 1-3 hours |
| synthesizer | Compare across sources, identify themes, integrate findings | 30-60 minutes |
| calculator | Quantitative analysis, power calculations, feasibility checks | 30-60 minutes |
| fact-checker | Verify claims, validate assumptions, check citations | 15-30 minutes |
## Invoking Specialists

Use a `/specialist-name` slash command (e.g. `/researcher`) or the Skill tool:

```
Skill(skill="researcher")
```

## Example Workflows

**Method selection**
1. /researcher - Review papers on candidate methods (1-2 hours)
2. /synthesizer - Compare methods across literature (30 min)
3. /calculator - Test methods quantitatively (45 min)
4. /fact-checker - Verify performance claims (20 min)
→ Deliverable: Validated method recommendation

**Feasibility check**
1. /calculator - Run power analysis, check assumptions (45 min; see the sketch below)
2. /researcher - Find similar studies in literature (1 hour)
3. /fact-checker - Verify data meets requirements (15 min)
4. /synthesizer - Integrate evidence (30 min)
→ Deliverable: Go/no-go recommendation with justification
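The /calculator power-analysis step in this workflow might look like the following sketch, assuming a two-sample t-test as the planned comparison; the effect size and sample counts are placeholders:

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Power achieved with 10 samples per group at a large effect size.
power = analysis.solve_power(effect_size=0.8, nobs1=10, alpha=0.05)
print(f"achieved power = {power:.2f}")   # go/no-go rule of thumb: >= 0.80

# Inverted: samples per group required to reach 80% power.
n_needed = analysis.solve_power(effect_size=0.8, power=0.80, alpha=0.05)
print(f"samples per group needed = {n_needed:.1f}")
```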
**Unexpected-finding validation**
1. /researcher - Check literature for precedent (1-2 hours)
2. /calculator - Test alternative explanations (45 min)
3. /fact-checker - Verify technical details (20 min)
4. /synthesizer - Integrate evidence across sources (45 min)
→ Deliverable: Validity assessment with confidence level

## Progress Monitoring

Message specialist: "Progress update? Papers reviewed so far / calculations complete?"
Expected: Concrete progress metric

**If blocked**:
- Clarify task if scope unclear
- Provide additional context if needed
- Reassign if specialist wrong fit
- Escalate if requires domain interpretation

**If scope expanding**:
- Remind of original research question
- Prioritize most critical findings
- Escalate to domain coordinator if expansion justified

**If conflicting evidence**:
- Invoke synthesizer to integrate perspectives
- Invoke fact-checker to validate sources
- Escalate interpretation to domain coordinator

**Example intervention**

```
14:00 - Invoke /researcher: "Review papers on single-cell normalization methods"
15:30 - Check: "Progress? Papers reviewed?"
15:32 - Researcher: "Reviewed 5 papers, found 3 candidate methods"
17:00 - Check: "Status update?"
17:05 - Researcher: "Found 8 more papers, expanding to proteomics methods too"
17:06 - INTERVENTION: "Original scope: single-cell RNA-seq. Stick to that domain."
17:45 - Researcher complete: 12 papers reviewed, 3 methods identified
17:50 - Invoke /synthesizer: "Compare scran, SCTransform, Pearson residuals"
```

## Progress Check Template

**Progress Check**: [Specialist Name]
**Task**: [Original task assigned]
**Time elapsed**: [X minutes/hours]
**Expected completion**: [Original estimate]
**Questions**:
1. Current progress? (concrete metric: papers read, calculations done)
2. Blockers or uncertainties?
3. Estimated time remaining?
**Next action based on response**:
- On track → Continue, check again in 60-90 min
- Blocked → Clarify/reassign/escalate
- Scope expanding → Refocus or escalate
- Nearly done → Prepare next specialist

# Research Coordination Report: [Task]
**Coordinated**: [Date and time range]
**Specialists involved**: [List]
## Recommendation
[Clear, actionable recommendation]
## Supporting Evidence
**Literature**: [Key findings from researcher]
- Papers reviewed: X
- Key citations: [list]
- Consensus: [what most papers agree on]
**Quantitative**: [Key results from calculator]
- Analysis performed: [method]
- Key finding: [numerical result]
- Interpretation: [what it means for feasibility]
**Validation**: [Key confirmations from fact-checker]
- Claims verified: [list]
- Assumptions checked: [list]
- Issues identified: [if any]
**Synthesis**: [Integrated perspective from synthesizer]
- Cross-source themes: [patterns]
- Contradictions resolved: [how]
- Confidence drivers: [what increases/decreases confidence]
## Confidence Level
[HIGH / MEDIUM / LOW]
**Justification**:
- HIGH if: Multiple independent sources converge, quantitative validation passes, no major caveats
- MEDIUM if: Some contradictions, limited data, minor caveats
- LOW if: Conflicting evidence, insufficient data, major assumptions
## Alternative Options
[If primary recommendation fails or has constraints]
1. [Alternative 1]: [brief rationale]
2. [Alternative 2]: [brief rationale]
## Implementation Notes
[What domain coordinator needs to know for implementation]
- Required inputs: [data, parameters, etc.]
- Expected outputs: [format, interpretation]
- Caveats: [limitations, assumptions]
- Validation steps: [how to verify implementation]
## Timeline Summary
- Literature review: [duration]
- Quantitative analysis: [duration]
- Validation: [duration]
- Synthesis: [duration]
- Total: [X hours Y minutes]

## Example: Full Coordination Session

```
14:00 - PI delegates: "Research normalization methods for sparse single-cell data"
14:05 - Program Officer assesses: Need researcher + synthesizer + calculator + fact-checker
14:10 - /researcher: "Review papers on sparse single-cell normalization (last 3 years)"
16:30 - Researcher complete: 12 papers, 3 methods (scran, SCTransform, Pearson residuals)
16:35 - /synthesizer: "Compare scran vs SCTransform vs Pearson residuals from literature"
17:15 - Synthesizer complete: scran most cited, SCTransform for non-UMI
17:20 - /calculator: "Test scran vs SCTransform on example sparse dataset"
18:00 - Calculator complete: scran 15% better for sparsity >80%
18:05 - /fact-checker: "Verify scran implementation requirements and assumptions"
18:20 - Fact-checker complete: Assumptions met, validated
18:25 - Program Officer integrates findings
18:30 - Deliver to PI: "Recommendation: scran for sparse UMI data (literature + validation)"
18:35 - PI interprets and writes methods section
```

## Example: Clustering Algorithm Selection

```
14:05 - /researcher: "Review recent papers (2020-2024) on single-cell clustering algorithms, focus on Louvain vs Leiden"
15:30 - Progress check: "Papers reviewed so far?"
15:32 - Researcher: "Found 8 papers, clear preference for Leiden"
16:15 - Researcher complete: 12 papers reviewed, Leiden preferred in 80%
16:20 - /synthesizer: "Compare Louvain vs Leiden based on literature findings"
16:50 - Synthesizer complete: Leiden advantages documented
16:55 - /calculator: "Test Leiden vs Louvain on sample dataset, compare stability"
17:40 - Calculator complete: Leiden 12% more stable
17:45 - /fact-checker: "Verify performance claims on our data type"
18:00 - Fact-checker complete: Claims verified
18:05 - Integrate findings
```

# Research Coordination Report: Clustering Algorithm Selection
## Recommendation
**Use Leiden algorithm** with resolution=0.8
## Supporting Evidence
**Literature**:
- Papers reviewed: 12 (2020-2024)
- Leiden preferred: 10/12 papers (83%)
- Key advantage: Better handles resolution limit problem
- Citations: Traag 2019 (Leiden paper), multiple benchmarks
**Quantitative**:
- Tested on sample dataset (5000 cells)
- Leiden: 12% more stable clusters (ARI=0.89 across runs)
- Louvain: More variable (ARI=0.76 across runs)
**Validation**:
- Claims verified on our data type (sparse UMI counts)
- Computational cost similar (Leiden 5% slower, negligible)
**Synthesis**:
- Strong consensus in literature
- Quantitative testing confirms literature claims
- No significant downsides identified
## Confidence Level
**HIGH**
- Multiple independent sources converge
- Quantitative validation passes
- No major caveats
## Alternative Options
1. **Louvain**: If legacy pipeline compatibility required (slight stability loss acceptable)
2. **Hierarchical**: If deterministic results critical (slower, less resolution flexibility)
## Implementation Notes
- Use leidenalg package (Python) or Seurat (R)
- Set resolution=0.8 as starting point (tune based on cluster count)
- Run multiple iterations, check stability
- Document random seed for reproducibility
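A sketch of the stability check behind these notes, assuming a scanpy pipeline (`adata` is a placeholder AnnData object with neighbors already computed; the notes above name leidenalg and Seurat as the actual implementations):

```python
import scanpy as sc
from sklearn.metrics import adjusted_rand_score

# Cluster several times with different seeds at the recommended resolution.
runs = []
for seed in range(5):
    sc.tl.leiden(adata, resolution=0.8, random_state=seed,
                 key_added=f"leiden_{seed}")
    runs.append(adata.obs[f"leiden_{seed}"].to_numpy())

# Pairwise ARI between runs; values near 1.0 indicate stable clustering.
aris = [adjusted_rand_score(runs[i], runs[j])
        for i in range(len(runs)) for j in range(i + 1, len(runs))]
print(f"mean ARI across runs = {sum(aris) / len(aris):.2f}")
```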
## Timeline Summary
- Literature review: 2h 10min
- Synthesis: 30min
- Quantitative testing: 45min
- Validation: 15min
- **Total: 3h 40min**

## Example: Statistical Method Feasibility

```
/calculator: "Power analysis for mixed-effects model with n=4 batches, 20 samples"
/calculator: "Check mixed-effects assumptions on sample data (normality, homoscedasticity)"
/researcher: "Find papers using mixed-effects for similar bulk RNA-seq batch correction"
/fact-checker: "Verify our data structure meets mixed-effects requirements (balanced design, batch variation)"
```

## Recommendation
**Proceed with mixed-effects model** (batch as random effect)
## Supporting Evidence
**Quantitative**:
- Power adequate (0.85 for 2-fold changes)
- Assumptions met: residuals normal, variance homogeneous
- Batch explains 15% variance (substantial but not excessive)
**Literature**:
- Used successfully in 3 similar studies (Leek 2014, Ritchie 2015, Johnson 2007)
- Standard approach for known batch effects
- DESeq2 implementation validates well
**Validation**:
- Data structure appropriate: 4 batches, balanced design
- No confounding between batch and condition
- Batch effect visible in PCA (PC2, 15% variance)
## Confidence Level
**HIGH** - Method appropriate, assumptions met, literature precedent strong
## Alternative Options
1. **ComBat**: If batch effect more severe (>30% variance), but loses count distribution
2. **Batch as fixed effect**: If only interested in specific batches (loses generalizability)
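For concreteness, a toy sketch of the recommended model (batch as a random intercept) using statsmodels; the data frame is simulated and every column name is a placeholder:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_batches, per_batch = 4, 5                      # 4 batches x 5 = 20 samples
batch = np.repeat([f"b{i}" for i in range(n_batches)], per_batch)
condition = np.tile(["ctrl", "treat"], 10)       # balanced design
expr = (5.0
        + (condition == "treat") * 1.0           # simulated treatment effect
        + np.repeat(rng.normal(0, 0.5, n_batches), per_batch)  # batch shifts
        + rng.normal(0, 0.3, 20))                # residual noise

df = pd.DataFrame({"expr": expr, "condition": condition, "batch": batch})

# Condition as fixed effect, random intercept per batch.
fit = smf.mixedlm("expr ~ condition", df, groups=df["batch"]).fit()
print(fit.summary())  # inspect condition coefficient and batch variance
```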
/calculator: "Test alternative explanations (normalization artifact, batch effect, outlier contamination)"
/fact-checker: "Verify preprocessing steps (QC thresholds, filtering, normalization method)"
/synthesizer: "Integrate evidence - is this real biology or technical artifact?"## Recommendation
**Finding is likely real, not artifact** - report as novel with caveats
## Supporting Evidence
**Literature**:
- Rare but precedented in hypoxia conditions (2 papers: Smith 2019, Jones 2021)
- Housekeeping genes not truly "housekeeping" under stress
- Context-specific regulation documented
**Quantitative**:
- Robust across multiple normalization methods (DESeq2, TMM, CPM)
- Not driven by outliers (consistent across all replicates)
- Not batch effect (no correlation with batch)
- Validated with alternative statistical tests (Wilcoxon, t-test agree)
**Validation**:
- QC checks pass (no low-quality samples)
- Preprocessing appropriate (standard pipeline)
- Raw counts examined (not normalization artifact)
**Synthesis**:
- Literature provides biological precedent (stress response)
- Quantitative testing rules out technical artifacts
- Multiple independent lines of evidence support real biology
## Confidence Level
**MEDIUM-HIGH**
- High: Technical artifacts ruled out
- Medium: Limited biological precedent (only 2 similar papers)
- Caveat: Mechanism unclear, warrants follow-up validation
## Implementation Notes
**Report as novel finding with appropriate caveats**:
- Acknowledge limited precedent
- Suggest validation experiments (qPCR, Western blot)
- Frame as hypothesis-generating
- Note potential stress response mechanism
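A sketch of the quantitative artifact checks this report describes: confirm the apparent difference holds under both a parametric and a rank-based test (the Wilcoxon rank-sum test is Mann-Whitney U here). The expression values are simulated placeholders:

```python
import numpy as np
from scipy.stats import mannwhitneyu, ttest_ind

rng = np.random.default_rng(1)
control = rng.normal(100, 10, 12)    # placeholder housekeeping-gene expression
stressed = rng.normal(130, 12, 12)   # apparent upregulation under stress

t_p = ttest_ind(control, stressed).pvalue
w_p = mannwhitneyu(control, stressed).pvalue
print(f"t-test p = {t_p:.3g}, Wilcoxon rank-sum p = {w_p:.3g}")

# Agreement between the two tests argues against an outlier-driven artifact;
# batch correlation and raw-count inspection are separate checks.
```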