Search Results: metrics

Found 802 Skills

dora-core

DORA (DevOps Research and Assessment) Core Model for measuring and improving software delivery performance. Use this skill to assess team performance tier, identify capability gaps, and connect delivery metrics to product release strategy.

🇺🇸|EnglishTranslated

Testing & QAprobabl-ai/skills

smoke-test-ml-pipeline

Owns the smoke test contract for an ML experiment: a small, diagnostic-by-construction pytest that fits the experiment's learner on a portion of the real `data/` source and predicts on a *disjoint* portion that deliberately carries **no pre-history buffer**. The assertion is structural — the number of predictions must equal the number of rows in the predict grid. A pipeline that loads-then-features-then-splits will silently drop the cold-start rows of the predict slice and the test will fail with a row-count mismatch; a pipeline that marks X early and references upstream history nodes from feature steps will pass trivially. The smoke test is the executable proof of the X-marker placement rule from `build-ml-pipeline`. TRIGGER when: `test-ml-pipeline` has dispatched here to write the smoke test for an approved experiment; `pytest tests/smoke/` is failing on row count; the user asks "why is the smoke test failing?"; a pipeline edit in `build-ml-pipeline` needs an executable proof; an experiment script changes the pipeline shape and the matching smoke test needs revisiting. SKIP when: the design note does not exist or is not yet approved (route to `iterate-ml-experiment`); the user is asking about a regression test or schema invariant (route to `regression-test-ml-pipeline` / `distribution-test-ml-pipeline` once those exist); the question is the *interpretation* of CV metrics, not predict-time correctness (route to `evaluate-ml-pipeline`). HOW TO USE: read the matching experiment's `journal/NN_*.md` and `experiments/NN_*.py` first to understand the pipeline's source binding (what env-dict keys does `build_learner` expect?). Then construct two env-dicts from the **real `data/` source** — a train env and a predict env — such that the predict env carries *only the rows we want predictions for* and *no pre-history buffer*. The hard assertion is that the prediction count matches the predict-env row count exactly. The soft assertion is that the smoke set's MAE is within `3 × CV_mean` (or the task-appropriate analogue). **Do not write the design note or run CV — that's other skills' job.**

🇺🇸|EnglishTranslated

Testing & QAdavila7/claude-code-templ...

senior-qa

Comprehensive QA and testing skill for quality assurance, test automation, and testing strategies for ReactJS, NextJS, NodeJS applications. Includes test suite generation, coverage analysis, E2E testing setup, and quality metrics. Use when designing test strategies, writing test cases, implementing test automation, performing manual testing, or analyzing test coverage.

🇺🇸|EnglishTranslated

3 scripts/Checked

AI & Machine Learningdavila7/claude-code-templ...

tensorboard

Visualize training metrics, debug models with histograms, compare experiments, visualize model graphs, and profile performance with TensorBoard - Google's ML visualization toolkit

🇺🇸|EnglishTranslated

Data Processingsecondsky/claude-skills

kpi-dashboard-design

Designs effective KPI dashboards with proper metric selection, visual hierarchy, and data visualization best practices. Use when building executive dashboards, creating analytics views, or presenting business metrics.

🇺🇸|EnglishTranslated

AI & Machine Learninganthropics/knowledge-work...

data-context-extractor

Generate or improve a company-specific data analysis skill by extracting tribal knowledge from analysts. BOOTSTRAP MODE - Triggers: "Create a data context skill", "Set up data analysis for our warehouse", "Help me create a skill for our database", "Generate a data skill for [company]" → Discovers schemas, asks key questions, generates initial skill with reference files ITERATION MODE - Triggers: "Add context about [domain]", "The skill needs more info about [topic]", "Update the data skill with [metrics/tables/terminology]", "Improve the [domain] reference" → Loads existing skill, asks targeted questions, appends/updates reference files Use when data analysts want Claude to understand their company's specific data warehouse, terminology, metrics definitions, and common query patterns.

🇺🇸|EnglishTranslated

1 scripts/Checked

Tools & Utilitiesrevenuecat/revenuecat-ski...

revenuecat

RevenueCat metrics, customer data, and documentation search. Use when querying subscription analytics, MRR, churn, customers, or RevenueCat docs.

🇺🇸|EnglishTranslated

1 scripts/Checked

AI & Machine Learningterrylica/cc-skills

mlflow-python

MLflow experiment tracking via Python API. TRIGGERS - MLflow metrics, log backtest, experiment tracking, search runs.

🇺🇸|EnglishTranslated

4 scripts/Checked

AI & Machine Learning404kidwiz/claude-supercod...

performance-monitor

Expert in observing, benchmarking, and optimizing AI agents. Specializes in token usage tracking, latency analysis, and quality evaluation metrics. Use when optimizing agent costs, measuring performance, or implementing evals. Triggers include "agent performance", "token usage", "latency optimization", "eval", "agent metrics", "cost optimization", "agent benchmarking".

🇺🇸|EnglishTranslated

Marketing & Growthanthropics/knowledge-work...

campaign-planning

Plan marketing campaigns with objectives, audience segmentation, channel strategy, content calendars, and success metrics. Use when launching a campaign, planning a product launch, building a content calendar, allocating budget across channels, or defining campaign KPIs.

🇺🇸|EnglishTranslated

Data Processingletta-ai/skills

portfolio-optimization

Guidance for implementing high-performance portfolio optimization using Python C extensions. This skill applies when tasks require optimizing financial computations (matrix operations, covariance calculations, portfolio risk metrics) by implementing C extensions for Python. Use when performance speedup requirements exist (e.g., 1.2x or greater) and the task involves numerical computations on large datasets (thousands of assets).

🇺🇸|EnglishTranslated

Data Processingdelphine-l/claude_global

bioinformatics-fundamentals

Core bioinformatics concepts including SAM/BAM format, AGP genome assembly format, sequencing technologies (Hi-C, HiFi, Illumina), quality metrics, and common data processing patterns. Essential for debugging alignment, filtering, pairing issues, and AGP coordinate validation.

🇺🇸|EnglishTranslated