Search Results: benchmarking

Found 105 Skills

autoresearch-skill

[Hyper] Optimize an existing Codex skill through baseline-first experiments, binary evals, optional guards, and one-mutation-at-a-time iteration. Use for skill autoresearch, measured trigger/workflow improvement, self-optimizing a skill, benchmarking skill changes, or resuming skill experiment artifacts.

1 script / Checked

Tools & Utilities | nimbleway/agent-skills

talent-sourcing

Finds qualified candidates for a role by searching LinkedIn, Indeed, GitHub, and other professional platforms using Nimble Web Search Agents. Accepts a job description, role title, or freeform request and returns a ranked candidate list with profiles, skills, and contact signals. Use this skill when the user wants to find, source, or recruit candidates for a role. Common triggers: "find candidates for", "source engineers in", "who can I hire for", "find me a [role]", "recruiting for", "talent search", "find a [role] in [city]", "build a candidate list", "sourcing for [role]", "who's available for", "find potential hires". Also triggers on a pasted job description followed by a sourcing request. Do NOT use for job market research or salary benchmarking — use market-finder instead. Do NOT use for researching a single known person — use company-deep-dive or meeting-prep instead.

AI & Machine Learning | tristanmanchester/agent-s...

jax-development

Use this skill when the user is writing, debugging, profiling, refactoring, reviewing, benchmarking, parallelising, exporting, or explaining JAX code, or when they mention JAX, jax.numpy, jit, grad, value_and_grad, vmap, scan, lax, random keys, pytrees, jax.Array, sharding, Mesh, PartitionSpec, NamedSharding, pmap, shard_map, Pallas, XLA, StableHLO, checkify, profiler, or the JAX repo. It helps turn NumPy or PyTorch-style code into pure functional JAX, fix tracer/control-flow/shape/PRNG bugs, remove recompiles and host-device syncs, choose transforms and sharding strategies, inspect jaxpr/lowering/IR, and benchmark compiled code correctly.

25 scripts / Attention

Tools & Utilities | manutej/luxor-claude-mark...

performance-benchmark-specialist

Performance benchmarking expertise for shell tools, covering benchmark design, statistical analysis (min/max/mean/median/stddev), performance targets (<100 ms, >90% hit rate), workspace generation, and comprehensive reporting.

Backend Development | huiali/rust-skills

rust-performance

Performance optimization expert covering profiling, benchmarking, memory allocation, SIMD, cache optimization, false sharing, lock contention, and NUMA-aware programming.

Testing & QA | sentenz/skills

cpp-benchmark-testing

Automates benchmark test creation for C++ projects using Google Benchmark with consistent software testing patterns. Use when creating performance benchmarks, profiling tests, or when the user mentions benchmarking, Google Benchmark, or performance testing.

AI & Machine Learning | rysweet/amplihack

eval-recipes-runner

Run Microsoft's eval-recipes benchmarks to validate amplihack improvements against baseline agents. Auto-activates when testing improvements, running evals, or benchmarking changes.

Testing & QA | yonatangross/orchestkit

testing-perf

Performance and load testing patterns — k6 load tests, Locust stress tests, pytest execution optimization (xdist parallel, plugins), test type classification, and performance benchmarking. Use when writing load tests, optimizing test execution speed, or setting up pytest infrastructure.

AI & Machine Learning | akillness/oh-my-skills

skill-autoresearch

Autonomously optimize an existing AI skill by running it repeatedly against binary evals, mutating one instruction at a time, and keeping only changes that improve pass rate. Based on Karpathy-style autoresearch, but applied to SKILL.md iteration instead of ML training. Use when optimizing a skill, benchmarking prompt quality, building evals for a skill, or running self-improvement loops on reusable agent instructions. Triggers on: skill-autoresearch, optimize this skill, improve this skill, benchmark this skill, eval my skill, run autoresearch on this skill, self-improve skill.

AI & Machine Learning | ascend/agent-skills

hccl-test

HCCL (Huawei Collective Communication Library) performance testing for Ascend NPU clusters. Use for testing distributed communication bandwidth, verifying HCCL functionality, and benchmarking collective operations like AllReduce, AllGather. Covers MPI installation, multi-node pre-flight checks (SSH/CANN version/NPU health), and production testing workflows.

5 scripts / Attention

Marketing & Growth | sales-skills/sales

sales-plutoba

PlutoBa platform help — AI influencer vetting and creator due diligence across TikTok, Instagram, and YouTube. Covers PlutoBa Score (7-dimension assessment), Deep Assessments (100+ posts, 300+ comments), fake follower detection, audience authenticity, brand safety risk scoring, rate benchmarking, AI-powered creator outreach, creator CRM, and campaign briefs. Use when worried an influencer's followers are fake, need to check if a creator is brand-safe before signing a deal, want to know what to pay an influencer, PlutoBa Score seems too low or too high, creator outreach templates aren't getting responses, unsure which PlutoBa plan fits your needs, or setting up PlutoBa for an agency with multiple brands. Do NOT use for influencer strategy across platforms (use /sales-influencer-marketing) or influencer discovery and search (use /sales-hypeauditor or /sales-modash).

AI & Machine Learning | getcompanion-ai/feynman

autoresearch

Autonomous experiment loop that tries ideas, measures results, keeps what works, and discards what doesn't. Use when the user asks to optimize a metric, run an experiment loop, improve performance iteratively, or automate benchmarking.
