Found 1,142 Skills
Evaluate skill quality against best practices. Use when asked to "rate this skill", "review skill quality", "check skill formatting", "is this skill good", "evaluate SKILL.md", "grade this skill", or when validating skill files before publishing.
Core philosophy for designing Claude Code skills - when to use skills vs agents, the knowledge test, and what makes skills valuable. Use when deciding component type or evaluating skill quality.
Detailed report for individual stocks. Generate a financial analysis report by specifying a ticker symbol; the report covers valuation, an undervaluation judgment, and the shareholder return ratio (dividends + share repurchases).
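The shareholder return ratio mentioned in this entry is conventionally total payouts over net income. A minimal sketch of that arithmetic, assuming that definition and entirely made-up figures:

```python
def shareholder_return_ratio(dividends: float, buybacks: float, net_income: float) -> float:
    """Total payout ratio: (dividends + share repurchases) / net income."""
    return (dividends + buybacks) / net_income

# Hypothetical figures, in millions of the reporting currency.
ratio = shareholder_return_ratio(dividends=400, buybacks=250, net_income=1_000)
print(f"{ratio:.0%}")  # 65% of net income returned to shareholders
```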
Repeatable execution process for producing clear explanations. Covers Subject and Situational frameworks, depth scaling, and relatability tools.
Build and run LLM-as-judge evaluation pipelines using Amazon Bedrock Evaluation Jobs with pre-computed inference datasets. Use when setting up automated model evaluation, designing test scenarios, collecting pre-computed responses, configuring custom metrics, creating AWS infrastructure, running evaluation jobs, parsing results, and iterating on findings.
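For the pre-computed (bring-your-own-inference) datasets this entry mentions, Bedrock evaluation jobs consume JSONL records. A minimal sketch of writing one record, with field names based on the bring-your-own-inference dataset format as documented at the time of writing and all values hypothetical:

```python
import json

# One JSONL record for a Bedrock LLM-as-judge evaluation job with
# pre-computed responses. Values below are invented for illustration.
record = {
    "prompt": "Summarize the refund policy in two sentences.",
    "referenceResponse": "Refunds are issued within 30 days of purchase...",
    "modelResponses": [
        {
            "response": "Customers may request a refund within 30 days...",
            "modelIdentifier": "my-pipeline-v2",  # hypothetical pipeline label
        }
    ],
}

with open("eval_dataset.jsonl", "a") as f:
    f.write(json.dumps(record) + "\n")
```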
Designs and writes high-quality Agent Skills (SKILL.md + optional reference files/scripts). Use when asked to create a new Skill, rewrite an existing Skill, improve Skill structure/metadata, or generate templates/evaluations for Skills.
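Since this skill centers on SKILL.md structure, a minimal sketch of a metadata check is shown below, assuming PyYAML is available and assuming name and description are the required frontmatter fields:

```python
import yaml  # pip install pyyaml

def read_frontmatter(skill_md: str) -> dict:
    """Extract the YAML frontmatter between the leading '---' fences."""
    _, frontmatter, _ = skill_md.split("---", 2)
    return yaml.safe_load(frontmatter)

def check_skill(skill_md: str) -> list[str]:
    """Return missing metadata fields; assumes name and description are required."""
    meta = read_frontmatter(skill_md)
    return [field for field in ("name", "description") if not meta.get(field)]
```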
Testing framework for evaluating Databricks skills. Use when building test cases for skills, running skill evaluations, comparing skill versions, or creating ground truth datasets with the Generate-Review-Promote (GRP) pipeline. Triggers include "test skill", "evaluate skill", "skill regression", "ground truth", "GRP pipeline", "skill quality", and "skill metrics".
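The ground-truth cases flowing through the GRP stages this entry describes could be represented as simple records. A hypothetical sketch, with every field name invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class GroundTruthCase:
    """Hypothetical shape for one GRP-pipeline test case."""
    prompt: str    # input handed to the skill under test
    expected: str  # reviewed reference answer
    status: str    # lifecycle: "generated" -> "reviewed" -> "promoted"

case = GroundTruthCase(
    prompt="List the tables in the sales catalog.",
    expected="sales.orders, sales.customers, sales.refunds",
    status="reviewed",
)
```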
Write titles for blog posts, deep dives, and hub articles using 15 proven formulas and a 10 Commandments evaluation. Generate 10+ options, then select the best through systematic criteria.
Build, validate, and deploy LLM-as-Judge evaluators for automated quality assessment of LLM pipeline outputs. Use this skill whenever the user wants to: create an automated evaluator for subjective or nuanced failure modes, write a judge prompt for Pass/Fail assessment, split labeled data for judge development, measure judge alignment (TPR/TNR), estimate true success rates with bias correction, or set up CI evaluation pipelines. Also trigger when the user mentions "judge prompt", "automated eval", "LLM evaluator", "grading prompt", "alignment metrics", "true positive rate", or wants to move from manual trace review to automated evaluation. This skill covers the full lifecycle: prompt design → data splitting → iterative refinement → success rate estimation.
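For the alignment and bias-correction steps this entry lists, a minimal sketch of the standard arithmetic: TPR/TNR measured on labeled pass/fail data, then a plug-in (Rogan-Gladen-style) correction of the judge's observed pass rate. Variable names are illustrative, and the sketch assumes both classes appear in the labeled split:

```python
def alignment(labels: list[bool], judgments: list[bool]) -> tuple[float, float]:
    """TPR = P(judge says pass | truly pass); TNR = P(judge says fail | truly fail)."""
    tp = sum(l and j for l, j in zip(labels, judgments))
    tn = sum(not l and not j for l, j in zip(labels, judgments))
    pos = sum(labels)
    neg = len(labels) - pos
    return tp / pos, tn / neg

def corrected_success_rate(observed_pass_rate: float, tpr: float, tnr: float) -> float:
    """Invert observed = p*TPR + (1 - p)*(1 - TNR) to estimate the true rate p."""
    return (observed_pass_rate + tnr - 1) / (tpr + tnr - 1)

tpr, tnr = 0.92, 0.85  # hypothetical values measured on a held-out labeled split
print(corrected_success_rate(0.80, tpr, tnr))  # ~0.84 estimated true pass rate
```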
Score assistant responses for clarity on a strict 1-5 scale, then return strict JSON only with score, rationale, and improvement suggestions. Use when the user asks to evaluate clarity, grade clarity, or critique clarity quality.
Score assistant responses for relevance on a strict 1-5 scale, then return strict JSON only with score, rationale, and improvement suggestions. Use when the user asks to evaluate relevance, grade relevance, or critique topical alignment.
Score assistant responses for guidance & actionability on a strict 1-5 scale, then return strict JSON only with dimension, score, rationale, and improvement suggestions. Use when the user asks to evaluate how actionable, helpful, or step-by-step a response is.
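The three scorers above all promise strict JSON output. A hypothetical example of the shape such output might take, with keys inferred from the descriptions rather than taken from a published schema:

```python
import json

# Hypothetical output for the clarity scorer; keys mirror the description
# (score, rationale, improvement suggestions) and are illustrative only.
judgment = {
    "dimension": "clarity",
    "score": 4,
    "rationale": "Well structured, but the second step buries the key command.",
    "improvements": ["Lead with the command", "Split the run-on third sentence"],
}
print(json.dumps(judgment, indent=2))
```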