Search Results: llm

Found 1,066 Skills

repomix

Repository packaging for AI/LLM analysis. Capabilities: pack repos into single files, generate AI-friendly context, codebase snapshots, security audit prep, filter/exclude patterns, token counting, multiple output formats. Actions: pack, generate, export, analyze repositories for LLMs. Keywords: Repomix, repository packaging, LLM context, AI analysis, codebase snapshot, Claude context, ChatGPT context, Gemini context, code packaging, token count, file filtering, security audit, third-party library analysis, context window, single file output. Use when: packaging codebases for AI, generating LLM context, creating codebase snapshots, analyzing third-party libraries, preparing security audits, feeding repos to Claude/ChatGPT/Gemini.

Marketing & Growth · lisbeth718/pseo-skills

pseo-llm-visibility

Optimize programmatic SEO pages for visibility and citation in AI-generated answers from ChatGPT, Perplexity, Google AI Overviews, and other LLM-powered search. Use when optimizing for LLM citation, implementing llms.txt, configuring AI crawler access, structuring content for AI extraction, or when the user asks about generative engine optimization (GEO), AI search visibility, or getting cited by AI.

AI & Machine Learning · personamanagmentlayer/pcl

ai-engineer-expert

Expert-level AI implementation, deployment, LLM integration, and production AI systems

AI & Machine Learning · orchestra-research/ai-res...

axolotl

Expert guidance for fine-tuning LLMs with Axolotl - YAML configs, 100+ models, LoRA/QLoRA, DPO/KTO/ORPO/GRPO, multimodal support

AI & Machine Learning · orchestra-research/ai-res...

quantizing-models-bitsandbytes

Quantizes LLMs to 8-bit or 4-bit for 50-75% memory reduction with minimal accuracy loss. Use when GPU memory is limited, when you need to fit larger models, or when you want faster inference. Supports INT8, NF4, and FP4 formats, QLoRA training, and 8-bit optimizers. Works with HuggingFace Transformers.
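
As a rough illustration of where the 50-75% figure comes from, the sketch below compares weight-only memory at 16, 8, and 4 bits per parameter. It assumes an FP16 baseline and a hypothetical 7B-parameter model, and it ignores activations, the KV cache, and quantization metadata, so real savings will differ somewhat. The function names are illustrative and are not part of bitsandbytes.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights only, in GB (1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


def reduction_vs_fp16(bits_per_param: int) -> float:
    """Fractional saving relative to 16-bit weights (assumed baseline)."""
    return 1 - bits_per_param / 16


if __name__ == "__main__":
    params = 7e9  # hypothetical 7B-parameter model
    for name, bits in [("FP16", 16), ("INT8", 8), ("NF4/FP4", 4)]:
        size = weight_memory_gb(params, bits)
        saving = reduction_vs_fp16(bits)
        print(f"{name}: {size:.1f} GB weights, {saving:.0%} smaller than FP16")
```

Halving the bits per parameter halves the weight footprint, which is why 8-bit gives roughly 50% and 4-bit roughly 75% relative to 16-bit weights.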

AI & Machine Learning · orchestra-research/ai-res...

langsmith-observability

LLM observability platform for tracing, evaluation, and monitoring. Use when debugging LLM applications, evaluating model outputs against datasets, monitoring production systems, or building systematic testing pipelines for AI applications.

AI & Machine Learning · romiluz13/cc10x

plan-review-gate

Inline adversarial plan review — 3 sequential checks (Feasibility, Completeness, Scope & Alignment) performed by the calling LLM in its own context. No subagents spawned. Call after saving a plan. Returns GATE_PASS or GATE_FAIL with blocking issues.

AI & Machine Learning · hamelsmu/evals-skills

write-judge-prompt

Design LLM-as-Judge evaluators for subjective criteria that code-based checks cannot handle. Use when a failure mode requires interpretation (tone, faithfulness, relevance, completeness). Do NOT use when the failure mode can be checked with code (regex, schema validation, execution tests). Do NOT use when you need to validate or calibrate the judge — use validate-evaluator instead.

AI & Machine Learning · hamelsmu/evals-skills

build-review-interface

Build a custom browser-based annotation interface tailored to your data for reviewing LLM traces and collecting structured feedback. Use when you need to build an annotation tool, review traces, or collect human labels.

AI & Machine Learning · hamelsmu/evals-skills

validate-evaluator

Calibrate an LLM judge against human labels using data splits, TPR/TNR, and bias correction. Use after writing a judge prompt (write-judge-prompt) when you need to verify alignment before trusting its outputs. Do NOT use for code-based evaluators (those are deterministic; test with standard unit tests).

AI & Machine Learning · hamelsmu/evals-skills

generate-synthetic-data

Create diverse synthetic test inputs for LLM pipeline evaluation using dimension-based tuple generation. Use when bootstrapping an eval dataset, when real user data is sparse, or when stress-testing specific failure hypotheses. Do NOT use when you already have 100+ representative real traces (use stratified sampling instead), or when the task is collecting production logs.

AI & Machine Learning · hamelsmu/evals-skills

eval-audit

Audit an LLM eval pipeline and surface problems: missing error analysis, unvalidated judges, vanity metrics, etc. Use when inheriting an eval system, when unsure whether evals are trustworthy, or as a starting point when no eval infrastructure exists. Do NOT use when the goal is to build a new evaluator from scratch (use error-analysis, write-judge-prompt, or validate-evaluator instead).
