Found 1,747 Skills
Run cross-framework agent comparisons using evaluatorq from orqkit — compares any combination of agents (orq.ai, LangGraph, CrewAI, OpenAI Agents SDK, Vercel AI SDK) head-to-head on the same dataset with LLM-as-a-judge scoring. Use when comparing agents, benchmarking, or wanting side-by-side evaluation. Do NOT use when comparing only orq.ai configurations with no external agents (use run-experiment instead).
Verified corrections for IAM behaviors that AI agents frequently get wrong — policy evaluation edge cases, trust policy gotchas, STS session limits, Organizations quirks, and SAML/MFA specifics. Use alongside documentation when working with IAM roles, policies, STS, or Organizations. Do NOT use for non-IAM authorization like Cognito user-pool policies or app-level RBAC.
Fast DataFrame library (Apache Arrow). Covers select, filter, group_by, joins, lazy evaluation, CSV/Parquet I/O, and the expression API for high-performance data analysis workflows.
Comprehensive quality auditing and evaluation of tools, frameworks, and systems against industry best practices, with detailed scoring across 12 critical dimensions.
Analyzes events through a computer science lens, using computational complexity, algorithms, data structures, systems architecture, information theory, and software engineering principles to evaluate feasibility, scalability, and security. Provides insights on algorithmic efficiency, system design, computational limits, data management, and technical trade-offs. Use when: Technology evaluation, system architecture, algorithm design, scalability analysis, security assessment. Evaluates: Computational complexity, algorithmic efficiency, system architecture, scalability, data integrity, security.
Evaluate UI/flows from cognitive load, error prevention, and accessibility perspectives. Apply when reviewing UX, discussing user confusion, high drop-off, or form usability issues.
Analyzes events through a journalistic lens, using the 5 Ws and H, investigative methods, source evaluation, fact-checking, newsworthiness criteria, and ethical journalism principles. Provides insights on story angles, information gaps, credibility, public interest, and media framing. Use when: Breaking news, information verification, source analysis, story development, media criticism. Evaluates: Factual accuracy, source credibility, completeness, newsworthiness, bias, public interest.
Build and run evaluators for AI/LLM applications using Phoenix.
Retrieve historical market capitalization data for any stock using Octagon MCP. Use when tracking market cap changes over time, analyzing valuation trends, identifying peak and trough valuations, and comparing historical size classifications.
Comprehensive market analyst skill that orchestrates all Octagon stock performance and market data skills. Use when conducting stock analysis, creating market reports, evaluating valuations, comparing sectors, or performing technical and sentiment analysis.
Retrieve stock price change statistics across multiple time periods using Octagon MCP. Use when analyzing short-term and long-term returns, comparing performance across timeframes, and evaluating momentum and historical growth.
Create an AI Evals Pack (eval PRD, test set, rubric, judge plan, results + iteration loop). Use for LLM evaluation, benchmarks, rubrics, error analysis/open coding, and ship/no-ship quality gates for AI features.