# AI Engineer

Builds production AI/ML systems — model training, fine-tuning, MLOps pipelines, model serving, evaluation frameworks, RAG optimization, and agent orchestration at scale. Use when the user asks to build, train, or deploy ML models, set up MLOps pipelines, optimize RAG systems, create inference endpoints, or design production AI agents.
## Installation

```shell
npx skill4agent add buiphucminhtam/forgewright ai-engineer
```

The skill loads its shared protocols and configuration if present, falling back to defaults otherwise:

```shell
cat skills/_shared/protocols/ux-protocol.md 2>/dev/null || true
cat skills/_shared/protocols/input-validation.md 2>/dev/null || true
cat .production-grade.yaml 2>/dev/null || echo "No config — using defaults"
```

## Inputs

| Input | Status | What AI Engineer Needs |
|---|---|---|
| Model/AI requirement from PM or user | Critical | What the AI system should do |
| Data Scientist architecture decisions | Degraded | Model selection, RAG design |
| Prompt Engineer prompts | Degraded | Prompt templates to deploy |
| Existing codebase / infra | Optional | Integration constraints |
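As a sketch of how the input contract above might be enforced — the file paths, the `INPUTS` map, and the `check_inputs` helper are all hypothetical illustrations, not part of the skill's actual API:

```python
from pathlib import Path

# Hypothetical mapping of the inputs above: relative path -> status.
INPUTS = {
    "requirements.md": "critical",   # Model/AI requirement from PM or user
    "architecture.md": "degraded",   # Data Scientist architecture decisions
    "prompts": "degraded",           # Prompt Engineer prompt templates
    "infra": "optional",             # Existing codebase / infra
}

def check_inputs(root: str) -> list[str]:
    """Fail fast on a missing critical input; warn and continue on the rest."""
    warnings = []
    for rel, status in INPUTS.items():
        if (Path(root) / rel).exists():
            continue
        if status == "critical":
            raise FileNotFoundError(f"critical input missing: {rel}")
        warnings.append(f"{status} input missing: {rel} (continuing with defaults)")
    return warnings
```

The point of the three-tier status is that only critical inputs block the run; degraded and optional inputs produce warnings and the skill proceeds with defaults.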
## Pipeline

```
Data → Preprocessing → Training/Fine-tuning → Evaluation → Registry → Serving → Monitoring
  ↑                                                                               │
  └────────────────────────────── Feedback Loop ──────────────────────────────────┘
```

Model access goes through a provider abstraction so backends can be swapped without code changes:

```python
# Example: LiteLLM provider abstraction
from litellm import completion

response = completion(model="gpt-4", messages=[{"role": "user", "content": "Hello"}])
# Swap to: model="claude-3-opus" — zero code changes.
```

## Output Structure

```
forgewright/ai-engineer/
├── model-selection.md     # Model benchmarks and selection rationale
├── architecture.md        # AI system architecture
├── rag-pipeline.md        # RAG design (if applicable)
├── evaluation/
│   ├── eval-suite.md      # Evaluation framework design
│   ├── test-cases/        # Test case datasets
│   └── results/           # Benchmark results
├── mlops/
│   ├── pipeline.md        # Training/deployment pipeline
│   ├── monitoring.md      # Production monitoring setup
│   └── cost-analysis.md   # Cost tracking and optimization
└── integration.md         # API contracts and integration guide
```
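The feedback loop in the pipeline diagram can be sketched as composable stages. This is a deliberately toy illustration — every function name and the set-based "model" are assumptions, not the skill's implementation — but it shows how monitoring output feeds the next training round:

```python
# Toy sketch of: Data -> Preprocessing -> Training -> Evaluation -> Serving -> Monitoring -> feedback.
def preprocess(data):
    # Normalize raw examples.
    return [x.strip().lower() for x in data]

def train(examples):
    # Stand-in for training/fine-tuning: the "model" is just its vocabulary.
    return {"vocab": set(examples)}

def evaluate(model, examples):
    # Fraction of examples the model covers.
    return sum(1 for x in examples if x in model["vocab"]) / len(examples)

def serve(model, query):
    return query.strip().lower() in model["vocab"]

def monitor_and_feedback(model, queries):
    # Monitoring: queries the model misses become training data for the next round.
    return [q for q in queries if not serve(model, q)]

data = ["Cat ", "dog"]
model = train(preprocess(data))
assert evaluate(model, preprocess(data)) == 1.0
new_data = monitor_and_feedback(model, ["dog", "bird"])  # "bird" was missed in production
model = train(preprocess(data + new_data))               # retrain with the feedback
```

In a real system the same loop holds, with the toy stages replaced by data pipelines, training jobs, a model registry, serving endpoints, and drift monitoring that files new data back into preprocessing.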