Deployment procedures and CI/CD pipeline configuration for Python/React projects. Use when deploying to staging or production, creating CI/CD pipelines with GitHub Actions, troubleshooting deployment failures, or planning rollbacks. Covers pipeline stages (build/test/staging/production), environment promotion, pre-deployment validation, health checks, canary deployment, rollback procedures, and GitHub Actions workflows. Does NOT cover Docker image building (use docker-best-practices) or incident response (use incident-response).
npx skill4agent add hieutrtr/ai1-skills deployment-pipeline

- `deployment-report.md`
- `docker-best-practices`
- `incident-response`
- `monitoring-setup`

┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────────┐
│  BUILD   │───>│   TEST   │───>│ STAGING  │───>│  PRODUCTION  │
│          │    │          │    │          │    │              │
│ • Lint   │    │ • Unit   │    │ • Deploy │    │ • Canary 10% │
│ • Build  │    │ • Integ  │    │ • Smoke  │    │ • Monitor    │
│ • Image  │    │ • E2E    │    │ • QA     │    │ • Full 100%  │
└──────────┘    └──────────┘    └──────────┘    └──────────────┘
Gate:           Gate:           Gate:           Gate:
Build pass      Tests pass      Smoke pass      Health checks
No lint err     Coverage ≥80%   Manual approve  Error rate <1%

- `ruff check`
- `ruff format --check`
- `eslint`
- `prettier --check`
- `mypy`
- `tsc --noEmit`

# GitHub Actions build stage
build:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - name: Lint Python
      run: ruff check src/ && ruff format --check src/
    - name: Type check Python
      run: mypy src/
    - name: Build backend image
      run: docker build -t app-backend:${{ github.sha }} -f Dockerfile.backend .
    - name: Build frontend
      run: npm ci && npm run build
    - name: Build frontend image
      run: docker build -t app-frontend:${{ github.sha }} -f Dockerfile.frontend .

- `pytest tests/unit/ -v --cov=src --cov-report=xml`
- `pytest tests/integration/ -v`
- `npm test -- --coverage`
- `npx playwright test`
- `pip-audit`
- `npm audit`

# GitHub Actions test stage
test:
  needs: build
  runs-on: ubuntu-latest
  services:
    postgres:
      image: postgres:16
      env:
        POSTGRES_DB: testdb
        POSTGRES_PASSWORD: testpass
      ports: ['5432:5432']
    redis:
      image: redis:7-alpine
      ports: ['6379:6379']
  steps:
    - uses: actions/checkout@v4
    - name: Run unit tests
      run: pytest tests/unit/ -v --cov=src --cov-report=xml
    - name: Run integration tests
      run: pytest tests/integration/ -v
      env:
        DATABASE_URL: postgresql://postgres:testpass@localhost:5432/testdb
    - name: Check coverage threshold
      run: coverage report --fail-under=80

- `scripts/migration-dry-run.sh`
- `scripts/smoke-test.sh`
- `scripts/health-check.py`

Canary Timeline:
0 min    10 min   15 min   20 min
|--------|--------|--------|
10%      Check    50%      100%
Deploy   Metrics  Ramp     Full
         OK?      Up       Rollout
         |
         No -> Rollback immediately

`scripts/deploy.sh --validate-only`

# Verify migrations are consistent
alembic check
# List head revisions; expect a single head (multiple heads indicate unmerged migration branches)
alembic heads --verbose
# Test migration against staging clone
./skills/deployment-pipeline/scripts/migration-dry-run.sh \
  --db-url "$STAGING_DB_URL" \
  --output-dir ./deploy-validation/
# Verify all dependencies are pinned
pip-compile --dry-run requirements.in

# Verify build succeeds
npm run build
# Check bundle size limits
npx bundlesize
# Verify environment variables are set
node -e "const vars = ['REACT_APP_API_URL']; vars.forEach(v => { if(!process.env[v]) throw new Error(v + ' not set') })"

| Aspect | Development | Staging | Production |
|---|---|---|---|
| Deploy trigger | Push to | Manual or auto after tests | Manual approval required |
| Database | Local PostgreSQL | Staging PostgreSQL | Production PostgreSQL (RDS) |
| Secrets | | GitHub Secrets | AWS Secrets Manager |
| Log level | DEBUG | INFO | WARNING |
| Feature flags | All enabled | Per-feature | Gradual rollout |
| SSL | Self-signed | ACM cert | ACM cert |
| Replicas | 1 | 2 | 3+ (auto-scaled) |
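
The per-environment differences in the table above could also be held as data that deployment tooling reads. The following is a minimal sketch, not part of the skill itself: `EnvironmentConfig`, `ENVIRONMENTS`, and `get_config` are hypothetical names, and only columns with complete values in the table are modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvironmentConfig:
    """Hypothetical per-environment settings mirroring the table above."""
    log_level: str
    min_replicas: int
    auto_scaled: bool
    manual_approval: bool


# Values are copied from the table; the structure itself is an assumption.
ENVIRONMENTS = {
    "development": EnvironmentConfig("DEBUG", 1, False, False),
    "staging": EnvironmentConfig("INFO", 2, False, False),
    "production": EnvironmentConfig("WARNING", 3, True, True),
}


def get_config(env: str) -> EnvironmentConfig:
    """Look up settings for an environment, failing loudly on unknown names."""
    try:
        return ENVIRONMENTS[env]
    except KeyError:
        raise ValueError(f"unknown environment: {env!r}") from None
```

Failing on an unknown environment name, rather than falling back to a default, keeps a typo from silently deploying with development settings.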
# FastAPI health check endpoints
from datetime import datetime, timezone

from fastapi import APIRouter, Depends
from fastapi.responses import JSONResponse
from sqlalchemy import text
from sqlalchemy.ext.asyncio import AsyncSession

# Assumed to exist elsewhere in the app: get_db (session dependency)
# and redis (an async Redis client).
router = APIRouter()


@router.get("/health")
async def health():
    """Basic liveness check -- returns 200 if process is running."""
    # datetime.utcnow() is deprecated since Python 3.12; use an aware UTC timestamp.
    return {"status": "healthy", "timestamp": datetime.now(timezone.utc).isoformat()}


@router.get("/health/ready")
async def readiness(db: AsyncSession = Depends(get_db)):
    """Readiness check -- verifies all dependencies are accessible."""
    checks = {}
    # Database
    try:
        await db.execute(text("SELECT 1"))
        checks["database"] = "ok"
    except Exception as e:
        checks["database"] = f"error: {e}"
    # Redis
    try:
        await redis.ping()
        checks["redis"] = "ok"
    except Exception as e:
        checks["redis"] = f"error: {e}"
    all_ok = all(v == "ok" for v in checks.values())
    return JSONResponse(
        status_code=200 if all_ok else 503,
        content={"status": "ready" if all_ok else "not_ready", "checks": checks},
    )

After deploy:
Wait 10s -> Check /health (liveness)
Wait 5s -> Check /health/ready (readiness)
Wait 5s -> Check /health/ready again (stability)
All pass -> Deployment successful
Any fail -> Trigger rollback

`scripts/health-check.py`

python scripts/health-check.py \
  --url https://staging.example.com \
  --retries 3 \
  --timeout 30 \
  --output-dir ./health-results/

`references/rollback-runbook.md`

# Roll back to previous version
./skills/deployment-pipeline/scripts/deploy.sh \
  --rollback \
  --version "$PREVIOUS_VERSION" \
  --output-dir ./rollback-results/

| Signal | Action | Timeline |
|---|---|---|
| Error rate > 5% | Automatic rollback | Immediate |
| p99 latency > 2x baseline | Automatic rollback | Immediate |
| Health check failures | Automatic rollback | After 2 retries |
| User-reported issues | Manual rollback decision | Within 15 minutes |
| Data inconsistency | Stop traffic, investigate | Immediate |
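
The automatic rows of the trigger table above can be read as a small decision function. This is a sketch under stated assumptions rather than the logic of `deploy.sh`: `DeploySignals`, its field names, and the returned action strings are hypothetical. User-reported issues are left out because the table treats them as a manual decision.

```python
from dataclasses import dataclass


@dataclass
class DeploySignals:
    """Hypothetical snapshot of post-deploy metrics; field names are assumptions."""
    error_rate: float               # fraction of failed requests, 0.02 means 2%
    p99_latency_ms: float
    baseline_p99_latency_ms: float
    failed_health_retries: int      # health-check retries that have failed so far
    data_inconsistency: bool = False


def rollback_action(s: DeploySignals) -> str:
    """Map signals to the action listed in the rollback trigger table."""
    if s.data_inconsistency:
        return "stop_traffic"       # stop traffic and investigate before anything else
    if s.error_rate > 0.05:         # error rate > 5%
        return "rollback"
    if s.p99_latency_ms > 2 * s.baseline_p99_latency_ms:  # p99 > 2x baseline
        return "rollback"
    if s.failed_health_retries >= 2:  # health checks still failing after 2 retries
        return "rollback"
    return "continue"
```

Data inconsistency is checked first because the table asks for traffic to be stopped and investigated, which a plain rollback would not guarantee.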
- `alembic downgrade`
- `.github/workflows/deploy.yml`
- `references/github-actions-template.yml`

# Key sections of the workflow
on:
  push:
    branches: [main]
  workflow_dispatch:
    inputs:
      environment:
        type: choice
        options: [staging, production]
concurrency:
  group: deploy-${{ github.ref }}
  cancel-in-progress: true
jobs:
  build:       # Stage 1
  test:        # Stage 2 (needs: build)
  staging:     # Stage 3 (needs: test)
  production:  # Stage 4 (needs: staging, manual approval)

# nginx canary configuration
upstream backend {
    server backend-stable:8000 weight=9;  # 90% to stable
    server backend-canary:8000 weight=1;  # 10% to canary
}

# Canary health evaluation
def evaluate_canary(metrics: dict) -> bool:
    """Return True if canary is healthy enough to proceed."""
    checks = [
        metrics["error_rate"] < 0.01,              # < 1% error rate
        metrics["p99_latency_ms"] < 500,           # p99 under 500ms
        metrics["memory_usage_pct"] < 85,          # Memory under 85%
        metrics["cpu_usage_pct"] < 75,             # CPU under 75%
        metrics["successful_health_checks"] >= 3,  # 3+ consecutive passes
    ]
    return all(checks)

| Script | Purpose | Usage |
|---|---|---|
| | Main deployment orchestration | |
| | Post-deployment smoke tests | |
| | Health endpoint validation | |
| | Test migrations safely | |
./skills/deployment-pipeline/scripts/deploy.sh \
  --env staging \
  --version $(git rev-parse --short HEAD) \
  --output-dir ./deploy-results/

./skills/deployment-pipeline/scripts/deploy.sh \
  --env production \
  --version $(git rev-parse --short HEAD) \
  --canary \
  --output-dir ./deploy-results/

./skills/deployment-pipeline/scripts/smoke-test.sh \
  --url https://staging.example.com \
  --output-dir ./smoke-results/

./skills/deployment-pipeline/scripts/deploy.sh \
  --rollback \
  --env production \
  --version $PREVIOUS_SHA \
  --output-dir ./rollback-results/

`deployment-report.md`

# Deployment Report
## Summary
- **Environment:** staging | production
- **Version:** abc1234 (git SHA)
- **Status:** SUCCESS | FAILED | ROLLED_BACK
- **Timestamp:** 2024-01-15T14:30:00Z
- **Duration:** 12 minutes
## Pipeline Stages
| Stage | Status | Duration | Notes |
|-------|--------|----------|-------|
| Build | PASS | 3m | Image built: app:abc1234 |
| Test | PASS | 5m | 142 tests, 85% coverage |
| Staging | PASS | 2m | Smoke tests passed |
| Production | PASS | 2m | Canary 10% → 50% → 100% |
## Health Checks
- `/health` — 200 OK (12ms)
- `/health/ready` — 200 OK (45ms)
## Rollback Instructions
If issues occur, run:
```bash
./scripts/deploy.sh --rollback --env production --version $PREV_SHA
```
Previous version: def5678
## Next Steps
- Run `/monitoring-setup` to verify alerts are configured
- Run `/incident-response` if errors occur
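
The "Pipeline Stages" table in the report template above could be generated from stage results rather than written by hand. The sketch below assumes a hypothetical `StageResult` record and `render_stage_table` helper; neither is defined by the skill, and the `FAIL` label for a failed stage is likewise an assumption, since the template only shows `PASS`.

```python
from dataclasses import dataclass


@dataclass
class StageResult:
    """Hypothetical record of one pipeline stage outcome."""
    name: str
    passed: bool
    duration_min: int
    notes: str = ""


def render_stage_table(stages: list[StageResult]) -> str:
    """Render stage results as the markdown table used in the report template."""
    lines = [
        "| Stage | Status | Duration | Notes |",
        "|-------|--------|----------|-------|",
    ]
    for s in stages:
        status = "PASS" if s.passed else "FAIL"  # "FAIL" label is assumed
        lines.append(f"| {s.name} | {status} | {s.duration_min}m | {s.notes} |")
    return "\n".join(lines)
```

For example, `render_stage_table([StageResult("Build", True, 3, "Image built: app:abc1234")])` yields the header rows followed by the Build row as it appears in the template.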