ai-gateway

Purpose

This skill manages an AI gateway for routing, securing, and monitoring AI service requests in ML operations, ensuring efficient traffic handling, API security, and performance tracking within the aimlops cluster.

When to Use

Use this skill when building ML pipelines that require centralized routing of AI requests, such as in production environments with multiple AI models, to enforce security policies, monitor traffic, or scale API endpoints. Apply it in scenarios involving microservices for AI inference or when integrating with tools like Kubernetes for aimlops workflows.

Key Capabilities

  • Routing: Dynamically route requests to AI services based on rules, using path-based or header-based matching.
  • Security: Enforce authentication, rate limiting, and encryption via JWT or API keys.
  • Monitoring: Track metrics like request latency and error rates through integrated logging and Prometheus exporters.
  • Configuration: Support YAML-based configs for defining routes, e.g., specifying source and destination endpoints.
  • Scalability: Handle load balancing across multiple AI backends with automatic failover.
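The routing capability above can be sketched as a simple matcher. The rule shape mirrors the YAML config later in this document, but the matching logic itself is an illustrative assumption, not the gateway's actual internals:

```python
# Route table shaped like the YAML config (path, target, methods).
ROUTES = [
    {"path": "/predict", "target": "http://ai-service:8080", "methods": ["POST"]},
    {"path": "/chat", "target": "http://llm-service:5000", "methods": ["POST"]},
]

def match_route(path, method, headers=None):
    """Return the target of the first rule matching path and method."""
    headers = headers or {}
    for rule in ROUTES:
        if path.startswith(rule["path"]) and method in rule["methods"]:
            # Header-based matching would add a check here, e.g.
            # headers.get("X-Route-Group") == rule.get("group")
            return rule["target"]
    return None  # unmatched: the gateway would answer 404

print(match_route("/predict", "POST"))  # http://ai-service:8080
```

First-match-wins ordering (as above) is the common convention for gateways; more specific paths should be listed before broader prefixes.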

Usage Patterns

To use this skill, first set up the AI gateway via CLI or API, then define routes and security rules. Always authenticate requests using the `$AI_GATEWAY_API_KEY` environment variable. For CLI usage, initialize with `ai-gateway-cli init --config path/to/config.yaml`, then apply changes with `ai-gateway-cli apply`. In code, import the SDK and call methods like `createRoute()` for programmatic setup. Monitor usage by querying metrics endpoints periodically.
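The "query metrics endpoints periodically" step can be sketched as a small polling loop against the `GET /api/v1/metrics` endpoint listed below. The polling interval and the split into a request-builder helper are illustrative choices, not part of the gateway's API:

```python
import os
import time
import requests  # third-party; pip install requests

def metrics_request(base_url, metric_type):
    """Build the URL, query params, and auth header for a metrics call."""
    url = f"{base_url}/api/v1/metrics"
    params = {"type": metric_type}
    headers = {"Authorization": f"Bearer {os.environ['AI_GATEWAY_API_KEY']}"}
    return url, params, headers

def poll_metrics(base_url="http://gateway:8080", interval_s=60, cycles=3):
    """Fetch latency metrics a few times, sleeping between calls."""
    for _ in range(cycles):
        url, params, headers = metrics_request(base_url, "latency")
        resp = requests.get(url, params=params, headers=headers)
        resp.raise_for_status()
        print(resp.json())
        time.sleep(interval_s)
```

In production this loop would more likely live in a scheduler or be replaced by a Prometheus scrape, which the gateway's exporter already supports.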

Common Commands/API

  • CLI Commands:
    • Initialize gateway:
      ai-gateway-cli init --cluster aimlops --key $AI_GATEWAY_API_KEY
    • Add a route:
      ai-gateway-cli add-route --path /predict --target http://ai-service:8080 --method POST
    • Secure an endpoint:
      ai-gateway-cli secure --endpoint /predict --auth jwt --rate-limit 100/min
    • View metrics:
      ai-gateway-cli metrics --format json
  • API Endpoints:
    • Create route: POST /api/v1/routes with body
      { "path": "/predict", "target": "http://ai-service:8080", "method": "POST" }
    • Update security: PUT /api/v1/security/{endpoint} with body
      { "authType": "jwt", "rateLimit": 100 }
    • Get metrics: GET /api/v1/metrics?type=latency
  • Code Snippets:

    ```python
    import os
    import requests

    headers = {'Authorization': f'Bearer {os.environ.get("AI_GATEWAY_API_KEY")}'}
    response = requests.post(
        'http://gateway:8080/api/v1/routes',
        json={"path": "/predict", "target": "http://ai-service:8080"},
        headers=headers,
    )
    ```

    ```bash
    export AI_GATEWAY_API_KEY=your_api_key_here
    ai-gateway-cli add-route --path /chat --target http://llm-service:5000
    ```
  • Config Formats: Use YAML for configurations, e.g.:

    ```yaml
    routes:
      - path: /predict
        target: http://ai-service:8080
        methods: [POST]
    security:
      - endpoint: /predict
        auth: jwt
        rateLimit: 100
    ```
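A config in the YAML format above can be sanity-checked before running `ai-gateway-cli apply`. The sketch below validates the parsed structure (shown here as the equivalent Python dict, e.g. the result of `yaml.safe_load`); the required-field lists are assumptions inferred from the examples in this section:

```python
# Parsed form of the YAML example above (e.g. via yaml.safe_load).
CONFIG = {
    "routes": [
        {"path": "/predict", "target": "http://ai-service:8080", "methods": ["POST"]},
    ],
    "security": [
        {"endpoint": "/predict", "auth": "jwt", "rateLimit": 100},
    ],
}

def validate_config(cfg):
    """Collect human-readable errors for missing route/security fields."""
    errors = []
    for i, route in enumerate(cfg.get("routes", [])):
        for key in ("path", "target", "methods"):
            if not route.get(key):
                errors.append(f"routes[{i}]: missing {key}")
    for i, rule in enumerate(cfg.get("security", [])):
        for key in ("endpoint", "auth", "rateLimit"):
            if rule.get(key) in (None, ""):
                errors.append(f"security[{i}]: missing {key}")
    return errors

print(validate_config(CONFIG))  # [] means the config looks complete
```

Running such a check in CI catches malformed routes before they reach the live gateway.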

Integration Notes

Integrate with aimlops by deploying the gateway as a sidecar or standalone service in your cluster. For Kubernetes, add annotations to pods, e.g., `kubectl annotate pod ai-pod aimlops/gateway=true`. Use the SDK to link with other AI tools: import and initialize with `AI_Gateway(api_key=os.environ['AI_GATEWAY_API_KEY']).connect(cluster='aimlops')`. Ensure compatibility by matching tags like "ai" and "mlops". For external services, set up webhooks by configuring the gateway's callback URL in your config, e.g., add `callback: http://external-service/webhook` in YAML.
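The `callback:` setting above implies an external service listening for the gateway's webhook POSTs. A stdlib-only receiver might look like the following; the JSON payload shape and the 204 acknowledgement are assumptions, since the gateway's webhook contract isn't specified here:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def handle_event(path, body):
    """Decide on a status code and parse the gateway's JSON payload."""
    if path != "/webhook":
        return 404, None
    return 204, json.loads(body or b"{}")

class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        status, event = handle_event(self.path, self.rfile.read(length))
        if event is not None:
            print("gateway event:", event)
        self.send_response(status)
        self.end_headers()

def serve(port=8081):
    # In production the service would listen wherever callback: points;
    # port 8081 here is just for local testing.
    HTTPServer(("", port), WebhookHandler).serve_forever()
```

Responding quickly (and doing any heavy processing asynchronously) avoids the gateway timing out and re-delivering the event.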

Error Handling

Handle errors by checking HTTP status codes from API responses; for example, 401 indicates authentication failure, so retry with `headers['Authorization'] = f'Bearer {new_key}'`. For CLI, parse output errors like "Error: Invalid route path" and correct the inputs. Common issues include missing API keys; always verify with `if not os.environ.get('AI_GATEWAY_API_KEY'): raise ValueError('API key required')`. Log errors using the gateway's built-in logger: enable it with `ai-gateway-cli config --log-level debug`, then monitor for patterns like rate-limit exceedances and implement retries with exponential backoff in code.
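The exponential-backoff retries recommended above can be sketched as follows. The set of retry-worthy status codes and the jittered schedule are common conventions, not requirements of this gateway:

```python
import os
import random
import time
import requests  # third-party; pip install requests

RETRYABLE = {429, 500, 502, 503, 504}  # rate limits and transient errors

def backoff_delays(retries=5, base=0.5, cap=30.0):
    """Exponential schedule with full jitter: random part of base * 2^i, capped."""
    return [random.uniform(0, min(cap, base * (2 ** i))) for i in range(retries)]

def get_with_retry(url, retries=5):
    """GET with backoff on retryable statuses; returns the last response.

    A 401 would instead call for refreshing the bearer token
    (headers['Authorization'] = f'Bearer {new_key}') rather than retrying.
    """
    headers = {"Authorization": f"Bearer {os.environ['AI_GATEWAY_API_KEY']}"}
    resp = None
    for delay in backoff_delays(retries):
        resp = requests.get(url, headers=headers)
        if resp.status_code not in RETRYABLE:
            return resp
        time.sleep(delay)
    return resp
```

Full jitter keeps many clients from retrying in lockstep after a shared outage, which matters when the gateway fronts several AI backends.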

Concrete Usage Examples

  1. Route AI Requests: To route prediction requests to an ML model, first export your API key, then use the CLI:
     `export AI_GATEWAY_API_KEY=abc123; ai-gateway-cli add-route --path /ml-predict --target http://model-service:8000`.
     Verify with a curl request:
     `curl -H "Authorization: Bearer abc123" http://gateway:8080/ml-predict -d '{"input": "data"}'`.
  2. Secure and Monitor API: Secure an endpoint and monitor traffic by running:
     `ai-gateway-cli secure --endpoint /chat --auth api-key --rate-limit 50/min`.
     Then, query metrics:
     `ai-gateway-cli metrics --endpoint /chat`.
     In code:

     ```python
     import os
     import requests

     headers = {'Authorization': f'Bearer {os.environ["AI_GATEWAY_API_KEY"]}'}
     requests.get('http://gateway:8080/api/v1/metrics', headers=headers)
     ```

Graph Relationships

  • Related to: aimlops (cluster), ai (tag), mlops (tag)
  • Depends on: authentication services for security
  • Used by: AI services for routing and monitoring
  • Integrates with: Kubernetes for deployment, Prometheus for metrics collection