# AWS Bedrock
Amazon Bedrock provides access to foundation models (FMs) from AI companies through a unified API. Build generative AI applications with text generation, embeddings, and image generation capabilities.
## Core Concepts
### Foundation Models
Pre-trained models available through Bedrock:
- Claude (Anthropic): Text generation, analysis, coding
- Titan (Amazon): Text, embeddings, image generation
- Llama (Meta): Open-weight text generation
- Mistral: Efficient text generation
- Stable Diffusion (Stability AI): Image generation
### Model Access
Models must be enabled in your account before use:
- Request access in Bedrock console
- Some models require acceptance of EULAs
- Access is region-specific
### Inference Types
| Type | Use Case | Pricing |
|---|---|---|
| On-Demand | Variable workloads | Per token |
| Provisioned Throughput | Consistent high-volume | Hourly commitment |
| Batch Inference | Async large-scale | Discounted per token |
## Common Patterns
### Invoke Model (Text Generation)
**AWS CLI:**

```bash
# Invoke Claude
aws bedrock-runtime invoke-model \
  --model-id anthropic.claude-3-sonnet-20240229-v1:0 \
  --content-type application/json \
  --accept application/json \
  --body '{"anthropic_version": "bedrock-2023-05-31", "max_tokens": 1024, "messages": [{"role": "user", "content": "Explain AWS Lambda in 3 sentences."}]}' \
  response.json

cat response.json | jq -r '.content[0].text'
```
**boto3:**

```python
import boto3
import json

bedrock = boto3.client('bedrock-runtime')

def invoke_claude(prompt, max_tokens=1024):
    response = bedrock.invoke_model(
        modelId='anthropic.claude-3-sonnet-20240229-v1:0',
        contentType='application/json',
        accept='application/json',
        body=json.dumps({
            'anthropic_version': 'bedrock-2023-05-31',
            'max_tokens': max_tokens,
            'messages': [
                {'role': 'user', 'content': prompt}
            ]
        })
    )
    result = json.loads(response['body'].read())
    return result['content'][0]['text']
```

**Usage:**

```python
response = invoke_claude('What is Amazon S3?')
print(response)
```
### Streaming Response
```python
import boto3
import json

bedrock = boto3.client('bedrock-runtime')

def stream_claude(prompt):
    response = bedrock.invoke_model_with_response_stream(
        modelId='anthropic.claude-3-sonnet-20240229-v1:0',
        contentType='application/json',
        accept='application/json',
        body=json.dumps({
            'anthropic_version': 'bedrock-2023-05-31',
            'max_tokens': 1024,
            'messages': [
                {'role': 'user', 'content': prompt}
            ]
        })
    )
    for event in response['body']:
        chunk = json.loads(event['chunk']['bytes'])
        if chunk['type'] == 'content_block_delta':
            yield chunk['delta'].get('text', '')
```

**Usage:**

```python
for text in stream_claude('Write a haiku about cloud computing.'):
    print(text, end='', flush=True)
```
### Generate Embeddings
```python
import boto3
import json

bedrock = boto3.client('bedrock-runtime')

def get_embedding(text):
    response = bedrock.invoke_model(
        modelId='amazon.titan-embed-text-v2:0',
        contentType='application/json',
        accept='application/json',
        body=json.dumps({
            'inputText': text,
            'dimensions': 1024,
            'normalize': True
        })
    )
    result = json.loads(response['body'].read())
    return result['embedding']
```

**Usage:**

```python
embedding = get_embedding('AWS Lambda is a serverless compute service.')
print(f'Embedding dimension: {len(embedding)}')
```
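Embedding vectors are typically compared with cosine similarity. A minimal stdlib sketch (the helper name is illustrative; it assumes two equal-length vectors such as those returned by `get_embedding`):

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product divided by the product of magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```

Because the Titan request above sets `'normalize': True`, the returned vectors are unit length, so the dot product alone yields the same ranking.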
### Conversation with History
```python
import boto3
import json

bedrock = boto3.client('bedrock-runtime')

class Conversation:
    def __init__(self, system_prompt=None):
        self.messages = []
        self.system = system_prompt

    def chat(self, user_message):
        self.messages.append({
            'role': 'user',
            'content': user_message
        })
        body = {
            'anthropic_version': 'bedrock-2023-05-31',
            'max_tokens': 1024,
            'messages': self.messages
        }
        if self.system:
            body['system'] = self.system
        response = bedrock.invoke_model(
            modelId='anthropic.claude-3-sonnet-20240229-v1:0',
            contentType='application/json',
            accept='application/json',
            body=json.dumps(body)
        )
        result = json.loads(response['body'].read())
        assistant_message = result['content'][0]['text']
        self.messages.append({
            'role': 'assistant',
            'content': assistant_message
        })
        return assistant_message
```

**Usage:**

```python
conv = Conversation(system_prompt='You are an AWS solutions architect.')
print(conv.chat('What database should I use for a chat application?'))
print(conv.chat('What about for time-series data?'))
```
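A conversation that keeps its full history will eventually exceed the model's context window. One simple mitigation, sketched here with a hypothetical `trim_history` helper, is to keep only the most recent turns while ensuring the list still begins with a user message, as the Anthropic Messages format expects:

```python
def trim_history(messages, max_messages=20):
    # Keep only the most recent messages.
    if len(messages) <= max_messages:
        return messages
    trimmed = messages[-max_messages:]
    # Drop leading assistant turns so the history starts with a user turn.
    while trimmed and trimmed[0]['role'] != 'user':
        trimmed = trimmed[1:]
    return trimmed
```

Calling `self.messages = trim_history(self.messages)` at the start of `chat` would bound both memory and input-token usage.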
### List Available Models
```bash
# List all foundation models
aws bedrock list-foundation-models \
  --query 'modelSummaries[*].[modelId,modelName,providerName]' \
  --output table

# Filter by provider
aws bedrock list-foundation-models \
  --by-provider anthropic \
  --query 'modelSummaries[*].modelId'

# Get model details
aws bedrock get-foundation-model \
  --model-identifier anthropic.claude-3-sonnet-20240229-v1:0
```
### Request Model Access
```bash
# List model access status
aws bedrock list-foundation-model-agreement-offers \
  --model-id anthropic.claude-3-sonnet-20240229-v1:0
```
## CLI Reference
### Bedrock (Control Plane)
| Command | Description |
|---|---|
| `aws bedrock list-foundation-models` | List available models |
| `aws bedrock get-foundation-model` | Get model details |
| `aws bedrock list-custom-models` | List fine-tuned models |
| `aws bedrock create-model-customization-job` | Start fine-tuning |
| `aws bedrock list-provisioned-model-throughputs` | List provisioned capacity |
### Bedrock Runtime (Data Plane)
| Command | Description |
|---|---|
| `aws bedrock-runtime invoke-model` | Invoke model synchronously |
| `aws bedrock-runtime invoke-model-with-response-stream` | Invoke with streaming |
| `aws bedrock-runtime converse` | Multi-turn conversation API |
| `aws bedrock-runtime converse-stream` | Streaming conversation |
### Bedrock Agent Runtime
| Command | Description |
|---|---|
| `aws bedrock-agent-runtime invoke-agent` | Invoke a Bedrock agent |
| `aws bedrock-agent-runtime retrieve` | Query knowledge base |
| `aws bedrock-agent-runtime retrieve-and-generate` | RAG query |
## Best Practices
### Cost Optimization
- Use appropriate models: Smaller models for simple tasks
- Set max_tokens: Limit output length when possible
- Cache responses: For repeated identical queries
- Batch when possible: Use batch inference for bulk processing
- Monitor usage: Set up CloudWatch alarms for cost
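The response-caching point can be sketched with a small in-memory memoizer built on the standard library (`make_cached` is a hypothetical helper, not a Bedrock API; a production system would more likely use an external cache with a TTL):

```python
import functools

def make_cached(invoke_fn, maxsize=1024):
    # Wrap any prompt -> text function so repeated identical prompts
    # are served from memory instead of re-invoking the model.
    @functools.lru_cache(maxsize=maxsize)
    def cached(prompt):
        return invoke_fn(prompt)
    return cached
```

For example, `cached_invoke = make_cached(invoke_claude)` would make repeat calls with the same prompt free.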
### Performance
- Use streaming: For better user experience with long outputs
- Connection pooling: Reuse boto3 clients
- Regional deployment: Use closest region to reduce latency
- Provisioned throughput: For consistent high-volume workloads
### Security
- Least privilege IAM: Only grant needed model access
- VPC endpoints: Keep traffic private
- Guardrails: Implement content filtering
- Audit with CloudTrail: Track model invocations
## IAM Permissions
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ],
      "Resource": [
        "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0",
        "arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v2:0"
      ]
    }
  ]
}
```

## Troubleshooting
### AccessDeniedException
**Causes:**
- Model access not enabled in console
- IAM policy missing `bedrock:InvokeModel`
- Wrong model ID or region

**Debug:**

```bash
# Check model access status
aws bedrock list-foundation-models \
  --query 'modelSummaries[?modelId==`anthropic.claude-3-sonnet-20240229-v1:0`]'

# Test IAM permissions
aws iam simulate-principal-policy \
  --policy-source-arn arn:aws:iam::123456789012:role/my-role \
  --action-names bedrock:InvokeModel \
  --resource-arns "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0"
```
### ModelNotReadyException
**Cause:** Model is still being provisioned or temporarily unavailable.

**Solution:** Implement retry with exponential backoff:
```python
import json
import time
from botocore.exceptions import ClientError

def invoke_with_retry(bedrock, body, max_retries=3):
    for attempt in range(max_retries):
        try:
            return bedrock.invoke_model(
                modelId='anthropic.claude-3-sonnet-20240229-v1:0',
                body=json.dumps(body)
            )
        except ClientError as e:
            if e.response['Error']['Code'] == 'ModelNotReadyException':
                # Back off exponentially before retrying
                time.sleep(2 ** attempt)
            else:
                raise
    raise Exception('Max retries exceeded')
```

### ThrottlingException
**Causes:**
- Exceeded on-demand quota
- Too many concurrent requests

**Solutions:**
- Request quota increase
- Implement exponential backoff
- Consider provisioned throughput
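Exponential backoff is usually combined with jitter so throttled clients do not retry in lockstep; a minimal stdlib sketch (`backoff_delays` is a hypothetical helper):

```python
import random

def backoff_delays(max_retries=5, base=1.0, cap=30.0):
    # Full jitter: each delay is uniform in [0, min(cap, base * 2**attempt)].
    for attempt in range(max_retries):
        yield random.uniform(0, min(cap, base * 2 ** attempt))
```

Each yielded value would be passed to `time.sleep()` before the next retry attempt.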
### ValidationException
**Common issues:**
- Invalid model ID
- Malformed request body
- `max_tokens` exceeds model limit

**Debug:**

```bash
# Check model-specific requirements
aws bedrock get-foundation-model \
  --model-identifier anthropic.claude-3-sonnet-20240229-v1:0 \
  --query 'modelDetails.inferenceTypesSupported'
```