AWS Bedrock

Amazon Bedrock provides access to foundation models (FMs) from AI companies through a unified API. Build generative AI applications with text generation, embeddings, and image generation capabilities.

Core Concepts

Foundation Models

Pre-trained models available through Bedrock:
  • Claude (Anthropic): Text generation, analysis, coding
  • Titan (Amazon): Text, embeddings, image generation
  • Llama (Meta): Open-weight text generation
  • Mistral: Efficient text generation
  • Stable Diffusion (Stability AI): Image generation

Model Access

Models must be enabled in your account before use:
  • Request access in Bedrock console
  • Some models require acceptance of EULAs
  • Access is region-specific
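Note that `list-foundation-models` returns every model available in the region, not only those enabled for your account. A small sketch for slicing its response by provider (the `model_ids` helper is illustrative, not part of any SDK; the `modelSummaries` shape matches the Bedrock API):

```python
def model_ids(summaries, provider=None):
    """Filter list_foundation_models() summaries down to model IDs,
    optionally keeping only one provider's models."""
    return [
        s['modelId']
        for s in summaries
        if provider is None or s.get('providerName') == provider
    ]

# Against a live account (requires credentials and an enabled region):
#   import boto3
#   bedrock = boto3.client('bedrock')
#   summaries = bedrock.list_foundation_models()['modelSummaries']
#   print(model_ids(summaries, provider='Anthropic'))
```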

Inference Types

| Type | Use Case | Pricing |
|------|----------|---------|
| On-Demand | Variable workloads | Per token |
| Provisioned Throughput | Consistent high-volume | Hourly commitment |
| Batch Inference | Async large-scale | Discounted per token |

Common Patterns

Invoke Model (Text Generation)

**AWS CLI:**

```bash
# Invoke Claude
aws bedrock-runtime invoke-model \
  --model-id anthropic.claude-3-sonnet-20240229-v1:0 \
  --content-type application/json \
  --accept application/json \
  --cli-binary-format raw-in-base64-out \
  --body '{
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Explain AWS Lambda in 3 sentences."}
    ]
  }' \
  response.json

cat response.json | jq -r '.content[0].text'
```

**boto3:**

```python
import boto3
import json

bedrock = boto3.client('bedrock-runtime')

def invoke_claude(prompt, max_tokens=1024):
    response = bedrock.invoke_model(
        modelId='anthropic.claude-3-sonnet-20240229-v1:0',
        contentType='application/json',
        accept='application/json',
        body=json.dumps({
            'anthropic_version': 'bedrock-2023-05-31',
            'max_tokens': max_tokens,
            'messages': [
                {'role': 'user', 'content': prompt}
            ]
        })
    )

    result = json.loads(response['body'].read())
    return result['content'][0]['text']
```

Usage

```python
response = invoke_claude('What is Amazon S3?')
print(response)
```

Streaming Response

```python
import boto3
import json

bedrock = boto3.client('bedrock-runtime')

def stream_claude(prompt):
    response = bedrock.invoke_model_with_response_stream(
        modelId='anthropic.claude-3-sonnet-20240229-v1:0',
        contentType='application/json',
        accept='application/json',
        body=json.dumps({
            'anthropic_version': 'bedrock-2023-05-31',
            'max_tokens': 1024,
            'messages': [
                {'role': 'user', 'content': prompt}
            ]
        })
    )

    for event in response['body']:
        chunk = json.loads(event['chunk']['bytes'])
        if chunk['type'] == 'content_block_delta':
            yield chunk['delta'].get('text', '')
```

Usage

```python
for text in stream_claude('Write a haiku about cloud computing.'):
    print(text, end='', flush=True)
```

Generate Embeddings

```python
import boto3
import json

bedrock = boto3.client('bedrock-runtime')

def get_embedding(text):
    response = bedrock.invoke_model(
        modelId='amazon.titan-embed-text-v2:0',
        contentType='application/json',
        accept='application/json',
        body=json.dumps({
            'inputText': text,
            'dimensions': 1024,
            'normalize': True
        })
    )

    result = json.loads(response['body'].read())
    return result['embedding']
```

Usage

```python
embedding = get_embedding('AWS Lambda is a serverless compute service.')
print(f'Embedding dimension: {len(embedding)}')
```
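Embeddings are usually compared with cosine similarity. Since Titan can return normalized vectors (`'normalize': True` above), a plain dot product would suffice, but a general cosine helper is a few lines of stdlib Python (the helper is illustrative, not part of boto3):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# e.g. with two Bedrock embeddings:
#   sim = cosine_similarity(get_embedding('cat'), get_embedding('kitten'))
```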

Conversation with History

```python
import boto3
import json

bedrock = boto3.client('bedrock-runtime')

class Conversation:
    def __init__(self, system_prompt=None):
        self.messages = []
        self.system = system_prompt

    def chat(self, user_message):
        self.messages.append({
            'role': 'user',
            'content': user_message
        })

        body = {
            'anthropic_version': 'bedrock-2023-05-31',
            'max_tokens': 1024,
            'messages': self.messages
        }

        if self.system:
            body['system'] = self.system

        response = bedrock.invoke_model(
            modelId='anthropic.claude-3-sonnet-20240229-v1:0',
            contentType='application/json',
            accept='application/json',
            body=json.dumps(body)
        )

        result = json.loads(response['body'].read())
        assistant_message = result['content'][0]['text']

        self.messages.append({
            'role': 'assistant',
            'content': assistant_message
        })

        return assistant_message
```

Usage

```python
conv = Conversation(system_prompt='You are an AWS solutions architect.')
print(conv.chat('What database should I use for a chat application?'))
print(conv.chat('What about for time-series data?'))
```
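The `Conversation` class above accumulates messages without bound, so long sessions will eventually exceed the model's context window. One common mitigation is to keep only the most recent turns; a minimal sketch (the helper and the cutoff of 10 messages are illustrative):

```python
def trim_history(messages, max_messages=10):
    """Keep the most recent messages, dropping leading non-'user' turns
    so the alternating user/assistant structure stays valid."""
    trimmed = messages[-max_messages:]
    while trimmed and trimmed[0]['role'] != 'user':
        trimmed = trimmed[1:]
    return trimmed

# Inside Conversation.chat(), before building the request body:
#   self.messages = trim_history(self.messages)
```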

List Available Models

```bash
# List all foundation models
aws bedrock list-foundation-models \
  --query 'modelSummaries[*].[modelId,modelName,providerName]' \
  --output table
```

```bash
# Filter by provider
aws bedrock list-foundation-models \
  --by-provider anthropic \
  --query 'modelSummaries[*].modelId'
```

```bash
# Get model details
aws bedrock get-foundation-model \
  --model-identifier anthropic.claude-3-sonnet-20240229-v1:0
```

Request Model Access

```bash
# List model access status
aws bedrock list-foundation-model-agreement-offers \
  --model-id anthropic.claude-3-sonnet-20240229-v1:0
```

CLI Reference

Bedrock (Control Plane)

| Command | Description |
|---------|-------------|
| `aws bedrock list-foundation-models` | List available models |
| `aws bedrock get-foundation-model` | Get model details |
| `aws bedrock list-custom-models` | List fine-tuned models |
| `aws bedrock create-model-customization-job` | Start fine-tuning |
| `aws bedrock list-provisioned-model-throughputs` | List provisioned capacity |

Bedrock Runtime (Data Plane)

| Command | Description |
|---------|-------------|
| `aws bedrock-runtime invoke-model` | Invoke model synchronously |
| `aws bedrock-runtime invoke-model-with-response-stream` | Invoke with streaming |
| `aws bedrock-runtime converse` | Multi-turn conversation API |
| `aws bedrock-runtime converse-stream` | Streaming conversation |

Bedrock Agent Runtime

| Command | Description |
|---------|-------------|
| `aws bedrock-agent-runtime invoke-agent` | Invoke a Bedrock agent |
| `aws bedrock-agent-runtime retrieve` | Query knowledge base |
| `aws bedrock-agent-runtime retrieve-and-generate` | RAG query |

Best Practices

Cost Optimization

  • Use appropriate models: Smaller models for simple tasks
  • Set max_tokens: Limit output length when possible
  • Cache responses: For repeated identical queries
  • Batch when possible: Use batch inference for bulk processing
  • Monitor usage: Set up CloudWatch alarms for cost
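The "cache responses" point can be as simple as an in-memory dictionary keyed by prompt, sketched here as a generic wrapper (illustrative only; production systems typically use an external cache with a TTL):

```python
def cached(invoke):
    """Memoize a prompt -> completion function in memory."""
    cache = {}
    def wrapper(prompt):
        if prompt not in cache:
            cache[prompt] = invoke(prompt)
        return cache[prompt]
    return wrapper

# Usage with the invoke_claude() helper defined earlier:
#   invoke_cached = cached(invoke_claude)
#   invoke_cached('What is Amazon S3?')  # calls Bedrock
#   invoke_cached('What is Amazon S3?')  # served from the cache
```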

Performance

  • Use streaming: For better user experience with long outputs
  • Connection pooling: Reuse boto3 clients
  • Regional deployment: Use closest region to reduce latency
  • Provisioned throughput: For consistent high-volume workloads

Security

  • Least privilege IAM: Only grant needed model access
  • VPC endpoints: Keep traffic private
  • Guardrails: Implement content filtering
  • Audit with CloudTrail: Track model invocations

IAM Permissions

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ],
      "Resource": [
        "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0",
        "arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v2:0"
      ]
    }
  ]
}
```

Troubleshooting

AccessDeniedException

Causes:
  • Model access not enabled in the console
  • IAM policy missing `bedrock:InvokeModel`
  • Wrong model ID or region

Debug:

```bash
# Check model access status
aws bedrock list-foundation-models \
  --query "modelSummaries[?modelId=='anthropic.claude-3-sonnet-20240229-v1:0']"
```

```bash
# Test IAM permissions
aws iam simulate-principal-policy \
  --policy-source-arn arn:aws:iam::123456789012:role/my-role \
  --action-names bedrock:InvokeModel \
  --resource-arns "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0"
```

ModelNotReadyException

Cause: Model is still being provisioned or temporarily unavailable.
Solution: Implement retry with exponential backoff:
```python
import json
import time
from botocore.exceptions import ClientError

def invoke_with_retry(bedrock, body, max_retries=3):
    for attempt in range(max_retries):
        try:
            return bedrock.invoke_model(
                modelId='anthropic.claude-3-sonnet-20240229-v1:0',
                body=json.dumps(body)
            )
        except ClientError as e:
            if e.response['Error']['Code'] == 'ModelNotReadyException':
                time.sleep(2 ** attempt)  # exponential backoff: 1s, 2s, 4s
            else:
                raise
    raise Exception('Max retries exceeded')
```

ThrottlingException

Causes:
  • Exceeded on-demand quota
  • Too many concurrent requests
Solutions:
  • Request quota increase
  • Implement exponential backoff
  • Consider provisioned throughput
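Backoff for throttling looks much like the `ModelNotReadyException` retry above; adding random jitter keeps many clients from retrying in lockstep. A sketch under stated assumptions (`call` stands in for any Bedrock invocation; in practice you would catch `ClientError` and check for the `ThrottlingException` code rather than a bare `Exception`):

```python
import random
import time

def with_backoff(call, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Retry `call` with exponential backoff plus full jitter."""
    for attempt in range(max_retries):
        try:
            return call()
        except Exception:
            if attempt == max_retries - 1:
                raise
            # full jitter: sleep a random time up to base_delay * 2^attempt
            sleep(random.uniform(0, base_delay * 2 ** attempt))
```

The injectable `sleep` parameter keeps the helper testable without real delays.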

ValidationException

Common issues:
  • Invalid model ID
  • Malformed request body
  • max_tokens exceeds model limit
Debug:

```bash
# Check model-specific requirements
aws bedrock get-foundation-model \
  --model-identifier anthropic.claude-3-sonnet-20240229-v1:0 \
  --query 'modelDetails.inferenceTypesSupported'
```
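A cheap pre-flight check on the request body catches these common causes locally, before the API call. A minimal sketch (the helper and the 4096 default are illustrative; check each model's actual output limit):

```python
def validate_claude_body(body, max_tokens_limit=4096):
    """Raise ValueError for the common malformed-request cases."""
    if 'anthropic_version' not in body:
        raise ValueError("missing 'anthropic_version'")
    if not body.get('messages'):
        raise ValueError("'messages' must be a non-empty list")
    if body.get('max_tokens', 0) > max_tokens_limit:
        raise ValueError(f"max_tokens exceeds limit of {max_tokens_limit}")
    if body['messages'][0].get('role') != 'user':
        raise ValueError("conversation must start with a 'user' message")
```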
