AWS Bedrock

Amazon Bedrock provides access to foundation models (FMs) from AI companies through a unified API. Build generative AI applications with text generation, embeddings, and image generation capabilities.

Core Concepts

Foundation Models

Pre-trained models available through Bedrock:
  • Claude (Anthropic): Text generation, analysis, coding
  • Titan (Amazon): Text, embeddings, image generation
  • Llama (Meta): Open-weight text generation
  • Mistral: Efficient text generation
  • Stable Diffusion (Stability AI): Image generation

Model Access

Models must be enabled in your account before use:
  • Request access in Bedrock console
  • Some models require acceptance of EULAs
  • Access is region-specific
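Note that `list-foundation-models` returns every model available in the region, not only those enabled for your account. A small sketch for slicing its response by provider (the `model_ids` helper is illustrative, not part of any SDK; the `modelSummaries` shape matches the Bedrock API):

```python
def model_ids(summaries, provider=None):
    """Filter list_foundation_models() summaries down to model IDs,
    optionally keeping only one provider's models."""
    return [
        s['modelId']
        for s in summaries
        if provider is None or s.get('providerName') == provider
    ]

# Against a live account (requires credentials and an enabled region):
#   import boto3
#   bedrock = boto3.client('bedrock')
#   summaries = bedrock.list_foundation_models()['modelSummaries']
#   print(model_ids(summaries, provider='Anthropic'))
```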

Inference Types

| Type | Use Case | Pricing |
|------|----------|---------|
| On-Demand | Variable workloads | Per token |
| Provisioned Throughput | Consistent high-volume | Hourly commitment |
| Batch Inference | Async large-scale | Discounted per token |

Common Patterns

Invoke Model (Text Generation)

**AWS CLI:**

```bash
# Invoke Claude
aws bedrock-runtime invoke-model \
  --model-id anthropic.claude-3-sonnet-20240229-v1:0 \
  --content-type application/json \
  --accept application/json \
  --cli-binary-format raw-in-base64-out \
  --body '{
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Explain AWS Lambda in 3 sentences."}
    ]
  }' \
  response.json

cat response.json | jq -r '.content[0].text'
```

**boto3:**

```python
import boto3
import json

bedrock = boto3.client('bedrock-runtime')

def invoke_claude(prompt, max_tokens=1024):
    response = bedrock.invoke_model(
        modelId='anthropic.claude-3-sonnet-20240229-v1:0',
        contentType='application/json',
        accept='application/json',
        body=json.dumps({
            'anthropic_version': 'bedrock-2023-05-31',
            'max_tokens': max_tokens,
            'messages': [
                {'role': 'user', 'content': prompt}
            ]
        })
    )

    result = json.loads(response['body'].read())
    return result['content'][0]['text']
```

Usage

```python
response = invoke_claude('What is Amazon S3?')
print(response)
```

Streaming Response

```python
import boto3
import json

bedrock = boto3.client('bedrock-runtime')

def stream_claude(prompt):
    response = bedrock.invoke_model_with_response_stream(
        modelId='anthropic.claude-3-sonnet-20240229-v1:0',
        contentType='application/json',
        accept='application/json',
        body=json.dumps({
            'anthropic_version': 'bedrock-2023-05-31',
            'max_tokens': 1024,
            'messages': [
                {'role': 'user', 'content': prompt}
            ]
        })
    )

    for event in response['body']:
        chunk = json.loads(event['chunk']['bytes'])
        if chunk['type'] == 'content_block_delta':
            yield chunk['delta'].get('text', '')
```

Usage

```python
for text in stream_claude('Write a haiku about cloud computing.'):
    print(text, end='', flush=True)
```

Generate Embeddings

```python
import boto3
import json

bedrock = boto3.client('bedrock-runtime')

def get_embedding(text):
    response = bedrock.invoke_model(
        modelId='amazon.titan-embed-text-v2:0',
        contentType='application/json',
        accept='application/json',
        body=json.dumps({
            'inputText': text,
            'dimensions': 1024,
            'normalize': True
        })
    )

    result = json.loads(response['body'].read())
    return result['embedding']
```

Usage

```python
embedding = get_embedding('AWS Lambda is a serverless compute service.')
print(f'Embedding dimension: {len(embedding)}')
```
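Embeddings are usually compared with cosine similarity. Since Titan can return normalized vectors (`'normalize': True` above), a plain dot product would suffice, but a general cosine helper is a few lines of stdlib Python (the helper is illustrative, not part of boto3):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# e.g. with two Bedrock embeddings:
#   sim = cosine_similarity(get_embedding('cat'), get_embedding('kitten'))
```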

Conversation with History

```python
import boto3
import json

bedrock = boto3.client('bedrock-runtime')

class Conversation:
    def __init__(self, system_prompt=None):
        self.messages = []
        self.system = system_prompt

    def chat(self, user_message):
        self.messages.append({
            'role': 'user',
            'content': user_message
        })

        body = {
            'anthropic_version': 'bedrock-2023-05-31',
            'max_tokens': 1024,
            'messages': self.messages
        }

        if self.system:
            body['system'] = self.system

        response = bedrock.invoke_model(
            modelId='anthropic.claude-3-sonnet-20240229-v1:0',
            contentType='application/json',
            accept='application/json',
            body=json.dumps(body)
        )

        result = json.loads(response['body'].read())
        assistant_message = result['content'][0]['text']

        self.messages.append({
            'role': 'assistant',
            'content': assistant_message
        })

        return assistant_message
```

Usage

```python
conv = Conversation(system_prompt='You are an AWS solutions architect.')
print(conv.chat('What database should I use for a chat application?'))
print(conv.chat('What about for time-series data?'))
```
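The `Conversation` class above accumulates messages without bound, so long sessions will eventually exceed the model's context window. One common mitigation is to keep only the most recent turns; a minimal sketch (the helper and the cutoff of 10 messages are illustrative):

```python
def trim_history(messages, max_messages=10):
    """Keep the most recent messages, dropping leading non-'user' turns
    so the alternating user/assistant structure stays valid."""
    trimmed = messages[-max_messages:]
    while trimmed and trimmed[0]['role'] != 'user':
        trimmed = trimmed[1:]
    return trimmed

# Inside Conversation.chat(), before building the request body:
#   self.messages = trim_history(self.messages)
```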

List Available Models

```bash
# List all foundation models
aws bedrock list-foundation-models \
  --query 'modelSummaries[*].[modelId,modelName,providerName]' \
  --output table
```

```bash
# Filter by provider
aws bedrock list-foundation-models \
  --by-provider anthropic \
  --query 'modelSummaries[*].modelId'
```

```bash
# Get model details
aws bedrock get-foundation-model \
  --model-identifier anthropic.claude-3-sonnet-20240229-v1:0
```

Request Model Access

```bash
# List model access status
aws bedrock list-foundation-model-agreement-offers \
  --model-id anthropic.claude-3-sonnet-20240229-v1:0
```

CLI Reference

Bedrock (Control Plane)

| Command | Description |
|---------|-------------|
| `aws bedrock list-foundation-models` | List available models |
| `aws bedrock get-foundation-model` | Get model details |
| `aws bedrock list-custom-models` | List fine-tuned models |
| `aws bedrock create-model-customization-job` | Start fine-tuning |
| `aws bedrock list-provisioned-model-throughputs` | List provisioned capacity |

Bedrock Runtime (Data Plane)

| Command | Description |
|---------|-------------|
| `aws bedrock-runtime invoke-model` | Invoke model synchronously |
| `aws bedrock-runtime invoke-model-with-response-stream` | Invoke with streaming |
| `aws bedrock-runtime converse` | Multi-turn conversation API |
| `aws bedrock-runtime converse-stream` | Streaming conversation |

Bedrock Agent Runtime

| Command | Description |
|---------|-------------|
| `aws bedrock-agent-runtime invoke-agent` | Invoke a Bedrock agent |
| `aws bedrock-agent-runtime retrieve` | Query knowledge base |
| `aws bedrock-agent-runtime retrieve-and-generate` | RAG query |

Best Practices

Cost Optimization

  • Use appropriate models: Smaller models for simple tasks
  • Set max_tokens: Limit output length when possible
  • Cache responses: For repeated identical queries
  • Batch when possible: Use batch inference for bulk processing
  • Monitor usage: Set up CloudWatch alarms for cost
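The "cache responses" point can be as simple as an in-memory dictionary keyed by prompt, sketched here as a generic wrapper (illustrative only; production systems typically use an external cache with a TTL):

```python
def cached(invoke):
    """Memoize a prompt -> completion function in memory."""
    cache = {}
    def wrapper(prompt):
        if prompt not in cache:
            cache[prompt] = invoke(prompt)
        return cache[prompt]
    return wrapper

# Usage with the invoke_claude() helper defined earlier:
#   invoke_cached = cached(invoke_claude)
#   invoke_cached('What is Amazon S3?')  # calls Bedrock
#   invoke_cached('What is Amazon S3?')  # served from the cache
```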

Performance

  • Use streaming: For better user experience with long outputs
  • Connection pooling: Reuse boto3 clients
  • Regional deployment: Use closest region to reduce latency
  • Provisioned throughput: For consistent high-volume workloads

Security

  • Least privilege IAM: Only grant needed model access
  • VPC endpoints: Keep traffic private
  • Guardrails: Implement content filtering
  • Audit with CloudTrail: Track model invocations

IAM Permissions

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ],
      "Resource": [
        "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0",
        "arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v2:0"
      ]
    }
  ]
}
```

Troubleshooting

AccessDeniedException

Causes:
  • Model access not enabled in the console
  • IAM policy missing `bedrock:InvokeModel`
  • Wrong model ID or region

Debug:

```bash
# Check model access status
aws bedrock list-foundation-models \
  --query "modelSummaries[?modelId=='anthropic.claude-3-sonnet-20240229-v1:0']"
```

```bash
# Test IAM permissions
aws iam simulate-principal-policy \
  --policy-source-arn arn:aws:iam::123456789012:role/my-role \
  --action-names bedrock:InvokeModel \
  --resource-arns "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0"
```

ModelNotReadyException

Cause: Model is still being provisioned or temporarily unavailable.
Solution: Implement retry with exponential backoff:
```python
import json
import time
from botocore.exceptions import ClientError

def invoke_with_retry(bedrock, body, max_retries=3):
    for attempt in range(max_retries):
        try:
            return bedrock.invoke_model(
                modelId='anthropic.claude-3-sonnet-20240229-v1:0',
                body=json.dumps(body)
            )
        except ClientError as e:
            if e.response['Error']['Code'] == 'ModelNotReadyException':
                time.sleep(2 ** attempt)  # exponential backoff: 1s, 2s, 4s
            else:
                raise
    raise Exception('Max retries exceeded')
```

ThrottlingException

Causes:
  • Exceeded on-demand quota
  • Too many concurrent requests
Solutions:
  • Request quota increase
  • Implement exponential backoff
  • Consider provisioned throughput
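Backoff for throttling looks much like the `ModelNotReadyException` retry above; adding random jitter keeps many clients from retrying in lockstep. A sketch under stated assumptions (`call` stands in for any Bedrock invocation; in practice you would catch `ClientError` and check for the `ThrottlingException` code rather than a bare `Exception`):

```python
import random
import time

def with_backoff(call, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Retry `call` with exponential backoff plus full jitter."""
    for attempt in range(max_retries):
        try:
            return call()
        except Exception:
            if attempt == max_retries - 1:
                raise
            # full jitter: sleep a random time up to base_delay * 2^attempt
            sleep(random.uniform(0, base_delay * 2 ** attempt))
```

The injectable `sleep` parameter keeps the helper testable without real delays.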

ValidationException

Common issues:
  • Invalid model ID
  • Malformed request body
  • max_tokens exceeds model limit
Debug:

```bash
# Check model-specific requirements
aws bedrock get-foundation-model \
  --model-identifier anthropic.claude-3-sonnet-20240229-v1:0 \
  --query 'modelDetails.inferenceTypesSupported'
```
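A cheap pre-flight check on the request body catches these common causes locally, before the API call. A minimal sketch (the helper and the 4096 default are illustrative; check each model's actual output limit):

```python
def validate_claude_body(body, max_tokens_limit=4096):
    """Raise ValueError for the common malformed-request cases."""
    if 'anthropic_version' not in body:
        raise ValueError("missing 'anthropic_version'")
    if not body.get('messages'):
        raise ValueError("'messages' must be a non-empty list")
    if body.get('max_tokens', 0) > max_tokens_limit:
        raise ValueError(f"max_tokens exceeds limit of {max_tokens_limit}")
    if body['messages'][0].get('role') != 'user':
        raise ValueError("conversation must start with a 'user' message")
```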
