bedrock-knowledge-bases


Amazon Bedrock Knowledge Bases


Amazon Bedrock Knowledge Bases is a fully managed RAG (Retrieval-Augmented Generation) solution that handles data ingestion, embedding generation, vector storage, retrieval with reranking, source attribution, and session context management.

Overview


What It Does


Amazon Bedrock Knowledge Bases provides:
  • Data Ingestion: Automatically process documents from S3, web, Confluence, SharePoint, Salesforce
  • Embedding Generation: Convert text to vectors using foundation models
  • Vector Storage: Store embeddings in multiple vector database options
  • Retrieval: Semantic and hybrid search with metadata filtering
  • Generation: RAG workflows with source attribution
  • Session Management: Multi-turn conversations with context
  • Chunking Strategies: Fixed, semantic, hierarchical, and custom chunking

When to Use This Skill


Use this skill when you need to:
  • Build RAG applications for document Q&A
  • Implement semantic search over enterprise knowledge
  • Create chatbots with knowledge bases
  • Integrate retrieval with Bedrock Agents
  • Configure optimal chunking strategies
  • Query documents with source attribution
  • Manage multi-turn conversations with context
  • Optimize RAG performance and cost

Key Capabilities


  1. Multiple Vector Store Options: OpenSearch, S3 Vectors, Neptune, Pinecone, MongoDB, Redis
  2. Flexible Data Sources: S3, web crawlers, Confluence, SharePoint, Salesforce
  3. Advanced Chunking: Fixed-size, semantic, hierarchical, custom Lambda
  4. Hybrid Search: Combine semantic (vector) and keyword search
  5. Session Management: Built-in conversation context tracking
  6. GraphRAG: Relationship-aware retrieval with Neptune Analytics
  7. Cost Optimization: S3 Vectors for up to 90% storage savings


Quick Start


Basic RAG Workflow


python
import boto3
import json

Initialize clients


python
bedrock_agent = boto3.client('bedrock-agent', region_name='us-east-1')
bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name='us-east-1')

1. Create Knowledge Base


python
kb_response = bedrock_agent.create_knowledge_base(
    name='enterprise-docs-kb',
    description='Company documentation knowledge base',
    roleArn='arn:aws:iam::123456789012:role/BedrockKBRole',
    knowledgeBaseConfiguration={
        'type': 'VECTOR',
        'vectorKnowledgeBaseConfiguration': {
            'embeddingModelArn': 'arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v2:0'
        }
    },
    storageConfiguration={
        'type': 'OPENSEARCH_SERVERLESS',
        'opensearchServerlessConfiguration': {
            'collectionArn': 'arn:aws:aoss:us-east-1:123456789012:collection/kb-collection',
            'vectorIndexName': 'bedrock-knowledge-base-index',
            'fieldMapping': {
                'vectorField': 'bedrock-knowledge-base-default-vector',
                'textField': 'AMAZON_BEDROCK_TEXT_CHUNK',
                'metadataField': 'AMAZON_BEDROCK_METADATA'
            }
        }
    }
)
knowledge_base_id = kb_response['knowledgeBase']['knowledgeBaseId']
print(f"Knowledge Base ID: {knowledge_base_id}")

2. Add S3 Data Source


python
ds_response = bedrock_agent.create_data_source(
    knowledgeBaseId=knowledge_base_id,
    name='s3-documents',
    description='Company documents from S3',
    dataSourceConfiguration={
        'type': 'S3',
        's3Configuration': {
            'bucketArn': 'arn:aws:s3:::my-docs-bucket',
            'inclusionPrefixes': ['documents/']
        }
    },
    vectorIngestionConfiguration={
        'chunkingConfiguration': {
            'chunkingStrategy': 'FIXED_SIZE',
            'fixedSizeChunkingConfiguration': {
                'maxTokens': 512,
                'overlapPercentage': 20
            }
        }
    }
)
data_source_id = ds_response['dataSource']['dataSourceId']

3. Start Ingestion


python
ingestion_response = bedrock_agent.start_ingestion_job(
    knowledgeBaseId=knowledge_base_id,
    dataSourceId=data_source_id,
    description='Initial document ingestion'
)
print(f"Ingestion Job ID: {ingestion_response['ingestionJob']['ingestionJobId']}")
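Ingestion runs asynchronously, so the job should be polled until it reaches a terminal state before the knowledge base is queried. A minimal polling sketch (the helper name and poll interval are our own; `get_ingestion_job` is the corresponding boto3 call):

```python
import time

def wait_for_ingestion(client, kb_id, ds_id, job_id, poll_seconds=30):
    """Poll an ingestion job until it reaches a terminal state."""
    while True:
        job = client.get_ingestion_job(
            knowledgeBaseId=kb_id,
            dataSourceId=ds_id,
            ingestionJobId=job_id
        )['ingestionJob']
        if job['status'] in ('COMPLETE', 'FAILED', 'STOPPED'):
            return job
        time.sleep(poll_seconds)
```

For example, `wait_for_ingestion(bedrock_agent, knowledge_base_id, data_source_id, ingestion_response['ingestionJob']['ingestionJobId'])`.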

4. Query with Retrieve and Generate


python
response = bedrock_agent_runtime.retrieve_and_generate(
    input={
        'text': 'What is our vacation policy?'
    },
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': knowledge_base_id,
            'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0',
            'retrievalConfiguration': {
                'vectorSearchConfiguration': {
                    'numberOfResults': 5,
                    'overrideSearchType': 'HYBRID'
                }
            }
        }
    }
)
print(f"Answer: {response['output']['text']}")
print("\nSources:")
for citation in response['citations']:
    for reference in citation['retrievedReferences']:
        print(f"  - {reference['location']['s3Location']['uri']}")
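Downstream code often needs the citations as data rather than printed text. A small helper that flattens them into (snippet, source URI) pairs, assuming the S3-backed response shape shown above:

```python
def extract_sources(response):
    """Flatten retrieve_and_generate citations into (snippet, source URI) pairs."""
    pairs = []
    for citation in response.get('citations', []):
        for ref in citation.get('retrievedReferences', []):
            snippet = ref.get('content', {}).get('text', '')
            uri = ref.get('location', {}).get('s3Location', {}).get('uri', '')
            pairs.append((snippet, uri))
    return pairs
```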

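The vectorSearchConfiguration also accepts a metadata filter to narrow retrieval. A sketch of the filter shape (the attribute keys department and year are illustrative, not part of the example data):

```python
# Hypothetical filter: only HR documents whose 'year' metadata is >= 2024.
retrieval_config_with_filter = {
    'vectorSearchConfiguration': {
        'numberOfResults': 5,
        'overrideSearchType': 'HYBRID',
        'filter': {
            'andAll': [
                {'equals': {'key': 'department', 'value': 'HR'}},
                {'greaterThanOrEquals': {'key': 'year', 'value': 2024}}
            ]
        }
    }
}
```

Pass this as retrievalConfiguration in retrieve_and_generate or retrieve; operators such as equals, notEquals, greaterThanOrEquals, andAll, and orAll can be combined.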

---

Vector Store Options


1. Amazon OpenSearch Serverless


Best for: Production RAG applications with auto-scaling requirements
Benefits:
  • Fully managed, serverless operation
  • Auto-scaling compute and storage
  • High availability with multi-AZ deployment
  • Fast query performance
Configuration:
python
storageConfiguration={
    'type': 'OPENSEARCH_SERVERLESS',
    'opensearchServerlessConfiguration': {
        'collectionArn': 'arn:aws:aoss:us-east-1:123456789012:collection/kb-collection',
        'vectorIndexName': 'bedrock-knowledge-base-index',
        'fieldMapping': {
            'vectorField': 'bedrock-knowledge-base-default-vector',
            'textField': 'AMAZON_BEDROCK_TEXT_CHUNK',
            'metadataField': 'AMAZON_BEDROCK_METADATA'
        }
    }
}

2. Amazon S3 Vectors (Preview)


Best for: Cost-optimized, large-scale RAG applications
Benefits:
  • Up to 90% cost reduction for vector storage
  • Built-in vector support in S3
  • Subsecond query performance
  • Massive scale and durability
Ideal Use Cases:
  • Large document collections (millions of chunks)
  • Cost-sensitive applications
  • Archival knowledge bases
  • Low-to-medium QPS workloads
Configuration:
python
storageConfiguration={
    'type': 'S3_VECTORS',
    's3VectorsConfiguration': {
        'bucketArn': 'arn:aws:s3:::my-vector-bucket',
        'prefix': 'vectors/'
    }
}
Limitations:
  • Still in preview (no CloudFormation/CDK support yet)
  • Not suitable for high QPS, millisecond-latency requirements
  • Best for cost optimization over ultra-low latency

3. Amazon Neptune Analytics (GraphRAG)


Best for: Interconnected knowledge domains requiring relationship-aware retrieval
Benefits:
  • Automatic graph creation linking related content
  • Improved retrieval accuracy through relationships
  • Comprehensive responses leveraging knowledge graph
  • Explainable results with relationship context
Use Cases:
  • Legal document analysis with case precedents
  • Scientific research with paper citations
  • Product catalogs with dependencies
  • Organizational knowledge with team relationships
Configuration:
python
storageConfiguration={
    'type': 'NEPTUNE_ANALYTICS',
    'neptuneAnalyticsConfiguration': {
        'graphArn': 'arn:aws:neptune-graph:us-east-1:123456789012:graph/g-12345678',
        'vectorSearchConfiguration': {
            'vectorField': 'embedding'
        }
    }
}

4. Amazon OpenSearch Service Managed Cluster


Best for: Existing OpenSearch infrastructure, advanced customization
Configuration:
python
storageConfiguration={
    'type': 'OPENSEARCH_SERVICE',
    'opensearchServiceConfiguration': {
        'clusterArn': 'arn:aws:es:us-east-1:123456789012:domain/my-domain',
        'vectorIndexName': 'bedrock-kb-index',
        'fieldMapping': {
            'vectorField': 'embedding',
            'textField': 'text',
            'metadataField': 'metadata'
        }
    }
}

5. Third-Party Vector Databases


Pinecone:
python
storageConfiguration={
    'type': 'PINECONE',
    'pineconeConfiguration': {
        'connectionString': 'https://my-index-abc123.svc.us-west1-gcp.pinecone.io',
        'credentialsSecretArn': 'arn:aws:secretsmanager:us-east-1:123456789012:secret:pinecone-api-key',
        'namespace': 'bedrock-kb',
        'fieldMapping': {
            'textField': 'text',
            'metadataField': 'metadata'
        }
    }
}
MongoDB Atlas:
python
storageConfiguration={
    'type': 'MONGODB_ATLAS',
    'mongoDbAtlasConfiguration': {
        'endpoint': 'https://cluster0.mongodb.net',
        'credentialsSecretArn': 'arn:aws:secretsmanager:us-east-1:123456789012:secret:mongodb-creds',
        'databaseName': 'bedrock_kb',
        'collectionName': 'vectors',
        'vectorIndexName': 'vector_index',
        'fieldMapping': {
            'vectorField': 'embedding',
            'textField': 'text',
            'metadataField': 'metadata'
        }
    }
}
Redis Enterprise Cloud:
python
storageConfiguration={
    'type': 'REDIS_ENTERPRISE_CLOUD',
    'redisEnterpriseCloudConfiguration': {
        'endpoint': 'redis-12345.c1.us-east-1-2.ec2.cloud.redislabs.com:12345',
        'credentialsSecretArn': 'arn:aws:secretsmanager:us-east-1:123456789012:secret:redis-creds',
        'vectorIndexName': 'bedrock-kb-index',
        'fieldMapping': {
            'vectorField': 'embedding',
            'textField': 'text',
            'metadataField': 'metadata'
        }
    }
}


Data Source Configuration


1. Amazon S3


Supported File Types: PDF, TXT, MD, HTML, DOC, DOCX, CSV, XLS, XLSX
python
bedrock_agent.create_data_source(
    knowledgeBaseId=knowledge_base_id,
    name='s3-technical-docs',
    description='Technical documentation from S3',
    dataSourceConfiguration={
        'type': 'S3',
        's3Configuration': {
            'bucketArn': 'arn:aws:s3:::my-docs-bucket',
            'inclusionPrefixes': ['docs/technical/', 'docs/manuals/'],
            'exclusionPrefixes': ['docs/archive/']
        }
    }
)

2. Web Crawler


Automatic website scraping and indexing:
python
bedrock_agent.create_data_source(
    knowledgeBaseId=knowledge_base_id,
    name='company-website',
    description='Public company website content',
    dataSourceConfiguration={
        'type': 'WEB',
        'webConfiguration': {
            'sourceConfiguration': {
                'urlConfiguration': {
                    'seedUrls': [
                        {'url': 'https://www.example.com/docs'},
                        {'url': 'https://www.example.com/blog'}
                    ]
                }
            },
            'crawlerConfiguration': {
                'crawlerLimits': {
                    'rateLimit': 300  # Pages per minute
                }
            }
        }
    }
)

3. Confluence


python
bedrock_agent.create_data_source(
    knowledgeBaseId=knowledge_base_id,
    name='confluence-wiki',
    description='Company Confluence knowledge base',
    dataSourceConfiguration={
        'type': 'CONFLUENCE',
        'confluenceConfiguration': {
            'sourceConfiguration': {
                'hostUrl': 'https://company.atlassian.net/wiki',
                'hostType': 'SAAS',
                'authType': 'BASIC',
                'credentialsSecretArn': 'arn:aws:secretsmanager:us-east-1:123456789012:secret:confluence-creds'
            },
            'crawlerConfiguration': {
                'filterConfiguration': {
                    'type': 'PATTERN',
                    'patternObjectFilter': {
                        'filters': [
                            {
                                'objectType': 'Space',
                                'inclusionFilters': ['Engineering', 'Product'],
                                'exclusionFilters': ['Archive']
                            }
                        ]
                    }
                }
            }
        }
    }
)

4. SharePoint


python
bedrock_agent.create_data_source(
    knowledgeBaseId=knowledge_base_id,
    name='sharepoint-docs',
    description='SharePoint document library',
    dataSourceConfiguration={
        'type': 'SHAREPOINT',
        'sharePointConfiguration': {
            'sourceConfiguration': {
                'siteUrls': [
                    'https://company.sharepoint.com/sites/Engineering',
                    'https://company.sharepoint.com/sites/Product'
                ],
                'tenantId': 'tenant-id',
                'domain': 'company',
                'authType': 'OAUTH2_CLIENT_CREDENTIALS',
                'credentialsSecretArn': 'arn:aws:secretsmanager:us-east-1:123456789012:secret:sharepoint-creds'
            }
        }
    }
)

5. Salesforce


python
bedrock_agent.create_data_source(
    knowledgeBaseId=knowledge_base_id,
    name='salesforce-knowledge',
    description='Salesforce knowledge articles',
    dataSourceConfiguration={
        'type': 'SALESFORCE',
        'salesforceConfiguration': {
            'sourceConfiguration': {
                'hostUrl': 'https://company.my.salesforce.com',
                'authType': 'OAUTH2_CLIENT_CREDENTIALS',
                'credentialsSecretArn': 'arn:aws:secretsmanager:us-east-1:123456789012:secret:salesforce-creds'
            },
            'crawlerConfiguration': {
                'filterConfiguration': {
                    'type': 'PATTERN',
                    'patternObjectFilter': {
                        'filters': [
                            {
                                'objectType': 'Knowledge',
                                'inclusionFilters': ['Product_Documentation', 'Support_Articles']
                            }
                        ]
                    }
                }
            }
        }
    }
)


Chunking Strategies


1. Fixed-Size Chunking


Best for: Simple documents with uniform structure
How it works: Splits text into chunks of fixed token size with overlap
Parameters:
  • maxTokens: 200-8192 tokens (typically 512-1024)
  • overlapPercentage: 10-50% (typically 20%)
Configuration:
python
vectorIngestionConfiguration={
    'chunkingConfiguration': {
        'chunkingStrategy': 'FIXED_SIZE',
        'fixedSizeChunkingConfiguration': {
            'maxTokens': 512,
            'overlapPercentage': 20
        }
    }
}
Use Cases:
  • Blog posts and articles
  • Technical documentation with consistent formatting
  • FAQs and Q&A content
  • Simple text files
Pros:
  • Fast and predictable
  • No additional costs
  • Easy to tune
Cons:
  • May split semantic units awkwardly
  • Doesn't respect document structure
  • Can break context mid-sentence
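As a mental model of how maxTokens and overlapPercentage interact (whitespace-split words stand in for the service's tokenizer; this is an illustration, not the actual algorithm):

```python
def fixed_size_chunks(text, max_tokens=512, overlap_percentage=20):
    """Split text into word-token chunks of at most max_tokens,
    where consecutive chunks share overlap_percentage of their tokens.
    Whitespace tokens approximate the real tokenizer; sketch only."""
    tokens = text.split()
    overlap = int(max_tokens * overlap_percentage / 100)
    step = max(1, max_tokens - overlap)  # how far each window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(' '.join(tokens[start:start + max_tokens]))
        if start + max_tokens >= len(tokens):
            break
    return chunks
```

With maxTokens=100 and 20% overlap, each chunk shares its last 20 tokens with the start of the next, which is what keeps context from being lost at chunk boundaries.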

2. Semantic Chunking


Best for: Documents without clear boundaries (legal, technical, academic)
How it works: Uses sentence similarity to group related content
Parameters:
  • maxTokens: 20-8192 tokens (typically 300-500)
  • bufferSize: Number of neighboring sentences (default: 1)
  • breakpointPercentileThreshold: Similarity threshold (recommended: 95%)
Configuration:
python
vectorIngestionConfiguration={
    'chunkingConfiguration': {
        'chunkingStrategy': 'SEMANTIC',
        'semanticChunkingConfiguration': {
            'maxTokens': 300,
            'bufferSize': 1,
            'breakpointPercentileThreshold': 95
        }
    }
}
Use Cases:
  • Legal documents and contracts
  • Academic papers
  • Technical specifications
  • Medical records
  • Research reports
Pros:
  • Preserves semantic meaning
  • Better context preservation
  • Improved retrieval accuracy
Cons:
  • Additional cost (foundation model usage)
  • Slower ingestion
  • Less predictable chunk sizes
Cost Consideration: Semantic chunking uses foundation models for similarity analysis, incurring additional costs beyond storage and retrieval.
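The breakpoint-percentile idea can be illustrated with a toy sketch: compute similarity between each pair of adjacent sentences and start a new chunk wherever the similarity falls into the lowest (100 − threshold) percentile. Here `similarity(a, b)` stands in for the embedding-based similarity the service actually computes:

```python
def semantic_chunks(sentences, similarity, breakpoint_percentile=95):
    """Group consecutive sentences; start a new chunk at the sharpest
    similarity drops (toy model of breakpointPercentileThreshold)."""
    if not sentences:
        return []
    if len(sentences) == 1:
        return [sentences[0]]
    # Similarity between each adjacent sentence pair.
    sims = [similarity(a, b) for a, b in zip(sentences, sentences[1:])]
    # Cut value: similarities at or below it are chunk boundaries.
    cut_index = max(0, int(len(sims) * (100 - breakpoint_percentile) / 100))
    cut = sorted(sims)[cut_index]
    chunks, current = [], [sentences[0]]
    for sent, sim in zip(sentences[1:], sims):
        if sim <= cut:
            chunks.append(' '.join(current))
            current = [sent]
        else:
            current.append(sent)
    chunks.append(' '.join(current))
    return chunks
```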

3. Hierarchical Chunking


Best for: Complex documents with nested structure
How it works: Creates parent and child chunks; retrieves child, returns parent for context
Parameters:
  • levelConfigurations: Array of chunk sizes (parent → child)
  • overlapTokens: Overlap between chunks
Configuration:
python
vectorIngestionConfiguration={
    'chunkingConfiguration': {
        'chunkingStrategy': 'HIERARCHICAL',
        'hierarchicalChunkingConfiguration': {
            'levelConfigurations': [
                {
                    'maxTokens': 1500  # Parent chunk (comprehensive context)
                },
                {
                    'maxTokens': 300   # Child chunk (focused retrieval)
                }
            ],
            'overlapTokens': 60
        }
    }
}
Use Cases:
  • Technical manuals with sections and subsections
  • Academic papers with abstract, sections, and subsections
  • Legal documents with articles and clauses
  • Product documentation with categories and details
How Retrieval Works:
  1. Query matches against child chunks (fast, focused)
  2. Returns parent chunks (comprehensive context)
  3. Best of both: precision retrieval + complete context
Pros:
  • Optimal balance of precision and context
  • Excellent for nested documents
  • Better accuracy for complex queries
Cons:
  • More complex configuration
  • Larger storage footprint
  • Requires understanding of document structure
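The child-match, parent-return behavior can be sketched in a few lines (word tokens stand in for real tokens, overlap is omitted, and substring matching stands in for vector search):

```python
def build_hierarchy(tokens, parent_tokens=1500, child_tokens=300):
    """Split into parent windows, then split each parent into child
    windows; each child records its parent's index."""
    parents, children = [], []
    for p_start in range(0, len(tokens), parent_tokens):
        parent = tokens[p_start:p_start + parent_tokens]
        p_idx = len(parents)
        parents.append(' '.join(parent))
        for c_start in range(0, len(parent), child_tokens):
            children.append({
                'text': ' '.join(parent[c_start:c_start + child_tokens]),
                'parent': p_idx,
            })
    return parents, children

def retrieve_parent(query_match, parents, children):
    """Match against small child chunks, but return the large parent
    chunk so the generator sees comprehensive context."""
    best = next(c for c in children if query_match in c['text'])
    return parents[best['parent']]
```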

4. Custom Chunking (Lambda)


Best for: Specialized domain logic, custom parsing requirements
How it works: Invoke Lambda function for custom chunking logic
Configuration:
python
vectorIngestionConfiguration={
    'chunkingConfiguration': {
        'chunkingStrategy': 'NONE'  # Custom via Lambda
    },
    'customTransformationConfiguration': {
        'intermediateStorage': {
            's3Location': {
                'uri': 's3://my-kb-bucket/intermediate/'
            }
        },
        'transformations': [
            {
                'stepToApply': 'POST_CHUNKING',
                'transformationFunction': {
                    'transformationLambdaConfiguration': {
                        'lambdaArn': 'arn:aws:lambda:us-east-1:123456789012:function:custom-chunker'
                    }
                }
            }
        ]
    }
}
Example Lambda Handler:

python
# Lambda function for custom chunking
def lambda_handler(event, context):
    """
    Custom chunking logic for specialized documents.

    Input: event contains document content and metadata.
    Output: array of chunks with text and metadata.
    """
    # Extract document content
    document = event['document']
    content = document['content']
    metadata = document.get('metadata', {})

    # Custom chunking logic (example: split by custom delimiter)
    chunks = []
    sections = content.split('---SECTION---')

    for idx, section in enumerate(sections):
        if section.strip():
            chunks.append({
                'text': section.strip(),
                'metadata': {
                    **metadata,
                    'chunk_id': f'section_{idx}',
                    'chunk_type': 'custom_section'
                }
            })

    return {
        'chunks': chunks
    }

**Use Cases**:
- Medical records with structured sections (SOAP notes)
- Financial documents with tables and calculations
- Code documentation with code blocks and explanations
- Domain-specific formats (HL7, FHIR, etc.)

**Pros**:
- Complete control over chunking logic
- Can handle any document format
- Integrate domain expertise

**Cons**:
- Requires Lambda development and maintenance
- Additional operational complexity
- Harder to debug and iterate
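Before deploying, the handler can be exercised locally with a stubbed event. A minimal sketch, assuming the simplified event shape used in the example above (the real ingestion pipeline's event payload may carry additional fields):

```python
# Minimal local harness for the custom chunking handler (simplified event shape)
def lambda_handler(event, context):
    document = event['document']
    metadata = document.get('metadata', {})
    chunks = []
    for idx, section in enumerate(document['content'].split('---SECTION---')):
        if section.strip():
            chunks.append({
                'text': section.strip(),
                'metadata': {**metadata,
                             'chunk_id': f'section_{idx}',
                             'chunk_type': 'custom_section'}
            })
    return {'chunks': chunks}


# Stubbed event mimicking what the ingestion pipeline would pass
event = {'document': {'content': 'Intro---SECTION---Body---SECTION---',
                      'metadata': {'source': 'test.txt'}}}
result = lambda_handler(event, None)
print(len(result['chunks']))  # 2
```

Running the handler against a handful of representative documents like this catches delimiter and metadata bugs before an ingestion job does.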
```python
# Lambda function for custom chunking
import json


def lambda_handler(event, context):
    """
    Custom chunking logic for specialized documents.

    Input: event contains document content and metadata
    Output: array of chunks with text and metadata
    """
    # Extract document content
    document = event['document']
    content = document['content']
    metadata = document.get('metadata', {})

    # Custom chunking logic (example: split by custom delimiter)
    chunks = []
    sections = content.split('---SECTION---')

    for idx, section in enumerate(sections):
        if section.strip():
            chunks.append({
                'text': section.strip(),
                'metadata': {
                    **metadata,
                    'chunk_id': f'section_{idx}',
                    'chunk_type': 'custom_section'
                }
            })

    return {
        'chunks': chunks
    }
```

**适用场景**:
- 带结构化章节的医疗记录(SOAP病历)
- 带表格与计算的财务文档
- 带代码块与解释的代码文档
- 领域特定格式(HL7、FHIR等)

**优点**:
- 完全控制分块逻辑
- 可处理任意文档格式
- 集成领域专业知识

**缺点**:
- 需要Lambda开发与维护
- 额外的运维复杂度
- 调试与迭代难度更高

Chunking Strategy Selection Guide

分块策略选择指南

| Document Type | Recommended Strategy | Rationale |
| --- | --- | --- |
| Blog posts, articles | Fixed-size | Simple, uniform structure |
| Legal documents | Semantic | Preserve legal reasoning flow |
| Technical manuals | Hierarchical | Nested sections and subsections |
| Academic papers | Hierarchical | Abstract, sections, subsections |
| FAQs | Fixed-size | Independent Q&A pairs |
| Medical records | Custom Lambda | Structured sections (SOAP, HL7) |
| Code documentation | Custom Lambda | Code blocks + explanations |
| Product catalogs | Fixed-size | Uniform product descriptions |
| Research reports | Semantic | Preserve research narrative |

| 文档类型 | 推荐策略 | 理由 |
| --- | --- | --- |
| 博客文章、普通文章 | 固定大小分块 | 结构简单统一 |
| 法律文档 | 语义分块 | 保留法律推理逻辑 |
| 技术手册 | 分层分块 | 带嵌套章节与小节 |
| 学术论文 | 分层分块 | 含摘要、章节与小节 |
| FAQ | 固定大小分块 | 独立的问答对 |
| 医疗记录 | 自定义Lambda分块 | 结构化章节(SOAP、HL7) |
| 代码文档 | 自定义Lambda分块 | 代码块+解释内容 |
| 产品目录 | 固定大小分块 | 统一的产品描述 |
| 研究报告 | 语义分块 | 保留研究叙事逻辑 |
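The selection guide above can be encoded as a simple lookup that defaults to fixed-size chunking for unknown document types. A sketch; the key names are illustrative, not an official taxonomy:

```python
# Illustrative lookup derived from the selection guide above
STRATEGY = {
    'blog_post': 'FIXED_SIZE',
    'legal_document': 'SEMANTIC',
    'technical_manual': 'HIERARCHICAL',
    'academic_paper': 'HIERARCHICAL',
    'faq': 'FIXED_SIZE',
    'medical_record': 'CUSTOM_LAMBDA',
    'code_documentation': 'CUSTOM_LAMBDA',
    'product_catalog': 'FIXED_SIZE',
    'research_report': 'SEMANTIC',
}


def recommended_strategy(doc_type):
    # Fixed-size is the safe, cheap default for unrecognized types
    return STRATEGY.get(doc_type, 'FIXED_SIZE')


print(recommended_strategy('legal_document'))  # SEMANTIC
```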

Retrieval Operations

检索操作

1. Retrieve API (Retrieval Only)

1. Retrieve API(仅检索)

Returns raw retrieved chunks without generation.
Use Cases:
  • Custom generation logic
  • Debugging retrieval quality
  • Building custom RAG pipelines
  • Integrating with non-Bedrock models
python
bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name='us-east-1')

response = bedrock_agent_runtime.retrieve(
    knowledgeBaseId='KB123456',
    retrievalQuery={
        'text': 'What are the benefits of hierarchical chunking?'
    },
    retrievalConfiguration={
        'vectorSearchConfiguration': {
            'numberOfResults': 5,
            'overrideSearchType': 'HYBRID',  # SEMANTIC, HYBRID
            'filter': {
                'andAll': [
                    {
                        'equals': {
                            'key': 'document_type',
                            'value': 'technical_guide'
                        }
                    },
                    {
                        'greaterThan': {
                            'key': 'publish_year',
                            'value': 2024
                        }
                    }
                ]
            }
        }
    }
)
返回原始检索分块,不包含生成内容。
适用场景:
  • 自定义生成逻辑
  • 调试检索质量
  • 构建自定义RAG流水线
  • 与非Bedrock模型集成
python
bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name='us-east-1')

response = bedrock_agent_runtime.retrieve(
    knowledgeBaseId='KB123456',
    retrievalQuery={
        'text': 'What are the benefits of hierarchical chunking?'
    },
    retrievalConfiguration={
        'vectorSearchConfiguration': {
            'numberOfResults': 5,
            'overrideSearchType': 'HYBRID',  # SEMANTIC, HYBRID
            'filter': {
                'andAll': [
                    {
                        'equals': {
                            'key': 'document_type',
                            'value': 'technical_guide'
                        }
                    },
                    {
                        'greaterThan': {
                            'key': 'publish_year',
                            'value': 2024
                        }
                    }
                ]
            }
        }
    }
)

Process retrieved chunks

```python
for result in response['retrievalResults']:
    print(f"Score: {result['score']}")
    print(f"Content: {result['content']['text']}")
    print(f"Location: {result['location']}")
    print(f"Metadata: {result.get('metadata', {})}")
    print("---")
```

2. Retrieve and Generate API (RAG)

2. Retrieve and Generate API(RAG)

Returns generated response with source attribution.
Use Cases:
  • Complete RAG workflows
  • Question answering
  • Document summarization
  • Chatbots with knowledge bases
python
response = bedrock_agent_runtime.retrieve_and_generate(
    input={
        'text': 'Explain semantic chunking benefits and when to use it'
    },
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': 'KB123456',
            'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0',
            'retrievalConfiguration': {
                'vectorSearchConfiguration': {
                    'numberOfResults': 5,
                    'overrideSearchType': 'HYBRID'
                }
            },
            'generationConfiguration': {
                'inferenceConfig': {
                    'textInferenceConfig': {
                        'temperature': 0.7,
                        'maxTokens': 2048,
                        'topP': 0.9
                    }
                },
                'promptTemplate': {
                    'textPromptTemplate': '''You are a helpful assistant. Answer the user's question based on the provided context.

Context: $search_results$

Question: $query$

Answer:'''
                }
            }
        }
    }
)

print(f"Generated Response: {response['output']['text']}")
print(f"\nSources:")
for citation in response['citations']:
    for reference in citation['retrievedReferences']:
        print(f"  - {reference['location']}")
        print(f"    Relevance Score: {reference.get('score', 'N/A')}")
返回带来源归因的生成响应。
适用场景:
  • 完整RAG工作流
  • 问答场景
  • 文档摘要
  • 集成知识库的聊天机器人
python
response = bedrock_agent_runtime.retrieve_and_generate(
    input={
        'text': 'Explain semantic chunking benefits and when to use it'
    },
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': 'KB123456',
            'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0',
            'retrievalConfiguration': {
                'vectorSearchConfiguration': {
                    'numberOfResults': 5,
                    'overrideSearchType': 'HYBRID'
                }
            },
            'generationConfiguration': {
                'inferenceConfig': {
                    'textInferenceConfig': {
                        'temperature': 0.7,
                        'maxTokens': 2048,
                        'topP': 0.9
                    }
                },
                'promptTemplate': {
                    'textPromptTemplate': '''You are a helpful assistant. Answer the user's question based on the provided context.

Context: $search_results$

Question: $query$

Answer:'''
                }
            }
        }
    }
)

print(f"Generated Response: {response['output']['text']}")
print(f"\nSources:")
for citation in response['citations']:
    for reference in citation['retrievedReferences']:
        print(f"  - {reference['location']}")
        print(f"    Relevance Score: {reference.get('score', 'N/A')}")
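The citation loop above can be folded into a small formatter that deduplicates source URIs. A sketch, assuming the response shape shown in the example (the `s3Location` key and sample URI here are illustrative stand-ins):

```python
# Illustrative: format a retrieve_and_generate-style response with its sources
def format_with_sources(output_text, citations):
    lines = [output_text, '', 'Sources:']
    seen = set()
    for citation in citations:
        for ref in citation.get('retrievedReferences', []):
            uri = ref.get('location', {}).get('s3Location', {}).get('uri', 'unknown')
            if uri not in seen:  # list each source document once
                seen.add(uri)
                lines.append(f'- {uri}')
    return '\n'.join(lines)


# Stubbed response mirroring the citations structure above
resp = {
    'output': {'text': 'Semantic chunking preserves meaning.'},
    'citations': [{'retrievedReferences': [
        {'location': {'s3Location': {'uri': 's3://kb/docs/chunking.md'}}}
    ]}]
}
formatted = format_with_sources(resp['output']['text'], resp['citations'])
print(formatted)
```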

3. Multi-Turn Conversations with Session Management

3. 带会话管理的多轮对话

Bedrock automatically manages conversation context across turns.
Bedrock会自动管理多轮对话的上下文。

First turn - creates session automatically

第一轮 - 自动创建会话

```python
# First turn - creates session automatically
response1 = bedrock_agent_runtime.retrieve_and_generate(
    input={'text': 'What is Amazon Bedrock Knowledge Bases?'},
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': 'KB123456',
            'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0'
        }
    }
)

session_id = response1['sessionId']
print(f"Session ID: {session_id}")
print(f"Response: {response1['output']['text']}\n")
```

Follow-up turn - reuse session for context

后续轮次 - 复用会话以保留上下文

```python
response2 = bedrock_agent_runtime.retrieve_and_generate(
    input={'text': 'What chunking strategies does it support?'},
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': 'KB123456',
            'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0'
        }
    },
    sessionId=session_id  # Continue conversation with context
)

print(f"Follow-up Response: {response2['output']['text']}")
```

Third turn

第三轮

```python
response3 = bedrock_agent_runtime.retrieve_and_generate(
    input={'text': 'Which strategy would you recommend for legal documents?'},
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': 'KB123456',
            'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0'
        }
    },
    sessionId=session_id
)

print(f"Third Response: {response3['output']['text']}")
```

4. Advanced Metadata Filtering

4. 高级元数据过滤

Filter retrieval by metadata attributes for precision.
python
response = bedrock_agent_runtime.retrieve(
    knowledgeBaseId='KB123456',
    retrievalQuery={
        'text': 'Security best practices for production deployments'
    },
    retrievalConfiguration={
        'vectorSearchConfiguration': {
            'numberOfResults': 10,
            'overrideSearchType': 'HYBRID',
            'filter': {
                'andAll': [
                    {
                        'equals': {
                            'key': 'document_type',
                            'value': 'security_guide'
                        }
                    },
                    {
                        'greaterThanOrEquals': {
                            'key': 'publish_year',
                            'value': 2024
                        }
                    },
                    {
                        'in': {
                            'key': 'category',
                            'value': ['production', 'security', 'compliance']
                        }
                    }
                ]
            }
        }
    }
)
Supported Filter Operators:
  • `equals`: Exact match
  • `notEquals`: Not equal
  • `greaterThan`, `greaterThanOrEquals`: Numeric comparison
  • `lessThan`, `lessThanOrEquals`: Numeric comparison
  • `in`: Match any value in array
  • `notIn`: Not match any value in array
  • `startsWith`: String prefix match
  • `andAll`: Combine filters with AND
  • `orAll`: Combine filters with OR
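Deeply nested filter dictionaries are easy to mistype. A few helper constructors, hypothetical and not part of any SDK, can compose the same shapes shown in the example above:

```python
# Hypothetical helpers for composing Bedrock KB retrieval filters
def eq(key, value):
    return {'equals': {'key': key, 'value': value}}


def gte(key, value):
    return {'greaterThanOrEquals': {'key': key, 'value': value}}


def in_any(key, values):
    return {'in': {'key': key, 'value': values}}


def and_all(*filters):
    return {'andAll': list(filters)}


# Reproduces the filter from the example above
retrieval_filter = and_all(
    eq('document_type', 'security_guide'),
    gte('publish_year', 2024),
    in_any('category', ['production', 'security', 'compliance']),
)
print(retrieval_filter['andAll'][0])
```

The resulting dict can be passed as the `filter` value inside `vectorSearchConfiguration`.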

通过元数据属性过滤检索结果以提升精准度。
python
response = bedrock_agent_runtime.retrieve(
    knowledgeBaseId='KB123456',
    retrievalQuery={
        'text': 'Security best practices for production deployments'
    },
    retrievalConfiguration={
        'vectorSearchConfiguration': {
            'numberOfResults': 10,
            'overrideSearchType': 'HYBRID',
            'filter': {
                'andAll': [
                    {
                        'equals': {
                            'key': 'document_type',
                            'value': 'security_guide'
                        }
                    },
                    {
                        'greaterThanOrEquals': {
                            'key': 'publish_year',
                            'value': 2024
                        }
                    },
                    {
                        'in': {
                            'key': 'category',
                            'value': ['production', 'security', 'compliance']
                        }
                    }
                ]
            }
        }
    }
)
支持的过滤操作符:
  • `equals`: 精确匹配
  • `notEquals`: 不匹配
  • `greaterThan`、`greaterThanOrEquals`: 数值比较
  • `lessThan`、`lessThanOrEquals`: 数值比较
  • `in`: 匹配数组中的任意值
  • `notIn`: 不匹配数组中的任意值
  • `startsWith`: 字符串前缀匹配
  • `andAll`: 用AND组合多个过滤条件
  • `orAll`: 用OR组合多个过滤条件

Ingestion Management

导入管理

1. Start Ingestion Job

1. 启动导入任务

python
ingestion_response = bedrock_agent.start_ingestion_job(
    knowledgeBaseId=knowledge_base_id,
    dataSourceId=data_source_id,
    description='Monthly document sync',
    clientToken='unique-idempotency-token-123'
)

job_id = ingestion_response['ingestionJob']['ingestionJobId']
print(f"Ingestion Job ID: {job_id}")
python
ingestion_response = bedrock_agent.start_ingestion_job(
    knowledgeBaseId=knowledge_base_id,
    dataSourceId=data_source_id,
    description='Monthly document sync',
    clientToken='unique-idempotency-token-123'
)

job_id = ingestion_response['ingestionJob']['ingestionJobId']
print(f"Ingestion Job ID: {job_id}")

2. Monitor Ingestion Job

2. 监控导入任务


Get job status

获取任务状态

```python
# Get job status
job_status = bedrock_agent.get_ingestion_job(
    knowledgeBaseId=knowledge_base_id,
    dataSourceId=data_source_id,
    ingestionJobId=job_id
)

print(f"Status: {job_status['ingestionJob']['status']}")
print(f"Started: {job_status['ingestionJob']['startedAt']}")
print(f"Updated: {job_status['ingestionJob']['updatedAt']}")

if 'statistics' in job_status['ingestionJob']:
    stats = job_status['ingestionJob']['statistics']
    print(f"Documents Scanned: {stats['numberOfDocumentsScanned']}")
    print(f"Documents Indexed: {stats['numberOfDocumentsIndexed']}")
    print(f"Documents Failed: {stats['numberOfDocumentsFailed']}")
```

Wait for completion

等待任务完成

```python
import time

# Poll until the ingestion job reaches a terminal state
while True:
    status = bedrock_agent.get_ingestion_job(
        knowledgeBaseId=knowledge_base_id,
        dataSourceId=data_source_id,
        ingestionJobId=job_id
    )
    current_status = status['ingestionJob']['status']

    if current_status in ['COMPLETE', 'FAILED']:
        print(f"Ingestion job {current_status}")
        break

    print(f"Status: {current_status}, waiting...")
    time.sleep(30)
```
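The polling loop above generalizes to a reusable helper. A sketch with the status lookup injected as a callable, so the logic can be tested without AWS calls (in real use the callable would wrap `get_ingestion_job`):

```python
import time


def wait_for_ingestion(get_status, poll_seconds=30, max_polls=120):
    """Poll a status callable until it reports COMPLETE or FAILED (sketch)."""
    for _ in range(max_polls):
        status = get_status()
        if status in ('COMPLETE', 'FAILED'):
            return status
        time.sleep(poll_seconds)
    raise TimeoutError('ingestion job did not finish in time')


# Stubbed demo: a sequence of statuses an ingestion job might report
statuses = iter(['STARTING', 'IN_PROGRESS', 'COMPLETE'])
result_status = wait_for_ingestion(lambda: next(statuses), poll_seconds=0)
print(result_status)  # COMPLETE
```

Bounding the number of polls avoids a hung loop if a job stalls in a non-terminal state.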

3. List Ingestion Jobs

3. 列出导入任务

python
list_response = bedrock_agent.list_ingestion_jobs(
    knowledgeBaseId=knowledge_base_id,
    dataSourceId=data_source_id,
    maxResults=50
)

for job in list_response['ingestionJobSummaries']:
    print(f"Job ID: {job['ingestionJobId']}")
    print(f"Status: {job['status']}")
    print(f"Started: {job['startedAt']}")
    print(f"Updated: {job['updatedAt']}")
    print("---")

python
list_response = bedrock_agent.list_ingestion_jobs(
    knowledgeBaseId=knowledge_base_id,
    dataSourceId=data_source_id,
    maxResults=50
)

for job in list_response['ingestionJobSummaries']:
    print(f"Job ID: {job['ingestionJobId']}")
    print(f"Status: {job['status']}")
    print(f"Started: {job['startedAt']}")
    print(f"Updated: {job['updatedAt']}")
    print("---")

Integration with Bedrock Agents

与Bedrock Agent集成

1. Agent with Knowledge Base Action

1. 集成知识库的Agent

```python
bedrock_agent = boto3.client('bedrock-agent', region_name='us-east-1')
```

Create agent with knowledge base

创建集成知识库的Agent

```python
# Create agent with knowledge base access
agent_response = bedrock_agent.create_agent(
    agentName='customer-support-agent',
    description='Customer support agent with knowledge base access',
    instruction='''You are a customer support agent. When answering questions:
1. Search the knowledge base for relevant information
2. Provide accurate answers based on retrieved context
3. Cite your sources
4. Admit when you don't know something''',
    foundationModel='anthropic.claude-3-sonnet-20240229-v1:0',
    agentResourceRoleArn='arn:aws:iam::123456789012:role/BedrockAgentRole'
)

agent_id = agent_response['agent']['agentId']
```

Associate knowledge base with agent

关联知识库与Agent

```python
# Associate the knowledge base with the agent
kb_association = bedrock_agent.associate_agent_knowledge_base(
    agentId=agent_id,
    agentVersion='DRAFT',
    knowledgeBaseId='KB123456',
    description='Company documentation knowledge base',
    knowledgeBaseState='ENABLED'
)
```

Prepare and create alias

准备并创建别名

```python
# Prepare the agent and create a production alias
bedrock_agent.prepare_agent(agentId=agent_id)

alias_response = bedrock_agent.create_agent_alias(
    agentId=agent_id,
    agentAliasName='production',
    description='Production alias'
)
agent_alias_id = alias_response['agentAlias']['agentAliasId']
```

Invoke agent (automatically queries knowledge base)

调用Agent(自动查询知识库)

```python
bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name='us-east-1')

response = bedrock_agent_runtime.invoke_agent(
    agentId=agent_id,
    agentAliasId=agent_alias_id,
    sessionId='session-123',
    inputText='What is our return policy for defective products?'
)

# Stream the agent's response
for event in response['completion']:
    if 'chunk' in event:
        chunk = event['chunk']
        print(chunk['bytes'].decode())
```

2. Agent with Multiple Knowledge Bases

2. 集成多知识库的Agent


Associate multiple knowledge bases

关联多个知识库

```python
# Associate multiple knowledge bases
bedrock_agent.associate_agent_knowledge_base(
    agentId=agent_id,
    agentVersion='DRAFT',
    knowledgeBaseId='KB-PRODUCT-DOCS',
    description='Product documentation'
)

bedrock_agent.associate_agent_knowledge_base(
    agentId=agent_id,
    agentVersion='DRAFT',
    knowledgeBaseId='KB-SUPPORT-ARTICLES',
    description='Support knowledge articles'
)

bedrock_agent.associate_agent_knowledge_base(
    agentId=agent_id,
    agentVersion='DRAFT',
    knowledgeBaseId='KB-COMPANY-POLICIES',
    description='Company policies and procedures'
)
```

Agent automatically searches all knowledge bases and combines results

Agent会自动搜索所有知识库并合并结果


---

---

Best Practices

最佳实践

1. Chunking Strategy Selection

1. 分块策略选择

Decision Framework:
  1. Simple, uniform documents → Fixed-size chunking
    • Blog posts, articles, simple FAQs
    • Fast, predictable, cost-effective
  2. Documents without clear boundaries → Semantic chunking
    • Legal documents, contracts, academic papers
    • Preserves semantic meaning, better accuracy
    • Consider additional cost
  3. Nested, hierarchical documents → Hierarchical chunking
    • Technical manuals, product docs, research papers
    • Best balance of precision and context
    • Optimal for complex structures
  4. Specialized formats → Custom Lambda chunking
    • Medical records (HL7, FHIR), code docs, custom formats
    • Complete control, domain expertise
    • Higher operational complexity
Tuning Guidelines:
  • Fixed-size: Start with 512 tokens, 20% overlap
  • Semantic: Start with 300 tokens, bufferSize=1, threshold=95%
  • Hierarchical: Parent 1500 tokens, child 300 tokens, overlap 60 tokens
  • Custom: Test extensively with domain experts
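To build intuition for the fixed-size defaults (512 tokens, 20% overlap), here is an illustrative chunker that treats words as tokens; Bedrock's actual tokenization and splitting differ:

```python
# Illustrative fixed-size chunking with overlap (words stand in for tokens)
def fixed_size_chunks(words, max_tokens=512, overlap_pct=20):
    # Each chunk advances by (max_tokens - overlap): 512 - 102 = 410 here
    step = max_tokens - int(max_tokens * overlap_pct / 100)
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + max_tokens])
        if start + max_tokens >= len(words):
            break
    return chunks


words = [f'w{i}' for i in range(1000)]
chunks = fixed_size_chunks(words)
print(len(chunks), chunks[1][0])  # 3 w410
```

The overlap means the last ~100 words of one chunk reappear at the start of the next, so sentences near chunk boundaries are still retrievable in context.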
决策框架:
  1. 简单、统一结构文档 → 固定大小分块
    • 博客文章、普通文章、简单FAQ
    • 快速、可预测、成本低
  2. 无明确边界的文档 → 语义分块
    • 法律文档、合同、学术论文
    • 保留语义含义,准确性更高
    • 需考虑额外成本
  3. 嵌套、分层结构文档 → 分层分块
    • 技术手册、产品文档、研究论文
    • 精准度与上下文的最优平衡
    • 适用于复杂结构
  4. 专业格式文档 → 自定义Lambda分块
    • 医疗记录(HL7、FHIR)、代码文档、自定义格式
    • 完全可控,集成领域知识
    • 运维复杂度更高
调优指南:
  • 固定大小分块:从512 token、20%重叠开始
  • 语义分块:从300 token、bufferSize=1、阈值95%开始
  • 分层分块:父分块1500 token、子分块300 token、重叠60 token
  • 自定义分块:与领域专家协作进行充分测试

2. Retrieval Optimization

2. 检索优化

Number of Results:
  • Start with 5-10 results
  • Increase if answers lack detail
  • Decrease if too much noise
Search Type:
  • SEMANTIC: Pure vector similarity (faster, good for conceptual queries)
  • HYBRID: Vector + keyword (better recall, recommended for production)
Use Hybrid Search when:
  • Queries contain specific terms or names
  • Need to match exact keywords
  • Domain has specialized vocabulary
Use Semantic Search when:
  • Purely conceptual queries
  • Prioritizing speed over perfect recall
  • Well-embedded domain knowledge
Metadata Filters:
  • Always use when applicable
  • Dramatically improves precision
  • Reduces retrieval latency
  • Examples: document_type, publish_date, category, author
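Conceptually, hybrid search blends vector similarity with keyword relevance. An illustrative weighted combination; Bedrock's internal fusion method is not published, so the formula and `alpha` here are assumptions for intuition only:

```python
# Illustrative score fusion; Bedrock's actual hybrid ranking is internal
def hybrid_score(vector_score, keyword_score, alpha=0.7):
    """Weighted blend of semantic (vector) and keyword relevance."""
    return alpha * vector_score + (1 - alpha) * keyword_score


# A doc that is semantically close but weak on exact keywords
print(round(hybrid_score(0.9, 0.5), 2))  # 0.78
```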
结果数量:
  • 从5-10个结果开始
  • 若答案缺乏细节则增加数量
  • 若结果噪音过多则减少数量
搜索类型:
  • SEMANTIC:纯向量相似度(更快,适用于概念类查询)
  • HYBRID:向量+关键词(召回率更高,生产环境推荐)
使用混合搜索的场景:
  • 查询包含特定术语或名称
  • 需要匹配精确关键词
  • 领域含专业词汇
使用语义搜索的场景:
  • 纯概念类查询
  • 优先考虑速度而非完美召回率
  • 领域知识已充分嵌入
元数据过滤:
  • 适用时务必使用
  • 大幅提升精准度
  • 降低检索延迟
  • 示例:document_type、publish_date、category、author

3. Cost Optimization

3. 成本优化

S3 Vectors:
  • Use for large-scale knowledge bases (millions of chunks)
  • Up to 90% cost savings vs. OpenSearch
  • Ideal for cost-sensitive applications
  • Trade-off: Slightly higher latency
Semantic Chunking:
  • Incurs foundation model costs during ingestion
  • Consider cost vs. accuracy benefit
  • May not be worth it for simple documents
  • Best for complex, high-value content
Ingestion Frequency:
  • Schedule ingestion during off-peak hours
  • Use incremental updates when possible
  • Don't re-ingest unchanged documents
Model Selection:
  • Use smaller embedding models when accuracy permits
  • Titan Embed Text v2 is cost-effective
  • Consider Cohere Embed for multilingual
Token Usage:
  • Monitor generation token usage
  • Set appropriate maxTokens limits
  • Use prompt templates to control verbosity
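Embedding cost at ingestion scales with chunk count and chunk size. A back-of-envelope estimator; the price below is a placeholder, not a real rate, so check current Bedrock pricing:

```python
# Parameterized ingestion-cost sketch (price is a placeholder, not real pricing)
def embedding_cost(num_chunks, avg_tokens_per_chunk, price_per_1k_tokens):
    total_tokens = num_chunks * avg_tokens_per_chunk
    return total_tokens / 1000 * price_per_1k_tokens


# 1M chunks of ~512 tokens at a hypothetical $0.00002 per 1K tokens
print(round(embedding_cost(1_000_000, 512, 0.00002), 2))  # 10.24
```

The same arithmetic shows why re-ingesting unchanged documents is wasteful: cost is linear in total tokens embedded.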
S3 Vectors:
  • 适用于大规模知识库(数百万分块)
  • 相比OpenSearch节省高达90%的成本
  • 适用于对成本敏感的应用
  • 权衡点:延迟略高
语义分块:
  • 导入阶段会产生基础模型费用
  • 需权衡成本与准确性收益
  • 简单文档可能不值得额外成本
  • 适用于复杂、高价值内容
导入频率:
  • 在非高峰时段调度导入任务
  • 尽可能使用增量更新
  • 不要重新导入未修改的文档
模型选择:
  • 在准确性允许的情况下使用更小的嵌入模型
  • Titan Embed Text v2性价比高
  • 多语言场景可考虑Cohere Embed
Token使用:
  • 监控生成token的使用量
  • 设置合理的maxTokens限制
  • 使用提示模板控制输出冗长程度

4. Session Management

4. 会话管理

Always Reuse Sessions:
  • Pass `sessionId` for follow-up turns
  • Bedrock handles context automatically
  • No manual conversation history needed
Session Lifecycle:
  • Sessions expire after inactivity (default: 60 minutes)
  • Create new session for unrelated conversations
  • Use unique sessionId per user/conversation
Context Limits:
  • Monitor conversation length
  • Long sessions may hit context limits
  • Consider summarization for very long conversations
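A per-user session registry keeps `sessionId` reuse consistent with the 60-minute inactivity window. A minimal in-memory sketch; production code would persist this and handle concurrency:

```python
import time
import uuid


class SessionStore:
    """Illustrative per-user session tracking with inactivity expiry."""

    def __init__(self, ttl_seconds=3600):  # mirrors the 60-minute default
        self.ttl = ttl_seconds
        self.sessions = {}  # user_id -> (session_id, last_used)

    def get(self, user_id, now=None):
        now = time.time() if now is None else now
        entry = self.sessions.get(user_id)
        if entry and now - entry[1] < self.ttl:
            # Still active: refresh the timestamp and reuse the session
            self.sessions[user_id] = (entry[0], now)
            return entry[0]
        # Expired or unknown: mint a fresh session id
        session_id = str(uuid.uuid4())
        self.sessions[user_id] = (session_id, now)
        return session_id


store = SessionStore()
s1 = store.get('alice', now=0)
s2 = store.get('alice', now=100)    # within TTL: same session
s3 = store.get('alice', now=5000)   # expired: new session
print(s1 == s2, s1 == s3)  # True False
```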
务必复用会话:
  • 后续轮次传递 `sessionId`
  • Bedrock自动处理上下文
  • 无需手动维护对话历史
会话生命周期:
  • 会话在闲置后过期(默认:60分钟)
  • 无关对话创建新会话
  • 为每个用户/对话使用唯一的sessionId
上下文限制:
  • 监控对话长度
  • 长会话可能触发上下文限制
  • 超长对话可考虑摘要处理

5. GraphRAG with Neptune

5. 基于Neptune的GraphRAG

When to Use:
  • Interconnected knowledge domains
  • Relationship-aware queries
  • Need for explainability
  • Complex knowledge graphs
Benefits:
  • Automatic graph creation
  • Improved accuracy through relationships
  • Comprehensive answers
  • Explainable results
Considerations:
  • Higher setup complexity
  • Neptune Analytics costs
  • Best for domains with rich relationships
适用场景:
  • 互联知识领域
  • 关系感知类查询
  • 需要可解释性
  • 复杂知识图谱
优势:
  • 自动创建图谱
  • 通过关系提升准确性
  • 生成全面回答
  • 结果可解释
注意事项:
  • 配置复杂度更高
  • Neptune Analytics有额外成本
  • 适用于关系丰富的领域

6. Data Source Management

6. 数据源管理

S3 Best Practices:
  • Organize with clear prefixes
  • Use inclusion/exclusion filters
  • Maintain consistent metadata
  • Version documents when updating
Web Crawler:
  • Set appropriate rate limits
  • Use robots.txt for guidance
  • Monitor for broken links
  • Schedule regular re-crawls
Confluence/SharePoint:
  • Filter by spaces/sites
  • Exclude archived content
  • Use fine-grained permissions
  • Schedule incremental syncs
Metadata Enrichment:
  • Add custom metadata to documents
  • Include: document_type, publish_date, category, author, version
  • Enables powerful filtering
  • Improves retrieval precision
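For S3 data sources, custom metadata is supplied via a sidecar JSON file stored alongside the document, named `<object-key>.metadata.json`. A sketch of building one; the attribute names follow the filter examples in this document, and the object key is illustrative:

```python
import json

# Sidecar metadata file for an S3 document: "<object-key>.metadata.json"
metadata = {
    'metadataAttributes': {
        'document_type': 'technical_guide',
        'publish_year': 2024,
        'category': 'security',
        'author': 'docs-team'
    }
}

sidecar_name = 'security-guide.pdf.metadata.json'  # assumed object key
payload = json.dumps(metadata, indent=2)
print(sidecar_name)
```

After ingestion, these attributes become filterable via the `filter` block in retrieval configuration.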
S3最佳实践:
  • 用清晰的前缀组织内容
  • 使用包含/排除过滤
  • 保持元数据一致性
  • 更新文档时使用版本控制
网页爬虫:
  • 设置合理的速率限制
  • 参考robots.txt
  • 监控失效链接
  • 定期调度重新抓取
Confluence/SharePoint:
  • 按空间/站点过滤
  • 排除归档内容
  • 使用细粒度权限
  • 调度增量同步
元数据增强:
  • 为文档添加自定义元数据
  • 包含:document_type、publish_date、category、author、version
  • 实现强大的过滤能力
  • 提升检索精准度

7. Monitoring and Debugging

7. 监控与调试

Enable CloudWatch Logs:
启用CloudWatch日志:

Monitor retrieval quality

监控检索质量

Track: query latency, retrieval scores, generation quality

跟踪:查询延迟、检索分数、生成质量

Set alarms for: high latency, low scores, high error rates

设置告警:高延迟、低分数、高错误率


**Test Retrieval Quality**:

**测试检索质量**:

```python
# Use retrieve API to debug retrieval quality
response = bedrock_agent_runtime.retrieve(
    knowledgeBaseId='KB123456',
    retrievalQuery={'text': 'test query'}
)

# Analyze retrieval scores
for result in response['retrievalResults']:
    print(f"Score: {result['score']}")
    print(f"Content preview: {result['content']['text'][:200]}")
```

**Common Issues**:

1. **Low Retrieval Scores**:
   - Check chunking strategy
   - Verify embedding model
   - Ensure documents are properly ingested
   - Consider semantic or hierarchical chunking

2. **Irrelevant Results**:
   - Add metadata filters
   - Use hybrid search
   - Refine chunking strategy
   - Increase numberOfResults

3. **Missing Information**:
   - Verify data source configuration
   - Check ingestion job status
   - Ensure documents are not excluded by filters
   - Increase numberOfResults

4. **Slow Retrieval**:
   - Use metadata filters to narrow scope
   - Optimize vector database configuration
   - Consider S3 Vectors for cost over latency
   - Reduce numberOfResults

**常见问题**:

1. **检索分数低**:
   - 检查分块策略
   - 验证嵌入模型
   - 确保文档已正确导入
   - 考虑使用语义或分层分块

2. **结果不相关**:
   - 添加元数据过滤
   - 使用混合搜索
   - 优化分块策略
   - 增加结果数量

3. **信息缺失**:
   - 验证数据源配置
   - 检查导入任务状态
   - 确保文档未被过滤排除
   - 增加结果数量

4. **检索缓慢**:
   - 使用元数据过滤缩小范围
   - 优化向量数据库配置
   - 考虑用S3 Vectors平衡成本与延迟
   - 减少结果数量

8. Security Best Practices

8. 安全最佳实践

IAM Permissions:
  • Use least privilege for Knowledge Base role
  • Separate roles for data sources, ingestion, retrieval
  • Enable VPC endpoints for private connectivity
Data Encryption:
  • All data encrypted at rest (AWS KMS)
  • Data encrypted in transit (TLS)
  • Use customer-managed KMS keys for compliance
Access Control:
  • Use IAM policies to control who can query
  • Implement fine-grained access control
  • Monitor access with CloudTrail
PII Handling:
  • Use Bedrock Guardrails for PII redaction
  • Implement data masking for sensitive fields
  • Consider custom Lambda for advanced PII handling
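As a complement to Guardrails, simple pattern-based masking can run in a custom Lambda before ingestion. An illustrative sketch; the patterns are examples, not a complete PII detector:

```python
import re

# Illustrative pre-ingestion PII masking (not a replacement for Guardrails)
PATTERNS = {
    'EMAIL': re.compile(r'[\w.+-]+@[\w-]+\.[\w.]+'),
    'SSN': re.compile(r'\b\d{3}-\d{2}-\d{4}\b'),
}


def mask_pii(text):
    # Replace each match with a labeled placeholder
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f'[{label}]', text)
    return text


print(mask_pii('Contact jane@example.com, SSN 123-45-6789.'))
# Contact [EMAIL], SSN [SSN].
```

Masking before embedding ensures sensitive values never enter the vector store, which is stronger than filtering only at query time.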

IAM权限:
  • 为知识库角色使用最小权限原则
  • 为数据源、导入、检索分别设置独立角色
  • 启用VPC端点实现私有连接
数据加密:
  • 所有数据静态加密(AWS KMS)
  • 传输中数据加密(TLS)
  • 为合规性使用客户管理的KMS密钥
访问控制:
  • 使用IAM策略控制查询权限
  • 实现细粒度访问控制
  • 用CloudTrail监控访问
PII处理:
  • 使用Bedrock Guardrails进行PII脱敏
  • 对敏感字段实现数据掩码
  • 考虑用自定义Lambda实现高级PII处理

Complete Production Example

完整生产示例

End-to-End RAG Application

端到端RAG应用

python
import boto3
import json
from typing import List, Dict, Optional

class BedrockKnowledgeBaseRAG:
    """Production RAG application with Amazon Bedrock Knowledge Bases"""

    def __init__(self, region_name: str = 'us-east-1'):
        self.bedrock_agent = boto3.client('bedrock-agent', region_name=region_name)
        self.bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name=region_name)

    def create_knowledge_base(
        self,
        name: str,
        description: str,
        role_arn: str,
        vector_store_config: Dict,
        embedding_model: str = 'amazon.titan-embed-text-v2:0'
    ) -> str:
        """Create knowledge base with vector store"""

        # Build the embedding model ARN from the client's configured region
        region = self.bedrock_agent.meta.region_name
        response = self.bedrock_agent.create_knowledge_base(
            name=name,
            description=description,
            roleArn=role_arn,
            knowledgeBaseConfiguration={
                'type': 'VECTOR',
                'vectorKnowledgeBaseConfiguration': {
                    'embeddingModelArn': f'arn:aws:bedrock:{region}::foundation-model/{embedding_model}'
                }
            },
            storageConfiguration=vector_store_config
        )

        return response['knowledgeBase']['knowledgeBaseId']

    def add_s3_data_source(
        self,
        knowledge_base_id: str,
        name: str,
        bucket_arn: str,
        inclusion_prefixes: List[str],
        chunking_strategy: str = 'FIXED_SIZE',
        chunking_config: Optional[Dict] = None
    ) -> str:
        """Add S3 data source with chunking configuration"""

        if chunking_config is None:
            chunking_config = {
                'maxTokens': 512,
                'overlapPercentage': 20
            }

        vector_ingestion_config = {
            'chunkingConfiguration': {
                'chunkingStrategy': chunking_strategy
            }
        }

        if chunking_strategy == 'FIXED_SIZE':
            vector_ingestion_config['chunkingConfiguration']['fixedSizeChunkingConfiguration'] = chunking_config
        elif chunking_strategy == 'SEMANTIC':
            vector_ingestion_config['chunkingConfiguration']['semanticChunkingConfiguration'] = chunking_config
        elif chunking_strategy == 'HIERARCHICAL':
            vector_ingestion_config['chunkingConfiguration']['hierarchicalChunkingConfiguration'] = chunking_config

        response = self.bedrock_agent.create_data_source(
            knowledgeBaseId=knowledge_base_id,
            name=name,
            description=f'S3 data source: {name}',
            dataSourceConfiguration={
                'type': 'S3',
                's3Configuration': {
                    'bucketArn': bucket_arn,
                    'inclusionPrefixes': inclusion_prefixes
                }
            },
            vectorIngestionConfiguration=vector_ingestion_config
        )

        return response['dataSource']['dataSourceId']

    def ingest_data(self, knowledge_base_id: str, data_source_id: str) -> str:
        """Start ingestion job and wait for completion"""

        import time

        # Start ingestion
        response = self.bedrock_agent.start_ingestion_job(
            knowledgeBaseId=knowledge_base_id,
            dataSourceId=data_source_id,
            description='Automated ingestion'
        )

        job_id = response['ingestionJob']['ingestionJobId']

        # Wait for completion
        while True:
            status_response = self.bedrock_agent.get_ingestion_job(
                knowledgeBaseId=knowledge_base_id,
                dataSourceId=data_source_id,
                ingestionJobId=job_id
            )

            status = status_response['ingestionJob']['status']

            if status == 'COMPLETE':
                print("Ingestion completed successfully")
                if 'statistics' in status_response['ingestionJob']:
                    stats = status_response['ingestionJob']['statistics']
                    print(f"Documents scanned: {stats.get('numberOfDocumentsScanned', 0)}")
                break
            elif status == 'FAILED':
                reasons = status_response['ingestionJob'].get('failureReasons', [])
                print(f"Ingestion failed: {reasons}")
                break

            print(f"Ingestion status: {status}")
            time.sleep(30)

        return job_id

    def query(
        self,
        knowledge_base_id: str,
        query: str,
        model_arn: str = 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0',
        num_results: int = 5,
        search_type: str = 'HYBRID',
        metadata_filter: Optional[Dict] = None,
        session_id: Optional[str] = None
    ) -> Dict:
        """Query knowledge base with retrieve and generate"""

        retrieval_config = {
            'type': 'KNOWLEDGE_BASE',
            'knowledgeBaseConfiguration': {
                'knowledgeBaseId': knowledge_base_id,
                'modelArn': model_arn,
                'retrievalConfiguration': {
                    'vectorSearchConfiguration': {
                        'numberOfResults': num_results,
                        'overrideSearchType': search_type
                    }
                },
                'generationConfiguration': {
                    'inferenceConfig': {
                        'textInferenceConfig': {
                            'temperature': 0.7,
                            'maxTokens': 2048
                        }
                    }
                }
            }
        }

        # Add metadata filter if provided
        if metadata_filter:
            retrieval_config['knowledgeBaseConfiguration']['retrievalConfiguration']['vectorSearchConfiguration']['filter'] = metadata_filter

        # Build request
        request = {
            'input': {'text': query},
            'retrieveAndGenerateConfiguration': retrieval_config
        }

        # Add session if provided
        if session_id:
            request['sessionId'] = session_id

        response = self.bedrock_agent_runtime.retrieve_and_generate(**request)

        return {
            'answer': response['output']['text'],
            'citations': response.get('citations', []),
            'session_id': response['sessionId']
        }

    def multi_turn_conversation(
        self,
        knowledge_base_id: str,
        queries: List[str],
        model_arn: str = 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0'
    ) -> List[Dict]:
        """Execute multi-turn conversation with context"""

        session_id = None
        conversation = []

        for query in queries:
            result = self.query(
                knowledge_base_id=knowledge_base_id,
                query=query,
                model_arn=model_arn,
                session_id=session_id
            )

            session_id = result['session_id']

            conversation.append({
                'query': query,
                'answer': result['answer'],
                'citations': result['citations']
            })

        return conversation
python
import boto3
import json
from typing import List, Dict, Optional

class BedrockKnowledgeBaseRAG:
    """基于Amazon Bedrock知识库的生产级RAG应用"""

    def __init__(self, region_name: str = 'us-east-1'):
        self.bedrock_agent = boto3.client('bedrock-agent', region_name=region_name)
        self.bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name=region_name)

    def create_knowledge_base(
        self,
        name: str,
        description: str,
        role_arn: str,
        vector_store_config: Dict,
        embedding_model: str = 'amazon.titan-embed-text-v2:0'
    ) -> str:
        """创建带向量存储的知识库"""

        # 根据客户端配置的区域构建嵌入模型ARN
        region = self.bedrock_agent.meta.region_name
        response = self.bedrock_agent.create_knowledge_base(
            name=name,
            description=description,
            roleArn=role_arn,
            knowledgeBaseConfiguration={
                'type': 'VECTOR',
                'vectorKnowledgeBaseConfiguration': {
                    'embeddingModelArn': f'arn:aws:bedrock:{region}::foundation-model/{embedding_model}'
                }
            },
            storageConfiguration=vector_store_config
        )

        return response['knowledgeBase']['knowledgeBaseId']

    def add_s3_data_source(
        self,
        knowledge_base_id: str,
        name: str,
        bucket_arn: str,
        inclusion_prefixes: List[str],
        chunking_strategy: str = 'FIXED_SIZE',
        chunking_config: Optional[Dict] = None
    ) -> str:
        """添加带分块配置的S3数据源"""

        if chunking_config is None:
            chunking_config = {
                'maxTokens': 512,
                'overlapPercentage': 20
            }

        vector_ingestion_config = {
            'chunkingConfiguration': {
                'chunkingStrategy': chunking_strategy
            }
        }

        if chunking_strategy == 'FIXED_SIZE':
            vector_ingestion_config['chunkingConfiguration']['fixedSizeChunkingConfiguration'] = chunking_config
        elif chunking_strategy == 'SEMANTIC':
            vector_ingestion_config['chunkingConfiguration']['semanticChunkingConfiguration'] = chunking_config
        elif chunking_strategy == 'HIERARCHICAL':
            vector_ingestion_config['chunkingConfiguration']['hierarchicalChunkingConfiguration'] = chunking_config

        response = self.bedrock_agent.create_data_source(
            knowledgeBaseId=knowledge_base_id,
            name=name,
            description=f'S3 data source: {name}',
            dataSourceConfiguration={
                'type': 'S3',
                's3Configuration': {
                    'bucketArn': bucket_arn,
                    'inclusionPrefixes': inclusion_prefixes
                }
            },
            vectorIngestionConfiguration=vector_ingestion_config
        )

        return response['dataSource']['dataSourceId']

    def ingest_data(self, knowledge_base_id: str, data_source_id: str) -> str:
        """启动导入任务并等待完成"""

        import time

        # 启动导入
        response = self.bedrock_agent.start_ingestion_job(
            knowledgeBaseId=knowledge_base_id,
            dataSourceId=data_source_id,
            description='Automated ingestion'
        )

        job_id = response['ingestionJob']['ingestionJobId']

        # 等待完成
        while True:
            status_response = self.bedrock_agent.get_ingestion_job(
                knowledgeBaseId=knowledge_base_id,
                dataSourceId=data_source_id,
                ingestionJobId=job_id
            )

            status = status_response['ingestionJob']['status']

            if status == 'COMPLETE':
                print("导入任务成功完成")
                if 'statistics' in status_response['ingestionJob']:
                    stats = status_response['ingestionJob']['statistics']
                    print(f"已扫描文档数: {stats.get('numberOfDocumentsScanned', 0)}")
                break
            elif status == 'FAILED':
                reasons = status_response['ingestionJob'].get('failureReasons', [])
                print(f"导入任务失败: {reasons}")
                break

            print(f"导入任务状态: {status},等待中...")
            time.sleep(30)

        return job_id

    def query(
        self,
        knowledge_base_id: str,
        query: str,
        model_arn: str = 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0',
        num_results: int = 5,
        search_type: str = 'HYBRID',
        metadata_filter: Optional[Dict] = None,
        session_id: Optional[str] = None
    ) -> Dict:
        """通过检索与生成API查询知识库"""

        retrieval_config = {
            'type': 'KNOWLEDGE_BASE',
            'knowledgeBaseConfiguration': {
                'knowledgeBaseId': knowledge_base_id,
                'modelArn': model_arn,
                'retrievalConfiguration': {
                    'vectorSearchConfiguration': {
                        'numberOfResults': num_results,
                        'overrideSearchType': search_type
                    }
                },
                'generationConfiguration': {
                    'inferenceConfig': {
                        'textInferenceConfig': {
                            'temperature': 0.7,
                            'maxTokens': 2048
                        }
                    }
                }
            }
        }

        # 添加元数据过滤(如果提供)
        if metadata_filter:
            retrieval_config['knowledgeBaseConfiguration']['retrievalConfiguration']['vectorSearchConfiguration']['filter'] = metadata_filter

        # 构建请求
        request = {
            'input': {'text': query},
            'retrieveAndGenerateConfiguration': retrieval_config
        }

        # 添加会话(如果提供)
        if session_id:
            request['sessionId'] = session_id

        response = self.bedrock_agent_runtime.retrieve_and_generate(**request)

        return {
            'answer': response['output']['text'],
            'citations': response.get('citations', []),
            'session_id': response['sessionId']
        }

    def multi_turn_conversation(
        self,
        knowledge_base_id: str,
        queries: List[str],
        model_arn: str = 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0'
    ) -> List[Dict]:
        """执行带上下文的多轮对话"""

        session_id = None
        conversation = []

        for query in queries:
            result = self.query(
                knowledge_base_id=knowledge_base_id,
                query=query,
                model_arn=model_arn,
                session_id=session_id
            )

            session_id = result['session_id']

            conversation.append({
                'query': query,
                'answer': result['answer'],
                'citations': result['citations']
            })

        return conversation

Example Usage

示例用法

python
rag = BedrockKnowledgeBaseRAG(region_name='us-east-1')
# Create knowledge base
kb_id = rag.create_knowledge_base(
    name='production-docs-kb',
    description='Production documentation knowledge base',
    role_arn='arn:aws:iam::123456789012:role/BedrockKBRole',
    vector_store_config={
        'type': 'OPENSEARCH_SERVERLESS',
        'opensearchServerlessConfiguration': {
            'collectionArn': 'arn:aws:aoss:us-east-1:123456789012:collection/kb-collection',
            'vectorIndexName': 'bedrock-kb-index',
            'fieldMapping': {
                'vectorField': 'bedrock-knowledge-base-default-vector',
                'textField': 'AMAZON_BEDROCK_TEXT_CHUNK',
                'metadataField': 'AMAZON_BEDROCK_METADATA'
            }
        }
    }
)

# Add data source
ds_id = rag.add_s3_data_source(
    knowledge_base_id=kb_id,
    name='technical-docs',
    bucket_arn='arn:aws:s3:::my-docs-bucket',
    inclusion_prefixes=['docs/'],
    chunking_strategy='HIERARCHICAL',
    chunking_config={
        'levelConfigurations': [
            {'maxTokens': 1500},
            {'maxTokens': 300}
        ],
        'overlapTokens': 60
    }
)

# Ingest data
rag.ingest_data(kb_id, ds_id)

# Single query
result = rag.query(
    knowledge_base_id=kb_id,
    query='What are the best practices for RAG applications?',
    metadata_filter={
        'equals': {
            'key': 'document_type',
            'value': 'best_practices'
        }
    }
)

print(f"Answer: {result['answer']}")
print(f"\nSources:")
for citation in result['citations']:
    for ref in citation['retrievedReferences']:
        # For S3 sources the document URI lives under location['s3Location']['uri']
        print(f"  - {ref['location'].get('s3Location', {}).get('uri', ref['location'])}")

# Multi-turn conversation
conversation = rag.multi_turn_conversation(
    knowledge_base_id=kb_id,
    queries=[
        'What is hierarchical chunking?',
        'When should I use it?',
        'What are the configuration parameters?'
    ]
)

for turn in conversation:
    print(f"\nQ: {turn['query']}")
    print(f"A: {turn['answer']}")

---
python
rag = BedrockKnowledgeBaseRAG(region_name='us-east-1')
# 创建知识库
kb_id = rag.create_knowledge_base(
    name='production-docs-kb',
    description='生产文档知识库',
    role_arn='arn:aws:iam::123456789012:role/BedrockKBRole',
    vector_store_config={
        'type': 'OPENSEARCH_SERVERLESS',
        'opensearchServerlessConfiguration': {
            'collectionArn': 'arn:aws:aoss:us-east-1:123456789012:collection/kb-collection',
            'vectorIndexName': 'bedrock-kb-index',
            'fieldMapping': {
                'vectorField': 'bedrock-knowledge-base-default-vector',
                'textField': 'AMAZON_BEDROCK_TEXT_CHUNK',
                'metadataField': 'AMAZON_BEDROCK_METADATA'
            }
        }
    }
)

# 添加数据源
ds_id = rag.add_s3_data_source(
    knowledge_base_id=kb_id,
    name='technical-docs',
    bucket_arn='arn:aws:s3:::my-docs-bucket',
    inclusion_prefixes=['docs/'],
    chunking_strategy='HIERARCHICAL',
    chunking_config={
        'levelConfigurations': [
            {'maxTokens': 1500},
            {'maxTokens': 300}
        ],
        'overlapTokens': 60
    }
)

# 导入数据
rag.ingest_data(kb_id, ds_id)

# 单查询
result = rag.query(
    knowledge_base_id=kb_id,
    query='RAG应用的最佳实践有哪些?',
    metadata_filter={
        'equals': {
            'key': 'document_type',
            'value': 'best_practices'
        }
    }
)

print(f"答案: {result['answer']}")
print(f"\n来源:")
for citation in result['citations']:
    for ref in citation['retrievedReferences']:
        # S3数据源的文档URI位于 location['s3Location']['uri']
        print(f"  - {ref['location'].get('s3Location', {}).get('uri', ref['location'])}")

# 多轮对话
conversation = rag.multi_turn_conversation(
    knowledge_base_id=kb_id,
    queries=[
        '什么是分层分块?',
        '什么时候应该使用它?',
        '配置参数有哪些?'
    ]
)

for turn in conversation:
    print(f"\n问: {turn['query']}")
    print(f"答: {turn['answer']}")

---

Related Skills

相关能力

Amazon Bedrock Core Skills

Amazon Bedrock核心能力

  • bedrock-guardrails: Content safety, PII redaction, hallucination detection
  • bedrock-agents: Agentic workflows with tool use and knowledge bases
  • bedrock-flows: Visual workflow builder for generative AI
  • bedrock-model-customization: Fine-tuning, reinforcement fine-tuning, distillation
  • bedrock-prompt-management: Prompt versioning and deployment
  • bedrock-guardrails:内容安全、PII脱敏、幻觉检测
  • bedrock-agents:集成工具与知识库的Agent工作流
  • bedrock-flows:生成式AI可视化工作流构建器
  • bedrock-model-customization:微调、强化微调、模型蒸馏
  • bedrock-prompt-management:提示词版本管理与部署

AWS Infrastructure Skills

AWS基础设施能力

  • opensearch-serverless: Vector database configuration and management
  • neptune-analytics: GraphRAG configuration and queries
  • s3-management: S3 bucket configuration for data sources and vectors
  • iam-bedrock: IAM roles and policies for Knowledge Bases
  • opensearch-serverless:向量数据库配置与管理
  • neptune-analytics:GraphRAG配置与查询
  • s3-management:用于数据源与向量的S3桶配置
  • iam-bedrock:知识库相关的IAM角色与策略

Observability Skills

可观测性能力

  • cloudwatch-bedrock-monitoring: Monitor Knowledge Bases metrics and logs
  • bedrock-cost-optimization: Track and optimize Knowledge Bases costs

  • cloudwatch-bedrock-monitoring:监控知识库指标与日志
  • bedrock-cost-optimization:跟踪与优化知识库成本

Additional Resources

额外资源

Official Documentation

官方文档

Best Practices

最佳实践

Research Document

研究文档

  • /mnt/c/data/github/skrillz/AMAZON-BEDROCK-COMPREHENSIVE-RESEARCH-2025.md
    - Section 2 (Complete Knowledge Bases research)
  • /mnt/c/data/github/skrillz/AMAZON-BEDROCK-COMPREHENSIVE-RESEARCH-2025.md
    - 第2部分(完整知识库研究)