Amazon Bedrock Knowledge Bases
Amazon Bedrock Knowledge Bases is a fully managed RAG (Retrieval-Augmented Generation) solution that handles data ingestion, embedding generation, vector storage, retrieval with reranking, source attribution, and session context management.
Overview
What It Does
Amazon Bedrock Knowledge Bases provides:
- Data Ingestion: Automatically process documents from S3, web, Confluence, SharePoint, Salesforce
- Embedding Generation: Convert text to vectors using foundation models
- Vector Storage: Store embeddings in multiple vector database options
- Retrieval: Semantic and hybrid search with metadata filtering
- Generation: RAG workflows with source attribution
- Session Management: Multi-turn conversations with context
- Chunking Strategies: Fixed, semantic, hierarchical, and custom chunking
When to Use This Skill
Use this skill when you need to:
- Build RAG applications for document Q&A
- Implement semantic search over enterprise knowledge
- Create chatbots with knowledge bases
- Integrate retrieval with Bedrock Agents
- Configure optimal chunking strategies
- Query documents with source attribution
- Manage multi-turn conversations with context
- Optimize RAG performance and cost
Key Capabilities
- Multiple Vector Store Options: OpenSearch, S3 Vectors, Neptune, Pinecone, MongoDB, Redis
- Flexible Data Sources: S3, web crawlers, Confluence, SharePoint, Salesforce
- Advanced Chunking: Fixed-size, semantic, hierarchical, custom Lambda
- Hybrid Search: Combine semantic (vector) and keyword search
- Session Management: Built-in conversation context tracking
- GraphRAG: Relationship-aware retrieval with Neptune Analytics
- Cost Optimization: S3 Vectors for up to 90% storage savings
Quick Start
Basic RAG Workflow
```python
import boto3
import json

# Initialize clients
bedrock_agent = boto3.client('bedrock-agent', region_name='us-east-1')
bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name='us-east-1')
```
1. Create Knowledge Base
```python
kb_response = bedrock_agent.create_knowledge_base(
    name='enterprise-docs-kb',
    description='Company documentation knowledge base',
    roleArn='arn:aws:iam::123456789012:role/BedrockKBRole',
    knowledgeBaseConfiguration={
        'type': 'VECTOR',
        'vectorKnowledgeBaseConfiguration': {
            'embeddingModelArn': 'arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v2:0'
        }
    },
    storageConfiguration={
        'type': 'OPENSEARCH_SERVERLESS',
        'opensearchServerlessConfiguration': {
            'collectionArn': 'arn:aws:aoss:us-east-1:123456789012:collection/kb-collection',
            'vectorIndexName': 'bedrock-knowledge-base-index',
            'fieldMapping': {
                'vectorField': 'bedrock-knowledge-base-default-vector',
                'textField': 'AMAZON_BEDROCK_TEXT_CHUNK',
                'metadataField': 'AMAZON_BEDROCK_METADATA'
            }
        }
    }
)
knowledge_base_id = kb_response['knowledgeBase']['knowledgeBaseId']
print(f"Knowledge Base ID: {knowledge_base_id}")
```
2. Add S3 Data Source
```python
ds_response = bedrock_agent.create_data_source(
    knowledgeBaseId=knowledge_base_id,
    name='s3-documents',
    description='Company documents from S3',
    dataSourceConfiguration={
        'type': 'S3',
        's3Configuration': {
            'bucketArn': 'arn:aws:s3:::my-docs-bucket',
            'inclusionPrefixes': ['documents/']
        }
    },
    vectorIngestionConfiguration={
        'chunkingConfiguration': {
            'chunkingStrategy': 'FIXED_SIZE',
            'fixedSizeChunkingConfiguration': {
                'maxTokens': 512,
                'overlapPercentage': 20
            }
        }
    }
)
data_source_id = ds_response['dataSource']['dataSourceId']
```
3. Start Ingestion
```python
ingestion_response = bedrock_agent.start_ingestion_job(
    knowledgeBaseId=knowledge_base_id,
    dataSourceId=data_source_id,
    description='Initial document ingestion'
)
print(f"Ingestion Job ID: {ingestion_response['ingestionJob']['ingestionJobId']}")
```
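Ingestion runs asynchronously, so it is worth waiting for the job to finish before querying. A minimal polling sketch (pass in a boto3 `bedrock-agent` client; adjust the poll interval and handle `FAILED` as appropriate for your workflow):

```python
import time

# Statuses after which the job can no longer make progress
TERMINAL_STATES = {'COMPLETE', 'FAILED', 'STOPPED'}

def is_terminal(status):
    """True once an ingestion job has finished, successfully or not."""
    return status in TERMINAL_STATES

def wait_for_ingestion(client, kb_id, ds_id, job_id, poll_seconds=30):
    """Poll get_ingestion_job until the job reaches a terminal state."""
    while True:
        job = client.get_ingestion_job(
            knowledgeBaseId=kb_id,
            dataSourceId=ds_id,
            ingestionJobId=job_id
        )['ingestionJob']
        if is_terminal(job['status']):
            return job
        time.sleep(poll_seconds)
```

For example: `wait_for_ingestion(bedrock_agent, knowledge_base_id, data_source_id, ingestion_response['ingestionJob']['ingestionJobId'])`.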
4. Query with Retrieve and Generate
```python
response = bedrock_agent_runtime.retrieve_and_generate(
    input={
        'text': 'What is our vacation policy?'
    },
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': knowledge_base_id,
            'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0',
            'retrievalConfiguration': {
                'vectorSearchConfiguration': {
                    'numberOfResults': 5,
                    'overrideSearchType': 'HYBRID'
                }
            }
        }
    }
)
print(f"Answer: {response['output']['text']}")
print("\nSources:")
for citation in response['citations']:
    for reference in citation['retrievedReferences']:
        print(f"  - {reference['location']['s3Location']['uri']}")
```
---
Vector Store Options
1. Amazon OpenSearch Serverless
Best for: Production RAG applications with auto-scaling requirements
Benefits:
- Fully managed, serverless operation
- Auto-scaling compute and storage
- High availability with multi-AZ deployment
- Fast query performance
Configuration:
```python
storageConfiguration={
    'type': 'OPENSEARCH_SERVERLESS',
    'opensearchServerlessConfiguration': {
        'collectionArn': 'arn:aws:aoss:us-east-1:123456789012:collection/kb-collection',
        'vectorIndexName': 'bedrock-knowledge-base-index',
        'fieldMapping': {
            'vectorField': 'bedrock-knowledge-base-default-vector',
            'textField': 'AMAZON_BEDROCK_TEXT_CHUNK',
            'metadataField': 'AMAZON_BEDROCK_METADATA'
        }
    }
}
```
2. Amazon S3 Vectors (Preview)
Best for: Cost-optimized, large-scale RAG applications
Benefits:
- Up to 90% cost reduction for vector storage
- Built-in vector support in S3
- Subsecond query performance
- Massive scale and durability
Ideal Use Cases:
- Large document collections (millions of chunks)
- Cost-sensitive applications
- Archival knowledge bases
- Low-to-medium QPS workloads
Configuration:
```python
storageConfiguration={
    'type': 'S3_VECTORS',
    's3VectorsConfiguration': {
        'bucketArn': 'arn:aws:s3:::my-vector-bucket',
        'prefix': 'vectors/'
    }
}
```
Limitations:
- Still in preview (no CloudFormation/CDK support yet)
- Not suitable for high-QPS, millisecond-latency requirements
- Best for cost optimization over ultra-low latency
3. Amazon Neptune Analytics (GraphRAG)
Best for: Interconnected knowledge domains requiring relationship-aware retrieval
Benefits:
- Automatic graph creation linking related content
- Improved retrieval accuracy through relationships
- Comprehensive responses leveraging the knowledge graph
- Explainable results with relationship context
Use Cases:
- Legal document analysis with case precedents
- Scientific research with paper citations
- Product catalogs with dependencies
- Organizational knowledge with team relationships
Configuration:
```python
storageConfiguration={
    'type': 'NEPTUNE_ANALYTICS',
    'neptuneAnalyticsConfiguration': {
        'graphArn': 'arn:aws:neptune-graph:us-east-1:123456789012:graph/g-12345678',
        'vectorSearchConfiguration': {
            'vectorField': 'embedding'
        }
    }
}
```
4. Amazon OpenSearch Service Managed Cluster
Best for: Existing OpenSearch infrastructure, advanced customization
Configuration:
```python
storageConfiguration={
    'type': 'OPENSEARCH_SERVICE',
    'opensearchServiceConfiguration': {
        'clusterArn': 'arn:aws:es:us-east-1:123456789012:domain/my-domain',
        'vectorIndexName': 'bedrock-kb-index',
        'fieldMapping': {
            'vectorField': 'embedding',
            'textField': 'text',
            'metadataField': 'metadata'
        }
    }
}
```
5. Third-Party Vector Databases
Pinecone:
```python
storageConfiguration={
    'type': 'PINECONE',
    'pineconeConfiguration': {
        'connectionString': 'https://my-index-abc123.svc.us-west1-gcp.pinecone.io',
        'credentialsSecretArn': 'arn:aws:secretsmanager:us-east-1:123456789012:secret:pinecone-api-key',
        'namespace': 'bedrock-kb',
        'fieldMapping': {
            'textField': 'text',
            'metadataField': 'metadata'
        }
    }
}
```
MongoDB Atlas:
```python
storageConfiguration={
    'type': 'MONGODB_ATLAS',
    'mongoDbAtlasConfiguration': {
        'endpoint': 'https://cluster0.mongodb.net',
        'credentialsSecretArn': 'arn:aws:secretsmanager:us-east-1:123456789012:secret:mongodb-creds',
        'databaseName': 'bedrock_kb',
        'collectionName': 'vectors',
        'vectorIndexName': 'vector_index',
        'fieldMapping': {
            'vectorField': 'embedding',
            'textField': 'text',
            'metadataField': 'metadata'
        }
    }
}
```
Redis Enterprise Cloud:
```python
storageConfiguration={
    'type': 'REDIS_ENTERPRISE_CLOUD',
    'redisEnterpriseCloudConfiguration': {
        'endpoint': 'redis-12345.c1.us-east-1-2.ec2.cloud.redislabs.com:12345',
        'credentialsSecretArn': 'arn:aws:secretsmanager:us-east-1:123456789012:secret:redis-creds',
        'vectorIndexName': 'bedrock-kb-index',
        'fieldMapping': {
            'vectorField': 'embedding',
            'textField': 'text',
            'metadataField': 'metadata'
        }
    }
}
```
Data Source Configuration
1. Amazon S3
Supported File Types: PDF, TXT, MD, HTML, DOC, DOCX, CSV, XLS, XLSX
```python
bedrock_agent.create_data_source(
    knowledgeBaseId=knowledge_base_id,
    name='s3-technical-docs',
    description='Technical documentation from S3',
    dataSourceConfiguration={
        'type': 'S3',
        's3Configuration': {
            'bucketArn': 'arn:aws:s3:::my-docs-bucket',
            'inclusionPrefixes': ['docs/technical/', 'docs/manuals/'],
            'exclusionPrefixes': ['docs/archive/']
        }
    }
)
```
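S3 documents can also carry per-document metadata for query-time filtering. A sketch, assuming the sidecar convention of uploading a `<object-key>.metadata.json` file next to each document (verify the exact format against the current Bedrock documentation; the attribute names below are illustrative):

```python
import json

# Hypothetical sidecar for s3://my-docs-bucket/docs/technical/guide.pdf,
# uploaded as docs/technical/guide.pdf.metadata.json
sidecar = {
    'metadataAttributes': {
        'document_type': 'technical_guide',
        'publish_year': 2025,
        'department': 'engineering'
    }
}
print(json.dumps(sidecar, indent=2))
```

These attributes are what the `filter` clauses in the Retrieve API examples later in this document match against.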
2. Web Crawler
Automatic website scraping and indexing:
```python
bedrock_agent.create_data_source(
    knowledgeBaseId=knowledge_base_id,
    name='company-website',
    description='Public company website content',
    dataSourceConfiguration={
        'type': 'WEB',
        'webConfiguration': {
            'sourceConfiguration': {
                'urlConfiguration': {
                    'seedUrls': [
                        {'url': 'https://www.example.com/docs'},
                        {'url': 'https://www.example.com/blog'}
                    ]
                }
            },
            'crawlerConfiguration': {
                'crawlerLimits': {
                    'rateLimit': 300  # Pages per minute
                }
            }
        }
    }
)
```
3. Confluence
```python
bedrock_agent.create_data_source(
    knowledgeBaseId=knowledge_base_id,
    name='confluence-wiki',
    description='Company Confluence knowledge base',
    dataSourceConfiguration={
        'type': 'CONFLUENCE',
        'confluenceConfiguration': {
            'sourceConfiguration': {
                'hostUrl': 'https://company.atlassian.net/wiki',
                'hostType': 'SAAS',
                'authType': 'BASIC',
                'credentialsSecretArn': 'arn:aws:secretsmanager:us-east-1:123456789012:secret:confluence-creds'
            },
            'crawlerConfiguration': {
                'filterConfiguration': {
                    'type': 'PATTERN',
                    'patternObjectFilter': {
                        'filters': [
                            {
                                'objectType': 'Space',
                                'inclusionFilters': ['Engineering', 'Product'],
                                'exclusionFilters': ['Archive']
                            }
                        ]
                    }
                }
            }
        }
    }
)
```
4. SharePoint
```python
bedrock_agent.create_data_source(
    knowledgeBaseId=knowledge_base_id,
    name='sharepoint-docs',
    description='SharePoint document library',
    dataSourceConfiguration={
        'type': 'SHAREPOINT',
        'sharePointConfiguration': {
            'sourceConfiguration': {
                'siteUrls': [
                    'https://company.sharepoint.com/sites/Engineering',
                    'https://company.sharepoint.com/sites/Product'
                ],
                'tenantId': 'tenant-id',
                'domain': 'company',
                'authType': 'OAUTH2_CLIENT_CREDENTIALS',
                'credentialsSecretArn': 'arn:aws:secretsmanager:us-east-1:123456789012:secret:sharepoint-creds'
            }
        }
    }
)
```
5. Salesforce
```python
bedrock_agent.create_data_source(
    knowledgeBaseId=knowledge_base_id,
    name='salesforce-knowledge',
    description='Salesforce knowledge articles',
    dataSourceConfiguration={
        'type': 'SALESFORCE',
        'salesforceConfiguration': {
            'sourceConfiguration': {
                'hostUrl': 'https://company.my.salesforce.com',
                'authType': 'OAUTH2_CLIENT_CREDENTIALS',
                'credentialsSecretArn': 'arn:aws:secretsmanager:us-east-1:123456789012:secret:salesforce-creds'
            },
            'crawlerConfiguration': {
                'filterConfiguration': {
                    'type': 'PATTERN',
                    'patternObjectFilter': {
                        'filters': [
                            {
                                'objectType': 'Knowledge',
                                'inclusionFilters': ['Product_Documentation', 'Support_Articles']
                            }
                        ]
                    }
                }
            }
        }
    }
)
```
Chunking Strategies
1. Fixed-Size Chunking
Best for: Simple documents with uniform structure
How it works: Splits text into chunks of a fixed token size with overlap
Parameters:
- maxTokens: 200-8192 tokens (typically 512-1024)
- overlapPercentage: 10-50% (typically 20%)
Configuration:
```python
vectorIngestionConfiguration={
    'chunkingConfiguration': {
        'chunkingStrategy': 'FIXED_SIZE',
        'fixedSizeChunkingConfiguration': {
            'maxTokens': 512,
            'overlapPercentage': 20
        }
    }
}
```
Use Cases:
- Blog posts and articles
- Technical documentation with consistent formatting
- FAQs and Q&A content
- Simple text files
Pros:
- Fast and predictable
- No additional costs
- Easy to tune
Cons:
- May split semantic units awkwardly
- Doesn't respect document structure
- Can break context mid-sentence
2. Semantic Chunking
Best for: Documents without clear boundaries (legal, technical, academic)
How it works: Uses sentence similarity to group related content
Parameters:
- maxTokens: 20-8192 tokens (typically 300-500)
- bufferSize: Number of neighboring sentences (default: 1)
- breakpointPercentileThreshold: Similarity threshold (recommended: 95%)
Configuration:
```python
vectorIngestionConfiguration={
    'chunkingConfiguration': {
        'chunkingStrategy': 'SEMANTIC',
        'semanticChunkingConfiguration': {
            'maxTokens': 300,
            'bufferSize': 1,
            'breakpointPercentileThreshold': 95
        }
    }
}
```
Use Cases:
- Legal documents and contracts
- Academic papers
- Technical specifications
- Medical records
- Research reports
Pros:
- Preserves semantic meaning
- Better context preservation
- Improved retrieval accuracy
Cons:
- Additional cost (foundation model usage)
- Slower ingestion
- Less predictable chunk sizes
Cost Consideration: Semantic chunking uses foundation models for similarity analysis, incurring additional costs beyond storage and retrieval.
3. Hierarchical Chunking
Best for: Complex documents with nested structure
How it works: Creates parent and child chunks; retrieval matches child chunks but returns the parent for context
Parameters:
- levelConfigurations: Array of chunk sizes (parent → child)
- overlapTokens: Overlap between chunks
Configuration:
```python
vectorIngestionConfiguration={
    'chunkingConfiguration': {
        'chunkingStrategy': 'HIERARCHICAL',
        'hierarchicalChunkingConfiguration': {
            'levelConfigurations': [
                {
                    'maxTokens': 1500  # Parent chunk (comprehensive context)
                },
                {
                    'maxTokens': 300  # Child chunk (focused retrieval)
                }
            ],
            'overlapTokens': 60
        }
    }
}
```
Use Cases:
- Technical manuals with sections and subsections
- Academic papers with abstract, sections, and subsections
- Legal documents with articles and clauses
- Product documentation with categories and details
How Retrieval Works:
- The query matches against child chunks (fast, focused)
- Parent chunks are returned (comprehensive context)
- Best of both: precision retrieval + complete context
Pros:
- Optimal balance of precision and context
- Excellent for nested documents
- Better accuracy for complex queries
Cons:
- More complex configuration
- Larger storage footprint
- Requires understanding of document structure
4. Custom Chunking (Lambda)
Best for: Specialized domain logic, custom parsing requirements
How it works: Invokes a Lambda function that implements your custom chunking logic
Configuration:
```python
vectorIngestionConfiguration={
    'chunkingConfiguration': {
        'chunkingStrategy': 'NONE'  # Custom via Lambda
    },
    'customTransformationConfiguration': {
        'intermediateStorage': {
            's3Location': {
                'uri': 's3://my-kb-bucket/intermediate/'
            }
        },
        'transformations': [
            {
                'stepToApply': 'POST_CHUNKING',
                'transformationFunction': {
                    'transformationLambdaConfiguration': {
                        'lambdaArn': 'arn:aws:lambda:us-east-1:123456789012:function:custom-chunker'
                    }
                }
            }
        ]
    }
}
```
Example Lambda Handler:
```python
# Lambda function for custom chunking
import json

def lambda_handler(event, context):
    """
    Custom chunking logic for specialized documents
    Input: event contains document content and metadata
    Output: array of chunks with text and metadata
    """
    # Extract document content
    document = event['document']
    content = document['content']
    metadata = document.get('metadata', {})

    # Custom chunking logic (example: split by custom delimiter)
    chunks = []
    sections = content.split('---SECTION---')
    for idx, section in enumerate(sections):
        if section.strip():
            chunks.append({
                'text': section.strip(),
                'metadata': {
                    **metadata,
                    'chunk_id': f'section_{idx}',
                    'chunk_type': 'custom_section'
                }
            })

    return {
        'chunks': chunks
    }
```
Use Cases:
- Medical records with structured sections (SOAP notes)
- Financial documents with tables and calculations
- Code documentation with code blocks and explanations
- Domain-specific formats (HL7, FHIR, etc.)
Pros:
- Complete control over chunking logic
- Can handle any document format
- Integrates domain expertise
Cons:
- Requires Lambda development and maintenance
- Additional operational complexity
- Harder to debug and iterate
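Because custom chunkers are harder to debug, it pays to unit-test the splitting logic locally before deploying. A minimal sketch that restates the delimiter-splitting logic of the handler above as a standalone function:

```python
def split_sections(content, metadata=None, delimiter='---SECTION---'):
    """Delimiter-based chunking, mirroring the Lambda handler above."""
    metadata = metadata or {}
    chunks = []
    for idx, section in enumerate(content.split(delimiter)):
        if section.strip():
            chunks.append({
                'text': section.strip(),
                'metadata': {**metadata,
                             'chunk_id': f'section_{idx}',
                             'chunk_type': 'custom_section'}
            })
    return chunks

# Synthetic document: three sections, the middle one blank (and so skipped)
doc = 'Intro---SECTION---   ---SECTION---Summary'
chunks = split_sections(doc, {'source': 'example.txt'})
print([c['text'] for c in chunks])  # ['Intro', 'Summary']
```

Note that chunk IDs keep their original section index (`section_0`, `section_2` here), which preserves traceability back to the source document even when blank sections are dropped.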
Chunking Strategy Selection Guide
| Document Type | Recommended Strategy | Rationale |
|---|---|---|
| Blog posts, articles | Fixed-size | Simple, uniform structure |
| Legal documents | Semantic | Preserve legal reasoning flow |
| Technical manuals | Hierarchical | Nested sections and subsections |
| Academic papers | Hierarchical | Abstract, sections, subsections |
| FAQs | Fixed-size | Independent Q&A pairs |
| Medical records | Custom Lambda | Structured sections (SOAP, HL7) |
| Code documentation | Custom Lambda | Code blocks + explanations |
| Product catalogs | Fixed-size | Uniform product descriptions |
| Research reports | Semantic | Preserve research narrative |
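The guide above can be encoded as a small helper that maps a document type to a starting `vectorIngestionConfiguration`. A sketch reusing the illustrative parameter values from the sections above; the document-type keys are hypothetical labels for your own pipeline, not part of any AWS API:

```python
def chunking_config(doc_type):
    """Return a starting vectorIngestionConfiguration for a document type,
    following the selection guide above. Unknown types fall back to FIXED_SIZE."""
    fixed = {
        'chunkingStrategy': 'FIXED_SIZE',
        'fixedSizeChunkingConfiguration': {'maxTokens': 512, 'overlapPercentage': 20},
    }
    semantic = {
        'chunkingStrategy': 'SEMANTIC',
        'semanticChunkingConfiguration': {
            'maxTokens': 300, 'bufferSize': 1, 'breakpointPercentileThreshold': 95,
        },
    }
    hierarchical = {
        'chunkingStrategy': 'HIERARCHICAL',
        'hierarchicalChunkingConfiguration': {
            'levelConfigurations': [{'maxTokens': 1500}, {'maxTokens': 300}],
            'overlapTokens': 60,
        },
    }
    guide = {
        'blog_post': fixed, 'faq': fixed, 'product_catalog': fixed,
        'legal': semantic, 'research_report': semantic,
        'technical_manual': hierarchical, 'academic_paper': hierarchical,
    }
    return {'chunkingConfiguration': guide.get(doc_type, fixed)}
```

The returned dict can be passed directly as `vectorIngestionConfiguration` in `create_data_source`, then tuned per workload.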
Retrieval Operations
1. Retrieve API (Retrieval Only)
Returns raw retrieved chunks without generation.
Use Cases:
- Custom generation logic
- Debugging retrieval quality
- Building custom RAG pipelines
- Integrating with non-Bedrock models
```python
bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name='us-east-1')

response = bedrock_agent_runtime.retrieve(
    knowledgeBaseId='KB123456',
    retrievalQuery={
        'text': 'What are the benefits of hierarchical chunking?'
    },
    retrievalConfiguration={
        'vectorSearchConfiguration': {
            'numberOfResults': 5,
            'overrideSearchType': 'HYBRID',  # SEMANTIC or HYBRID
            'filter': {
                'andAll': [
                    {
                        'equals': {
                            'key': 'document_type',
                            'value': 'technical_guide'
                        }
                    },
                    {
                        'greaterThan': {
                            'key': 'publish_year',
                            'value': 2024
                        }
                    }
                ]
            }
        }
    }
)
```
```python
# Process retrieved chunks
for result in response['retrievalResults']:
    print(f"Score: {result['score']}")
    print(f"Content: {result['content']['text']}")
    print(f"Location: {result['location']}")
    print(f"Metadata: {result.get('metadata', {})}")
    print("---")
```
2. Retrieve and Generate API (RAG)
Returns a generated response with source attribution.
Use Cases:
- Complete RAG workflows
- Question answering
- Document summarization
- Chatbots with knowledge bases
```python
response = bedrock_agent_runtime.retrieve_and_generate(
    input={
        'text': 'Explain semantic chunking benefits and when to use it'
    },
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': 'KB123456',
            'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0',
            'retrievalConfiguration': {
                'vectorSearchConfiguration': {
                    'numberOfResults': 5,
                    'overrideSearchType': 'HYBRID'
                }
            },
            'generationConfiguration': {
                'inferenceConfig': {
                    'textInferenceConfig': {
                        'temperature': 0.7,
                        'maxTokens': 2048,
                        'topP': 0.9
                    }
                },
                'promptTemplate': {
                    'textPromptTemplate': '''You are a helpful assistant. Answer the user's question based on the provided context.
Context: $search_results$
Question: $query$
Answer:'''
                }
            }
        }
    }
)

print(f"Generated Response: {response['output']['text']}")
print("\nSources:")
for citation in response['citations']:
    for reference in citation['retrievedReferences']:
        print(f"  - {reference['location']}")
        print(f"    Relevance Score: {reference.get('score', 'N/A')}")
```
3. Multi-Turn Conversations with Session Management
Bedrock automatically manages conversation context across turns.
```python
# First turn - creates a session automatically
response1 = bedrock_agent_runtime.retrieve_and_generate(
    input={
        'text': 'What is Amazon Bedrock Knowledge Bases?'
    },
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': 'KB123456',
            'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0'
        }
    }
)
session_id = response1['sessionId']
print(f"Session ID: {session_id}")
print(f"Response: {response1['output']['text']}\n")

# Follow-up turn - reuse the session for context
response2 = bedrock_agent_runtime.retrieve_and_generate(
    input={
        'text': 'What chunking strategies does it support?'
    },
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': 'KB123456',
            'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0'
        }
    },
    sessionId=session_id  # Continue conversation with context
)
print(f"Follow-up Response: {response2['output']['text']}")

# Third turn
response3 = bedrock_agent_runtime.retrieve_and_generate(
    input={
        'text': 'Which strategy would you recommend for legal documents?'
    },
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': 'KB123456',
            'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0'
        }
    },
    sessionId=session_id
)
print(f"Third Response: {response3['output']['text']}")
```

4. Advanced Metadata Filtering
Filter retrieval by metadata attributes for precision.
```python
response = bedrock_agent_runtime.retrieve(
    knowledgeBaseId='KB123456',
    retrievalQuery={
        'text': 'Security best practices for production deployments'
    },
    retrievalConfiguration={
        'vectorSearchConfiguration': {
            'numberOfResults': 10,
            'overrideSearchType': 'HYBRID',
            'filter': {
                'andAll': [
                    {
                        'equals': {
                            'key': 'document_type',
                            'value': 'security_guide'
                        }
                    },
                    {
                        'greaterThanOrEquals': {
                            'key': 'publish_year',
                            'value': 2024
                        }
                    },
                    {
                        'in': {
                            'key': 'category',
                            'value': ['production', 'security', 'compliance']
                        }
                    }
                ]
            }
        }
    }
)
```

Supported Filter Operators:
- `equals`: Exact match
- `notEquals`: Not equal
- `greaterThan`, `greaterThanOrEquals`: Numeric comparison
- `lessThan`, `lessThanOrEquals`: Numeric comparison
- `in`: Match any value in array
- `notIn`: Not match any value in array
- `startsWith`: String prefix match
- `andAll`: Combine filters with AND
- `orAll`: Combine filters with OR
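These operators compose: a minimal sketch of nesting `orAll` inside `andAll` into one filter dictionary for the retrieve API (the attribute keys `category`, `publish_year`, and `title` are illustrative, not required names):

```python
# Sketch: composing the filter operators above into one filter dict.
# The metadata keys used here are examples; use the attributes your
# own documents actually carry.

def build_filter():
    """Match 2024+ documents in the security OR compliance category
    whose title starts with 'AWS'."""
    return {
        'andAll': [
            {'orAll': [
                {'equals': {'key': 'category', 'value': 'security'}},
                {'equals': {'key': 'category', 'value': 'compliance'}},
            ]},
            {'greaterThanOrEquals': {'key': 'publish_year', 'value': 2024}},
            {'startsWith': {'key': 'title', 'value': 'AWS'}},
        ]
    }

metadata_filter = build_filter()
```

The resulting dictionary is passed as the `filter` value inside `vectorSearchConfiguration`, exactly as in the example above.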
Ingestion Management
1. Start Ingestion Job
```python
ingestion_response = bedrock_agent.start_ingestion_job(
    knowledgeBaseId=knowledge_base_id,
    dataSourceId=data_source_id,
    description='Monthly document sync',
    clientToken='unique-idempotency-token-123'
)

job_id = ingestion_response['ingestionJob']['ingestionJobId']
print(f"Ingestion Job ID: {job_id}")
```

2. Monitor Ingestion Job
```python
import time

# Get job status
job_status = bedrock_agent.get_ingestion_job(
    knowledgeBaseId=knowledge_base_id,
    dataSourceId=data_source_id,
    ingestionJobId=job_id
)

print(f"Status: {job_status['ingestionJob']['status']}")
print(f"Started: {job_status['ingestionJob']['startedAt']}")
print(f"Updated: {job_status['ingestionJob']['updatedAt']}")

if 'statistics' in job_status['ingestionJob']:
    stats = job_status['ingestionJob']['statistics']
    print(f"Documents Scanned: {stats['numberOfDocumentsScanned']}")
    print(f"Documents Indexed: {stats['numberOfDocumentsIndexed']}")
    print(f"Documents Failed: {stats['numberOfDocumentsFailed']}")

# Wait for completion
while True:
    status = bedrock_agent.get_ingestion_job(
        knowledgeBaseId=knowledge_base_id,
        dataSourceId=data_source_id,
        ingestionJobId=job_id
    )
    current_status = status['ingestionJob']['status']
    if current_status in ['COMPLETE', 'FAILED']:
        print(f"Ingestion job {current_status}")
        break
    print(f"Status: {current_status}, waiting...")
    time.sleep(30)
```

3. List Ingestion Jobs
```python
list_response = bedrock_agent.list_ingestion_jobs(
    knowledgeBaseId=knowledge_base_id,
    dataSourceId=data_source_id,
    maxResults=50
)

for job in list_response['ingestionJobSummaries']:
    print(f"Job ID: {job['ingestionJobId']}")
    print(f"Status: {job['status']}")
    print(f"Started: {job['startedAt']}")
    print(f"Updated: {job['updatedAt']}")
    print("---")
```

Integration with Bedrock Agents
1. Agent with Knowledge Base Action
```python
bedrock_agent = boto3.client('bedrock-agent', region_name='us-east-1')

# Create agent with knowledge base
agent_response = bedrock_agent.create_agent(
    agentName='customer-support-agent',
    description='Customer support agent with knowledge base access',
    instruction='''You are a customer support agent. When answering questions:
- Search the knowledge base for relevant information
- Provide accurate answers based on retrieved context
- Cite your sources
- Admit when you don't know something''',
    foundationModel='anthropic.claude-3-sonnet-20240229-v1:0',
    agentResourceRoleArn='arn:aws:iam::123456789012:role/BedrockAgentRole'
)
agent_id = agent_response['agent']['agentId']

# Associate knowledge base with agent
kb_association = bedrock_agent.associate_agent_knowledge_base(
    agentId=agent_id,
    agentVersion='DRAFT',
    knowledgeBaseId='KB123456',
    description='Company documentation knowledge base',
    knowledgeBaseState='ENABLED'
)

# Prepare the agent and create an alias
bedrock_agent.prepare_agent(agentId=agent_id)
alias_response = bedrock_agent.create_agent_alias(
    agentId=agent_id,
    agentAliasName='production',
    description='Production alias'
)
agent_alias_id = alias_response['agentAlias']['agentAliasId']

# Invoke agent (automatically queries the knowledge base)
bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name='us-east-1')
response = bedrock_agent_runtime.invoke_agent(
    agentId=agent_id,
    agentAliasId=agent_alias_id,
    sessionId='session-123',
    inputText='What is our return policy for defective products?'
)

for event in response['completion']:
    if 'chunk' in event:
        chunk = event['chunk']
        print(chunk['bytes'].decode())
```

2. Agent with Multiple Knowledge Bases
```python
# Associate multiple knowledge bases
bedrock_agent.associate_agent_knowledge_base(
    agentId=agent_id,
    agentVersion='DRAFT',
    knowledgeBaseId='KB-PRODUCT-DOCS',
    description='Product documentation'
)

bedrock_agent.associate_agent_knowledge_base(
    agentId=agent_id,
    agentVersion='DRAFT',
    knowledgeBaseId='KB-SUPPORT-ARTICLES',
    description='Support knowledge articles'
)

bedrock_agent.associate_agent_knowledge_base(
    agentId=agent_id,
    agentVersion='DRAFT',
    knowledgeBaseId='KB-COMPANY-POLICIES',
    description='Company policies and procedures'
)
```

The agent automatically searches all knowledge bases and combines results.

---

Best Practices
1. Chunking Strategy Selection
Decision Framework:
- Simple, uniform documents → Fixed-size chunking
  - Blog posts, articles, simple FAQs
  - Fast, predictable, cost-effective
- Documents without clear boundaries → Semantic chunking
  - Legal documents, contracts, academic papers
  - Preserves semantic meaning, better accuracy
  - Consider additional cost
- Nested, hierarchical documents → Hierarchical chunking
  - Technical manuals, product docs, research papers
  - Best balance of precision and context
  - Optimal for complex structures
- Specialized formats → Custom Lambda chunking
  - Medical records (HL7, FHIR), code docs, custom formats
  - Complete control, domain expertise
  - Higher operational complexity
Tuning Guidelines:
- Fixed-size: Start with 512 tokens, 20% overlap
- Semantic: Start with 300 tokens, bufferSize=1, threshold=95%
- Hierarchical: Parent 1500 tokens, child 300 tokens, overlap 60 tokens
- Custom: Test extensively with domain experts
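As a starting point, the tuning guidelines above map onto the ingestion API's chunking configurations roughly as sketched below. Field names follow the `create_data_source` examples elsewhere in this guide; verify them against the current boto3 reference before relying on them:

```python
# Sketch: starting-point chunking configurations for the three managed
# strategies, using the values from the tuning guidelines above.

fixed_size = {
    'chunkingStrategy': 'FIXED_SIZE',
    'fixedSizeChunkingConfiguration': {
        'maxTokens': 512,          # start with 512 tokens
        'overlapPercentage': 20,   # 20% overlap
    },
}

semantic = {
    'chunkingStrategy': 'SEMANTIC',
    'semanticChunkingConfiguration': {
        'maxTokens': 300,
        'bufferSize': 1,
        'breakpointPercentileThreshold': 95,
    },
}

hierarchical = {
    'chunkingStrategy': 'HIERARCHICAL',
    'hierarchicalChunkingConfiguration': {
        'levelConfigurations': [
            {'maxTokens': 1500},   # parent chunks
            {'maxTokens': 300},    # child chunks
        ],
        'overlapTokens': 60,
    },
}
```

Any one of these dictionaries can be passed as the `chunkingConfiguration` inside `vectorIngestionConfiguration` when creating a data source.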
2. Retrieval Optimization
Number of Results:
- Start with 5-10 results
- Increase if answers lack detail
- Decrease if too much noise
Search Type:
- SEMANTIC: Pure vector similarity (faster, good for conceptual queries)
- HYBRID: Vector + keyword (better recall, recommended for production)
Use Hybrid Search when:
- Queries contain specific terms or names
- Need to match exact keywords
- Domain has specialized vocabulary
Use Semantic Search when:
- Purely conceptual queries
- Prioritizing speed over perfect recall
- Well-embedded domain knowledge
Metadata Filters:
- Always use when applicable
- Dramatically improves precision
- Reduces retrieval latency
- Examples: document_type, publish_date, category, author
3. Cost Optimization
S3 Vectors:
- Use for large-scale knowledge bases (millions of chunks)
- Up to 90% cost savings vs. OpenSearch
- Ideal for cost-sensitive applications
- Trade-off: Slightly higher latency
Semantic Chunking:
- Incurs foundation model costs during ingestion
- Consider cost vs. accuracy benefit
- May not be worth it for simple documents
- Best for complex, high-value content
Ingestion Frequency:
- Schedule ingestion during off-peak hours
- Use incremental updates when possible
- Don't re-ingest unchanged documents
Model Selection:
- Use smaller embedding models when accuracy permits
- Titan Embed Text v2 is cost-effective
- Consider Cohere Embed for multilingual
Token Usage:
- Monitor generation token usage
- Set appropriate maxTokens limits
- Use prompt templates to control verbosity
4. Session Management
Always Reuse Sessions:
- Pass `sessionId` for follow-up turns
- Bedrock handles context automatically
- No manual conversation history needed
Session Lifecycle:
- Sessions expire after inactivity (default: 60 minutes)
- Create new session for unrelated conversations
- Use unique sessionId per user/conversation
Context Limits:
- Monitor conversation length
- Long sessions may hit context limits
- Consider summarization for very long conversations
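The expiry rule above can be enforced client-side with a small tracker that drops a stale sessionId before the next call. This is a minimal sketch; the 60-minute figure mirrors the default mentioned above, and `SessionTracker` is a hypothetical helper, not part of any AWS SDK:

```python
import time
from typing import Optional


class SessionTracker:
    """Client-side sketch: forget a sessionId that has been idle longer
    than the service's inactivity timeout (assumed 60 minutes here)."""

    def __init__(self, timeout_seconds: float = 60 * 60):
        self.timeout = timeout_seconds
        self.session_id: Optional[str] = None
        self.last_used = 0.0

    def current(self, now: Optional[float] = None) -> Optional[str]:
        """Return the sessionId to reuse, or None to start a new session."""
        now = time.time() if now is None else now
        if self.session_id is not None and now - self.last_used > self.timeout:
            self.session_id = None  # stale: let Bedrock create a fresh one
        return self.session_id

    def update(self, session_id: str, now: Optional[float] = None) -> None:
        """Record the sessionId returned by retrieve_and_generate."""
        self.session_id = session_id
        self.last_used = time.time() if now is None else now
```

Usage: pass `tracker.current()` as `sessionId` (omitting the argument when it is None), then call `tracker.update(response['sessionId'])` after each turn.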
5. GraphRAG with Neptune
When to Use:
- Interconnected knowledge domains
- Relationship-aware queries
- Need for explainability
- Complex knowledge graphs
Benefits:
- Automatic graph creation
- Improved accuracy through relationships
- Comprehensive answers
- Explainable results
Considerations:
- Higher setup complexity
- Neptune Analytics costs
- Best for domains with rich relationships
6. Data Source Management
S3 Best Practices:
- Organize with clear prefixes
- Use inclusion/exclusion filters
- Maintain consistent metadata
- Version documents when updating
Web Crawler:
- Set appropriate rate limits
- Use robots.txt for guidance
- Monitor for broken links
- Schedule regular re-crawls
Confluence/SharePoint:
- Filter by spaces/sites
- Exclude archived content
- Use fine-grained permissions
- Schedule incremental syncs
Metadata Enrichment:
- Add custom metadata to documents
- Include: document_type, publish_date, category, author, version
- Enables powerful filtering
- Improves retrieval precision
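For S3 sources, custom metadata is typically supplied as a sidecar JSON file uploaded next to each document and named `<document>.metadata.json`; the sketch below assumes that convention and invented bucket/key names, so verify the exact naming against the current Knowledge Bases documentation:

```python
import json

# Sketch: a metadata sidecar for s3://my-docs-bucket/docs/guide.pdf,
# uploaded as docs/guide.pdf.metadata.json alongside the document.
sidecar = {
    'metadataAttributes': {
        'document_type': 'technical_guide',
        'publish_date': '2025-01-15',
        'category': 'production',
        'author': 'platform-team',
        'version': '2.1',
    }
}

body = json.dumps(sidecar, indent=2)
# Then upload next to the document, e.g.:
# s3.put_object(Bucket='my-docs-bucket',
#               Key='docs/guide.pdf.metadata.json', Body=body)
```

Once ingested, these attributes become available to the `filter` operators shown in the retrieval examples above.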
7. Monitoring and Debugging
Enable CloudWatch Logs:
- Monitor retrieval quality
- Track: query latency, retrieval scores, generation quality
- Set alarms for: high latency, low scores, high error rates
**Test Retrieval Quality**:
```python
# Use the retrieve API to debug
response = bedrock_agent_runtime.retrieve(
    knowledgeBaseId='KB123456',
    retrievalQuery={'text': 'test query'}
)

# Analyze retrieval scores
for result in response['retrievalResults']:
    print(f"Score: {result['score']}")
    print(f"Content preview: {result['content']['text'][:200]}")
```
**Common Issues**:
1. **Low Retrieval Scores**:
- Check chunking strategy
- Verify embedding model
- Ensure documents are properly ingested
- Consider semantic or hierarchical chunking
2. **Irrelevant Results**:
- Add metadata filters
- Use hybrid search
- Refine chunking strategy
- Increase numberOfResults
3. **Missing Information**:
- Verify data source configuration
- Check ingestion job status
- Ensure documents are not excluded by filters
- Increase numberOfResults
4. **Slow Retrieval**:
- Use metadata filters to narrow scope
- Optimize vector database configuration
- Consider S3 Vectors for cost over latency
- Reduce numberOfResults

8. Security Best Practices
IAM Permissions:
- Use least privilege for Knowledge Base role
- Separate roles for data sources, ingestion, retrieval
- Enable VPC endpoints for private connectivity
Data Encryption:
- All data encrypted at rest (AWS KMS)
- Data encrypted in transit (TLS)
- Use customer-managed KMS keys for compliance
Access Control:
- Use IAM policies to control who can query
- Implement fine-grained access control
- Monitor access with CloudTrail
PII Handling:
- Use Bedrock Guardrails for PII redaction
- Implement data masking for sensitive fields
- Consider custom Lambda for advanced PII handling
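A least-privilege query policy following the guidance above might look like the sketch below. The `bedrock:Retrieve` and `bedrock:RetrieveAndGenerate` action names and the knowledge-base ARN format are taken from the Bedrock IAM reference as best understood here; confirm both for your partition and region before deploying:

```python
import json

# Sketch: a policy that lets callers query one knowledge base but not
# manage it. The account ID and KB ID are placeholders.
read_only_kb_policy = {
    'Version': '2012-10-17',
    'Statement': [
        {
            'Sid': 'QueryKnowledgeBaseOnly',
            'Effect': 'Allow',
            'Action': [
                'bedrock:Retrieve',
                'bedrock:RetrieveAndGenerate',
            ],
            'Resource': 'arn:aws:bedrock:us-east-1:123456789012:knowledge-base/KB123456',
        }
    ],
}

policy_json = json.dumps(read_only_kb_policy, indent=2)
```

Attach a policy like this to the querying principal, keeping ingestion and data-source management permissions on separate roles as recommended above.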
Complete Production Example
End-to-End RAG Application
```python
import time

import boto3
from typing import List, Dict, Optional


class BedrockKnowledgeBaseRAG:
    """Production RAG application with Amazon Bedrock Knowledge Bases"""

    def __init__(self, region_name: str = 'us-east-1'):
        self.bedrock_agent = boto3.client('bedrock-agent', region_name=region_name)
        self.bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name=region_name)

    def create_knowledge_base(
        self,
        name: str,
        description: str,
        role_arn: str,
        vector_store_config: Dict,
        embedding_model: str = 'amazon.titan-embed-text-v2:0'
    ) -> str:
        """Create knowledge base with vector store"""
        response = self.bedrock_agent.create_knowledge_base(
            name=name,
            description=description,
            roleArn=role_arn,
            knowledgeBaseConfiguration={
                'type': 'VECTOR',
                'vectorKnowledgeBaseConfiguration': {
                    'embeddingModelArn': f'arn:aws:bedrock:us-east-1::foundation-model/{embedding_model}'
                }
            },
            storageConfiguration=vector_store_config
        )
        return response['knowledgeBase']['knowledgeBaseId']

    def add_s3_data_source(
        self,
        knowledge_base_id: str,
        name: str,
        bucket_arn: str,
        inclusion_prefixes: List[str],
        chunking_strategy: str = 'FIXED_SIZE',
        chunking_config: Optional[Dict] = None
    ) -> str:
        """Add S3 data source with chunking configuration"""
        if chunking_config is None:
            chunking_config = {
                'maxTokens': 512,
                'overlapPercentage': 20
            }
        vector_ingestion_config = {
            'chunkingConfiguration': {
                'chunkingStrategy': chunking_strategy
            }
        }
        if chunking_strategy == 'FIXED_SIZE':
            vector_ingestion_config['chunkingConfiguration']['fixedSizeChunkingConfiguration'] = chunking_config
        elif chunking_strategy == 'SEMANTIC':
            vector_ingestion_config['chunkingConfiguration']['semanticChunkingConfiguration'] = chunking_config
        elif chunking_strategy == 'HIERARCHICAL':
            vector_ingestion_config['chunkingConfiguration']['hierarchicalChunkingConfiguration'] = chunking_config

        response = self.bedrock_agent.create_data_source(
            knowledgeBaseId=knowledge_base_id,
            name=name,
            description=f'S3 data source: {name}',
            dataSourceConfiguration={
                'type': 'S3',
                's3Configuration': {
                    'bucketArn': bucket_arn,
                    'inclusionPrefixes': inclusion_prefixes
                }
            },
            vectorIngestionConfiguration=vector_ingestion_config
        )
        return response['dataSource']['dataSourceId']

    def ingest_data(self, knowledge_base_id: str, data_source_id: str) -> str:
        """Start ingestion job and wait for completion"""
        # Start ingestion
        response = self.bedrock_agent.start_ingestion_job(
            knowledgeBaseId=knowledge_base_id,
            dataSourceId=data_source_id,
            description='Automated ingestion'
        )
        job_id = response['ingestionJob']['ingestionJobId']

        # Wait for completion
        while True:
            status_response = self.bedrock_agent.get_ingestion_job(
                knowledgeBaseId=knowledge_base_id,
                dataSourceId=data_source_id,
                ingestionJobId=job_id
            )
            status = status_response['ingestionJob']['status']
            if status == 'COMPLETE':
                print("Ingestion completed successfully")
                if 'statistics' in status_response['ingestionJob']:
                    stats = status_response['ingestionJob']['statistics']
                    print(f"Documents indexed: {stats.get('numberOfDocumentsIndexed', 0)}")
                break
            elif status == 'FAILED':
                print("Ingestion failed")
                break
            print(f"Ingestion status: {status}")
            time.sleep(30)
        return job_id

    def query(
        self,
        knowledge_base_id: str,
        query: str,
        model_arn: str = 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0',
        num_results: int = 5,
        search_type: str = 'HYBRID',
        metadata_filter: Optional[Dict] = None,
        session_id: Optional[str] = None
    ) -> Dict:
        """Query knowledge base with retrieve and generate"""
        retrieval_config = {
            'type': 'KNOWLEDGE_BASE',
            'knowledgeBaseConfiguration': {
                'knowledgeBaseId': knowledge_base_id,
                'modelArn': model_arn,
                'retrievalConfiguration': {
                    'vectorSearchConfiguration': {
                        'numberOfResults': num_results,
                        'overrideSearchType': search_type
                    }
                },
                'generationConfiguration': {
                    'inferenceConfig': {
                        'textInferenceConfig': {
                            'temperature': 0.7,
                            'maxTokens': 2048
                        }
                    }
                }
            }
        }
        # Add metadata filter if provided
        if metadata_filter:
            retrieval_config['knowledgeBaseConfiguration']['retrievalConfiguration']['vectorSearchConfiguration']['filter'] = metadata_filter

        # Build request
        request = {
            'input': {'text': query},
            'retrieveAndGenerateConfiguration': retrieval_config
        }
        # Add session if provided
        if session_id:
            request['sessionId'] = session_id

        response = self.bedrock_agent_runtime.retrieve_and_generate(**request)
        return {
            'answer': response['output']['text'],
            'citations': response.get('citations', []),
            'session_id': response['sessionId']
        }

    def multi_turn_conversation(
        self,
        knowledge_base_id: str,
        queries: List[str],
        model_arn: str = 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0'
    ) -> List[Dict]:
        """Execute multi-turn conversation with context"""
        session_id = None
        conversation = []
        for query in queries:
            result = self.query(
                knowledge_base_id=knowledge_base_id,
                query=query,
                model_arn=model_arn,
                session_id=session_id
            )
            session_id = result['session_id']
            conversation.append({
                'query': query,
                'answer': result['answer'],
                'citations': result['citations']
            })
        return conversation
```
import boto3
import json
from typing import List, Dict, Optional
class BedrockKnowledgeBaseRAG:
"""基于Amazon Bedrock知识库的生产级RAG应用"""
def __init__(self, region_name: str = 'us-east-1'):
self.bedrock_agent = boto3.client('bedrock-agent', region_name=region_name)
self.bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name=region_name)
def create_knowledge_base(
self,
name: str,
description: str,
role_arn: str,
vector_store_config: Dict,
embedding_model: str = 'amazon.titan-embed-text-v2:0'
) -> str:
"""创建带向量存储的知识库"""
response = self.bedrock_agent.create_knowledge_base(
name=name,
description=description,
roleArn=role_arn,
knowledgeBaseConfiguration={
'type': 'VECTOR',
'vectorKnowledgeBaseConfiguration': {
'embeddingModelArn': f'arn:aws:bedrock:us-east-1::foundation-model/{embedding_model}'
}
},
storageConfiguration=vector_store_config
)
return response['knowledgeBase']['knowledgeBaseId']
def add_s3_data_source(
self,
knowledge_base_id: str,
name: str,
bucket_arn: str,
inclusion_prefixes: List[str],
chunking_strategy: str = 'FIXED_SIZE',
chunking_config: Optional[Dict] = None
) -> str:
"""添加带分块配置的S3数据源"""
if chunking_config is None:
chunking_config = {
'maxTokens': 512,
'overlapPercentage': 20
}
vector_ingestion_config = {
'chunkingConfiguration': {
'chunkingStrategy': chunking_strategy
}
}
if chunking_strategy == 'FIXED_SIZE':
vector_ingestion_config['chunkingConfiguration']['fixedSizeChunkingConfiguration'] = chunking_config
elif chunking_strategy == 'SEMANTIC':
vector_ingestion_config['chunkingConfiguration']['semanticChunkingConfiguration'] = chunking_config
elif chunking_strategy == 'HIERARCHICAL':
vector_ingestion_config['chunkingConfiguration']['hierarchicalChunkingConfiguration'] = chunking_config
response = self.bedrock_agent.create_data_source(
knowledgeBaseId=knowledge_base_id,
name=name,
description=f'S3 data source: {name}',
dataSourceConfiguration={
'type': 'S3',
's3Configuration': {
'bucketArn': bucket_arn,
'inclusionPrefixes': inclusion_prefixes
}
},
vectorIngestionConfiguration=vector_ingestion_config
)
return response['dataSource']['dataSourceId']
def ingest_data(self, knowledge_base_id: str, data_source_id: str) -> str:
"""启动导入任务并等待完成"""
import time
# 启动导入
response = self.bedrock_agent.start_ingestion_job(
knowledgeBaseId=knowledge_base_id,
dataSourceId=data_source_id,
description='Automated ingestion'
)
job_id = response['ingestionJob']['ingestionJobId']
# 等待完成
while True:
status_response = self.bedrock_agent.get_ingestion_job(
knowledgeBaseId=knowledge_base_id,
dataSourceId=data_source_id,
ingestionJobId=job_id
)
status = status_response['ingestionJob']['status']
if status == 'COMPLETE':
print(f"导入任务成功完成")
if 'statistics' in status_response['ingestionJob']:
stats = status_response['ingestionJob']['statistics']
print(f"已扫描文档数: {stats.get('numberOfDocumentsScanned', 0)}")
break
elif status == 'FAILED':
print(f"导入任务失败")
break
print(f"导入任务状态: {status},等待中...")
time.sleep(30)
return job_id
def query(
self,
knowledge_base_id: str,
query: str,
model_arn: str = 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0',
num_results: int = 5,
search_type: str = 'HYBRID',
metadata_filter: Optional[Dict] = None,
session_id: Optional[str] = None
) -> Dict:
"""通过检索与生成API查询知识库"""
retrieval_config = {
'type': 'KNOWLEDGE_BASE',
'knowledgeBaseConfiguration': {
'knowledgeBaseId': knowledge_base_id,
'modelArn': model_arn,
'retrievalConfiguration': {
'vectorSearchConfiguration': {
'numberOfResults': num_results,
'overrideSearchType': search_type
}
},
'generationConfiguration': {
'inferenceConfig': {
'textInferenceConfig': {
'temperature': 0.7,
'maxTokens': 2048
}
}
}
}
}
# 添加元数据过滤(如果提供)
if metadata_filter:
retrieval_config['knowledgeBaseConfiguration']['retrievalConfiguration']['vectorSearchConfiguration']['filter'] = metadata_filter
# 构建请求
request = {
'input': {'text': query},
'retrieveAndGenerateConfiguration': retrieval_config
}
# 添加会话(如果提供)
if session_id:
request['sessionId'] = session_id
response = self.bedrock_agent_runtime.retrieve_and_generate(**request)
return {
'answer': response['output']['text'],
'citations': response.get('citations', []),
'session_id': response['sessionId']
}
def multi_turn_conversation(
self,
knowledge_base_id: str,
queries: List[str],
model_arn: str = 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0'
) -> List[Dict]:
"""执行带上下文的多轮对话"""
session_id = None
conversation = []
for query in queries:
result = self.query(
knowledge_base_id=knowledge_base_id,
query=query,
model_arn=model_arn,
session_id=session_id
)
session_id = result['session_id']
conversation.append({
'query': query,
'answer': result['answer'],
'citations': result['citations']
})
        return conversation

Example Usage
示例用法

if __name__ == '__main__':
rag = BedrockKnowledgeBaseRAG(region_name='us-east-1')
# Create knowledge base
kb_id = rag.create_knowledge_base(
name='production-docs-kb',
description='Production documentation knowledge base',
role_arn='arn:aws:iam::123456789012:role/BedrockKBRole',
vector_store_config={
'type': 'OPENSEARCH_SERVERLESS',
'opensearchServerlessConfiguration': {
'collectionArn': 'arn:aws:aoss:us-east-1:123456789012:collection/kb-collection',
'vectorIndexName': 'bedrock-kb-index',
'fieldMapping': {
'vectorField': 'bedrock-knowledge-base-default-vector',
'textField': 'AMAZON_BEDROCK_TEXT_CHUNK',
'metadataField': 'AMAZON_BEDROCK_METADATA'
}
}
}
)
# Add data source
ds_id = rag.add_s3_data_source(
knowledge_base_id=kb_id,
name='technical-docs',
bucket_arn='arn:aws:s3:::my-docs-bucket',
inclusion_prefixes=['docs/'],
chunking_strategy='HIERARCHICAL',
chunking_config={
'levelConfigurations': [
{'maxTokens': 1500},
{'maxTokens': 300}
],
'overlapTokens': 60
}
)
# Ingest data
rag.ingest_data(kb_id, ds_id)
# Single query
result = rag.query(
knowledge_base_id=kb_id,
query='What are the best practices for RAG applications?',
metadata_filter={
'equals': {
'key': 'document_type',
'value': 'best_practices'
}
}
)
print(f"Answer: {result['answer']}")
    print("\nSources:")
for citation in result['citations']:
for ref in citation['retrievedReferences']:
print(f" - {ref['location']}")
# Multi-turn conversation
conversation = rag.multi_turn_conversation(
knowledge_base_id=kb_id,
queries=[
'What is hierarchical chunking?',
'When should I use it?',
'What are the configuration parameters?'
]
)
for turn in conversation:
print(f"\nQ: {turn['query']}")
print(f"A: {turn['answer']}")
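The `metadata_filter` in the single-query example above uses one `equals` condition. The Bedrock Retrieve API filter syntax also supports compound operators such as `andAll` and `orAll`, which take a list of at least two sub-filters. A minimal sketch of hypothetical helper functions (not part of boto3) for composing such filters, using assumed metadata keys:

```python
def equals(key, value):
    """Build a single attribute-equality filter clause."""
    return {'equals': {'key': key, 'value': value}}

def and_all(*clauses):
    """Combine clauses with andAll; Bedrock requires at least two
    operands, so a single clause is returned unchanged."""
    return clauses[0] if len(clauses) == 1 else {'andAll': list(clauses)}

# Retrieve only best-practice documents tagged 2025 (hypothetical keys)
combined_filter = and_all(
    equals('document_type', 'best_practices'),
    equals('year', 2025),
)
```

The resulting dict can be passed as-is to the `metadata_filter` parameter of the `query` method above.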
---

if __name__ == '__main__':
rag = BedrockKnowledgeBaseRAG(region_name='us-east-1')
# 创建知识库
kb_id = rag.create_knowledge_base(
name='production-docs-kb',
description='生产文档知识库',
role_arn='arn:aws:iam::123456789012:role/BedrockKBRole',
vector_store_config={
'type': 'OPENSEARCH_SERVERLESS',
'opensearchServerlessConfiguration': {
'collectionArn': 'arn:aws:aoss:us-east-1:123456789012:collection/kb-collection',
'vectorIndexName': 'bedrock-kb-index',
'fieldMapping': {
'vectorField': 'bedrock-knowledge-base-default-vector',
'textField': 'AMAZON_BEDROCK_TEXT_CHUNK',
'metadataField': 'AMAZON_BEDROCK_METADATA'
}
}
}
)
# 添加数据源
ds_id = rag.add_s3_data_source(
knowledge_base_id=kb_id,
name='technical-docs',
bucket_arn='arn:aws:s3:::my-docs-bucket',
inclusion_prefixes=['docs/'],
chunking_strategy='HIERARCHICAL',
chunking_config={
'levelConfigurations': [
{'maxTokens': 1500},
{'maxTokens': 300}
],
'overlapTokens': 60
}
)
# 导入数据
rag.ingest_data(kb_id, ds_id)
    # 单次查询
result = rag.query(
knowledge_base_id=kb_id,
query='RAG应用的最佳实践有哪些?',
metadata_filter={
'equals': {
'key': 'document_type',
'value': 'best_practices'
}
}
)
print(f"答案: {result['answer']}")
    print("\n来源:")
for citation in result['citations']:
for ref in citation['retrievedReferences']:
print(f" - {ref['location']}")
# 多轮对话
conversation = rag.multi_turn_conversation(
knowledge_base_id=kb_id,
queries=[
'什么是分层分块?',
'什么时候应该使用它?',
'配置参数有哪些?'
]
)
for turn in conversation:
print(f"\n问: {turn['query']}")
print(f"答: {turn['answer']}")
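The polling loop inside `ingest_data` above waits indefinitely if the job never reaches a terminal state. A bounded variant can be factored into a small helper; `wait_for_job` below is a hypothetical utility, not part of the Bedrock SDK, and the status strings assume the `GetIngestionJob` response values:

```python
import time

def wait_for_job(get_status, timeout_s=1800, interval_s=30):
    """Poll get_status() until a terminal state or until timeout_s elapses.

    get_status is any zero-argument callable returning the job status
    string (e.g. a hypothetical wrapper around get_ingestion_job).
    """
    terminal = {'COMPLETE', 'FAILED', 'STOPPED'}
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = get_status()
        if status in terminal:
            return status
        time.sleep(interval_s)
    raise TimeoutError(f'ingestion job did not finish within {timeout_s}s')
```

This keeps the boto3 calls out of the retry logic, so the same helper can watch any long-running job.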
---

Related Skills
相关能力
Amazon Bedrock Core Skills
Amazon Bedrock核心能力
- bedrock-guardrails: Content safety, PII redaction, hallucination detection
- bedrock-agents: Agentic workflows with tool use and knowledge bases
- bedrock-flows: Visual workflow builder for generative AI
- bedrock-model-customization: Fine-tuning, reinforcement fine-tuning, distillation
- bedrock-prompt-management: Prompt versioning and deployment
- bedrock-guardrails:内容安全、PII脱敏、幻觉检测
- bedrock-agents:集成工具与知识库的Agent工作流
- bedrock-flows:生成式AI可视化工作流构建器
- bedrock-model-customization:微调、强化微调、模型蒸馏
- bedrock-prompt-management:提示词版本管理与部署
AWS Infrastructure Skills
AWS基础设施能力
- opensearch-serverless: Vector database configuration and management
- neptune-analytics: GraphRAG configuration and queries
- s3-management: S3 bucket configuration for data sources and vectors
- iam-bedrock: IAM roles and policies for Knowledge Bases
- opensearch-serverless:向量数据库配置与管理
- neptune-analytics:GraphRAG配置与查询
- s3-management:用于数据源与向量的S3桶配置
- iam-bedrock:知识库相关的IAM角色与策略
Observability Skills
可观测性能力
- cloudwatch-bedrock-monitoring: Monitor Knowledge Bases metrics and logs
- bedrock-cost-optimization: Track and optimize Knowledge Bases costs
- cloudwatch-bedrock-monitoring:监控知识库指标与日志
- bedrock-cost-optimization:跟踪与优化知识库成本
Additional Resources
额外资源
Official Documentation
官方文档
Best Practices
最佳实践
Research Document
研究文档
- Section 2 (Complete Knowledge Bases research): /mnt/c/data/github/skrillz/AMAZON-BEDROCK-COMPREHENSIVE-RESEARCH-2025.md
- 第2部分(完整知识库研究): /mnt/c/data/github/skrillz/AMAZON-BEDROCK-COMPREHENSIVE-RESEARCH-2025.md