bedrock-knowledge-bases


Amazon Bedrock Knowledge Bases


Amazon Bedrock Knowledge Bases is a fully managed RAG (Retrieval-Augmented Generation) solution that handles data ingestion, embedding generation, vector storage, retrieval with reranking, source attribution, and session context management.

Overview


What It Does


Amazon Bedrock Knowledge Bases provides:
  • Data Ingestion: Automatically process documents from S3, web, Confluence, SharePoint, Salesforce
  • Embedding Generation: Convert text to vectors using foundation models
  • Vector Storage: Store embeddings in multiple vector database options
  • Retrieval: Semantic and hybrid search with metadata filtering
  • Generation: RAG workflows with source attribution
  • Session Management: Multi-turn conversations with context
  • Chunking Strategies: Fixed, semantic, hierarchical, and custom chunking

When to Use This Skill


Use this skill when you need to:
  • Build RAG applications for document Q&A
  • Implement semantic search over enterprise knowledge
  • Create chatbots with knowledge bases
  • Integrate retrieval with Bedrock Agents
  • Configure optimal chunking strategies
  • Query documents with source attribution
  • Manage multi-turn conversations with context
  • Optimize RAG performance and cost

Key Capabilities


  1. Multiple Vector Store Options: OpenSearch, S3 Vectors, Neptune, Pinecone, MongoDB, Redis
  2. Flexible Data Sources: S3, web crawlers, Confluence, SharePoint, Salesforce
  3. Advanced Chunking: Fixed-size, semantic, hierarchical, custom Lambda
  4. Hybrid Search: Combine semantic (vector) and keyword search
  5. Session Management: Built-in conversation context tracking
  6. GraphRAG: Relationship-aware retrieval with Neptune Analytics
  7. Cost Optimization: S3 Vectors for up to 90% storage savings


Quick Start


Basic RAG Workflow


python
import boto3
import json

Initialize clients


python
bedrock_agent = boto3.client('bedrock-agent', region_name='us-east-1')
bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name='us-east-1')

1. Create Knowledge Base


python
kb_response = bedrock_agent.create_knowledge_base(
    name='enterprise-docs-kb',
    description='Company documentation knowledge base',
    roleArn='arn:aws:iam::123456789012:role/BedrockKBRole',
    knowledgeBaseConfiguration={
        'type': 'VECTOR',
        'vectorKnowledgeBaseConfiguration': {
            'embeddingModelArn': 'arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v2:0'
        }
    },
    storageConfiguration={
        'type': 'OPENSEARCH_SERVERLESS',
        'opensearchServerlessConfiguration': {
            'collectionArn': 'arn:aws:aoss:us-east-1:123456789012:collection/kb-collection',
            'vectorIndexName': 'bedrock-knowledge-base-index',
            'fieldMapping': {
                'vectorField': 'bedrock-knowledge-base-default-vector',
                'textField': 'AMAZON_BEDROCK_TEXT_CHUNK',
                'metadataField': 'AMAZON_BEDROCK_METADATA'
            }
        }
    }
)
knowledge_base_id = kb_response['knowledgeBase']['knowledgeBaseId']
print(f"Knowledge Base ID: {knowledge_base_id}")

2. Add S3 Data Source


python
ds_response = bedrock_agent.create_data_source(
    knowledgeBaseId=knowledge_base_id,
    name='s3-documents',
    description='Company documents from S3',
    dataSourceConfiguration={
        'type': 'S3',
        's3Configuration': {
            'bucketArn': 'arn:aws:s3:::my-docs-bucket',
            'inclusionPrefixes': ['documents/']
        }
    },
    vectorIngestionConfiguration={
        'chunkingConfiguration': {
            'chunkingStrategy': 'FIXED_SIZE',
            'fixedSizeChunkingConfiguration': {
                'maxTokens': 512,
                'overlapPercentage': 20
            }
        }
    }
)
data_source_id = ds_response['dataSource']['dataSourceId']

3. Start Ingestion


python
ingestion_response = bedrock_agent.start_ingestion_job(
    knowledgeBaseId=knowledge_base_id,
    dataSourceId=data_source_id,
    description='Initial document ingestion'
)
print(f"Ingestion Job ID: {ingestion_response['ingestionJob']['ingestionJobId']}")
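Ingestion runs asynchronously, so the job should be polled until it reaches a terminal state before the knowledge base is queried. A minimal polling sketch (the helper name and poll interval are our own; `get_ingestion_job` is the corresponding boto3 call):

```python
import time

def wait_for_ingestion(client, kb_id, ds_id, job_id, poll_seconds=30):
    """Poll an ingestion job until it reaches a terminal state."""
    while True:
        job = client.get_ingestion_job(
            knowledgeBaseId=kb_id,
            dataSourceId=ds_id,
            ingestionJobId=job_id
        )['ingestionJob']
        if job['status'] in ('COMPLETE', 'FAILED', 'STOPPED'):
            return job
        time.sleep(poll_seconds)
```

For example, `wait_for_ingestion(bedrock_agent, knowledge_base_id, data_source_id, ingestion_response['ingestionJob']['ingestionJobId'])`.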

4. Query with Retrieve and Generate


python
response = bedrock_agent_runtime.retrieve_and_generate(
    input={
        'text': 'What is our vacation policy?'
    },
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': knowledge_base_id,
            'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0',
            'retrievalConfiguration': {
                'vectorSearchConfiguration': {
                    'numberOfResults': 5,
                    'overrideSearchType': 'HYBRID'
                }
            }
        }
    }
)
print(f"Answer: {response['output']['text']}")
print("\nSources:")
for citation in response['citations']:
    for reference in citation['retrievedReferences']:
        print(f"  - {reference['location']['s3Location']['uri']}")
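Downstream code often needs the citations as data rather than printed text. A small helper that flattens them into (snippet, source URI) pairs, assuming the S3-backed response shape shown above:

```python
def extract_sources(response):
    """Flatten retrieve_and_generate citations into (snippet, source URI) pairs."""
    pairs = []
    for citation in response.get('citations', []):
        for ref in citation.get('retrievedReferences', []):
            snippet = ref.get('content', {}).get('text', '')
            uri = ref.get('location', {}).get('s3Location', {}).get('uri', '')
            pairs.append((snippet, uri))
    return pairs
```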

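The vectorSearchConfiguration also accepts a metadata filter to narrow retrieval. A sketch of the filter shape (the attribute keys department and year are illustrative, not part of the example data):

```python
# Hypothetical filter: only HR documents whose 'year' metadata is >= 2024.
retrieval_config_with_filter = {
    'vectorSearchConfiguration': {
        'numberOfResults': 5,
        'overrideSearchType': 'HYBRID',
        'filter': {
            'andAll': [
                {'equals': {'key': 'department', 'value': 'HR'}},
                {'greaterThanOrEquals': {'key': 'year', 'value': 2024}}
            ]
        }
    }
}
```

Pass this as retrievalConfiguration in retrieve_and_generate or retrieve; operators such as equals, notEquals, greaterThanOrEquals, andAll, and orAll can be combined.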

---

Vector Store Options


1. Amazon OpenSearch Serverless


Best for: Production RAG applications with auto-scaling requirements
Benefits:
  • Fully managed, serverless operation
  • Auto-scaling compute and storage
  • High availability with multi-AZ deployment
  • Fast query performance
Configuration:
python
storageConfiguration={
    'type': 'OPENSEARCH_SERVERLESS',
    'opensearchServerlessConfiguration': {
        'collectionArn': 'arn:aws:aoss:us-east-1:123456789012:collection/kb-collection',
        'vectorIndexName': 'bedrock-knowledge-base-index',
        'fieldMapping': {
            'vectorField': 'bedrock-knowledge-base-default-vector',
            'textField': 'AMAZON_BEDROCK_TEXT_CHUNK',
            'metadataField': 'AMAZON_BEDROCK_METADATA'
        }
    }
}

2. Amazon S3 Vectors (Preview)


Best for: Cost-optimized, large-scale RAG applications
Benefits:
  • Up to 90% cost reduction for vector storage
  • Built-in vector support in S3
  • Subsecond query performance
  • Massive scale and durability
Ideal Use Cases:
  • Large document collections (millions of chunks)
  • Cost-sensitive applications
  • Archival knowledge bases
  • Low-to-medium QPS workloads
Configuration:
python
storageConfiguration={
    'type': 'S3_VECTORS',
    's3VectorsConfiguration': {
        'bucketArn': 'arn:aws:s3:::my-vector-bucket',
        'prefix': 'vectors/'
    }
}
Limitations:
  • Still in preview (no CloudFormation/CDK support yet)
  • Not suitable for high QPS, millisecond-latency requirements
  • Best for cost optimization over ultra-low latency

3. Amazon Neptune Analytics (GraphRAG)


Best for: Interconnected knowledge domains requiring relationship-aware retrieval
Benefits:
  • Automatic graph creation linking related content
  • Improved retrieval accuracy through relationships
  • Comprehensive responses leveraging knowledge graph
  • Explainable results with relationship context
Use Cases:
  • Legal document analysis with case precedents
  • Scientific research with paper citations
  • Product catalogs with dependencies
  • Organizational knowledge with team relationships
Configuration:
python
storageConfiguration={
    'type': 'NEPTUNE_ANALYTICS',
    'neptuneAnalyticsConfiguration': {
        'graphArn': 'arn:aws:neptune-graph:us-east-1:123456789012:graph/g-12345678',
        'vectorSearchConfiguration': {
            'vectorField': 'embedding'
        }
    }
}

4. Amazon OpenSearch Service Managed Cluster


Best for: Existing OpenSearch infrastructure, advanced customization
Configuration:
python
storageConfiguration={
    'type': 'OPENSEARCH_SERVICE',
    'opensearchServiceConfiguration': {
        'clusterArn': 'arn:aws:es:us-east-1:123456789012:domain/my-domain',
        'vectorIndexName': 'bedrock-kb-index',
        'fieldMapping': {
            'vectorField': 'embedding',
            'textField': 'text',
            'metadataField': 'metadata'
        }
    }
}

5. Third-Party Vector Databases


Pinecone:
python
storageConfiguration={
    'type': 'PINECONE',
    'pineconeConfiguration': {
        'connectionString': 'https://my-index-abc123.svc.us-west1-gcp.pinecone.io',
        'credentialsSecretArn': 'arn:aws:secretsmanager:us-east-1:123456789012:secret:pinecone-api-key',
        'namespace': 'bedrock-kb',
        'fieldMapping': {
            'textField': 'text',
            'metadataField': 'metadata'
        }
    }
}
MongoDB Atlas:
python
storageConfiguration={
    'type': 'MONGODB_ATLAS',
    'mongoDbAtlasConfiguration': {
        'endpoint': 'https://cluster0.mongodb.net',
        'credentialsSecretArn': 'arn:aws:secretsmanager:us-east-1:123456789012:secret:mongodb-creds',
        'databaseName': 'bedrock_kb',
        'collectionName': 'vectors',
        'vectorIndexName': 'vector_index',
        'fieldMapping': {
            'vectorField': 'embedding',
            'textField': 'text',
            'metadataField': 'metadata'
        }
    }
}
Redis Enterprise Cloud:
python
storageConfiguration={
    'type': 'REDIS_ENTERPRISE_CLOUD',
    'redisEnterpriseCloudConfiguration': {
        'endpoint': 'redis-12345.c1.us-east-1-2.ec2.cloud.redislabs.com:12345',
        'credentialsSecretArn': 'arn:aws:secretsmanager:us-east-1:123456789012:secret:redis-creds',
        'vectorIndexName': 'bedrock-kb-index',
        'fieldMapping': {
            'vectorField': 'embedding',
            'textField': 'text',
            'metadataField': 'metadata'
        }
    }
}


Data Source Configuration


1. Amazon S3


Supported File Types: PDF, TXT, MD, HTML, DOC, DOCX, CSV, XLS, XLSX
python
bedrock_agent.create_data_source(
    knowledgeBaseId=knowledge_base_id,
    name='s3-technical-docs',
    description='Technical documentation from S3',
    dataSourceConfiguration={
        'type': 'S3',
        's3Configuration': {
            'bucketArn': 'arn:aws:s3:::my-docs-bucket',
            'inclusionPrefixes': ['docs/technical/', 'docs/manuals/'],
            'exclusionPrefixes': ['docs/archive/']
        }
    }
)

2. Web Crawler


Automatic website scraping and indexing:
python
bedrock_agent.create_data_source(
    knowledgeBaseId=knowledge_base_id,
    name='company-website',
    description='Public company website content',
    dataSourceConfiguration={
        'type': 'WEB',
        'webConfiguration': {
            'sourceConfiguration': {
                'urlConfiguration': {
                    'seedUrls': [
                        {'url': 'https://www.example.com/docs'},
                        {'url': 'https://www.example.com/blog'}
                    ]
                }
            },
            'crawlerConfiguration': {
                'crawlerLimits': {
                    'rateLimit': 300  # Pages per minute
                }
            }
        }
    }
)

3. Confluence


python
bedrock_agent.create_data_source(
    knowledgeBaseId=knowledge_base_id,
    name='confluence-wiki',
    description='Company Confluence knowledge base',
    dataSourceConfiguration={
        'type': 'CONFLUENCE',
        'confluenceConfiguration': {
            'sourceConfiguration': {
                'hostUrl': 'https://company.atlassian.net/wiki',
                'hostType': 'SAAS',
                'authType': 'BASIC',
                'credentialsSecretArn': 'arn:aws:secretsmanager:us-east-1:123456789012:secret:confluence-creds'
            },
            'crawlerConfiguration': {
                'filterConfiguration': {
                    'type': 'PATTERN',
                    'patternObjectFilter': {
                        'filters': [
                            {
                                'objectType': 'Space',
                                'inclusionFilters': ['Engineering', 'Product'],
                                'exclusionFilters': ['Archive']
                            }
                        ]
                    }
                }
            }
        }
    }
)

4. SharePoint


python
bedrock_agent.create_data_source(
    knowledgeBaseId=knowledge_base_id,
    name='sharepoint-docs',
    description='SharePoint document library',
    dataSourceConfiguration={
        'type': 'SHAREPOINT',
        'sharePointConfiguration': {
            'sourceConfiguration': {
                'siteUrls': [
                    'https://company.sharepoint.com/sites/Engineering',
                    'https://company.sharepoint.com/sites/Product'
                ],
                'tenantId': 'tenant-id',
                'domain': 'company',
                'authType': 'OAUTH2_CLIENT_CREDENTIALS',
                'credentialsSecretArn': 'arn:aws:secretsmanager:us-east-1:123456789012:secret:sharepoint-creds'
            }
        }
    }
)

5. Salesforce


python
bedrock_agent.create_data_source(
    knowledgeBaseId=knowledge_base_id,
    name='salesforce-knowledge',
    description='Salesforce knowledge articles',
    dataSourceConfiguration={
        'type': 'SALESFORCE',
        'salesforceConfiguration': {
            'sourceConfiguration': {
                'hostUrl': 'https://company.my.salesforce.com',
                'authType': 'OAUTH2_CLIENT_CREDENTIALS',
                'credentialsSecretArn': 'arn:aws:secretsmanager:us-east-1:123456789012:secret:salesforce-creds'
            },
            'crawlerConfiguration': {
                'filterConfiguration': {
                    'type': 'PATTERN',
                    'patternObjectFilter': {
                        'filters': [
                            {
                                'objectType': 'Knowledge',
                                'inclusionFilters': ['Product_Documentation', 'Support_Articles']
                            }
                        ]
                    }
                }
            }
        }
    }
)


Chunking Strategies


1. Fixed-Size Chunking


Best for: Simple documents with uniform structure
How it works: Splits text into chunks of fixed token size with overlap
Parameters:
  • maxTokens: 200-8192 tokens (typically 512-1024)
  • overlapPercentage: 10-50% (typically 20%)
Configuration:
python
vectorIngestionConfiguration={
    'chunkingConfiguration': {
        'chunkingStrategy': 'FIXED_SIZE',
        'fixedSizeChunkingConfiguration': {
            'maxTokens': 512,
            'overlapPercentage': 20
        }
    }
}
Use Cases:
  • Blog posts and articles
  • Technical documentation with consistent formatting
  • FAQs and Q&A content
  • Simple text files
Pros:
  • Fast and predictable
  • No additional costs
  • Easy to tune
Cons:
  • May split semantic units awkwardly
  • Doesn't respect document structure
  • Can break context mid-sentence
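As a mental model of how maxTokens and overlapPercentage interact (whitespace-split words stand in for the service's tokenizer; this is an illustration, not the actual algorithm):

```python
def fixed_size_chunks(text, max_tokens=512, overlap_percentage=20):
    """Split text into word-token chunks of at most max_tokens,
    where consecutive chunks share overlap_percentage of their tokens.
    Whitespace tokens approximate the real tokenizer; sketch only."""
    tokens = text.split()
    overlap = int(max_tokens * overlap_percentage / 100)
    step = max(1, max_tokens - overlap)  # how far each window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(' '.join(tokens[start:start + max_tokens]))
        if start + max_tokens >= len(tokens):
            break
    return chunks
```

With maxTokens=100 and 20% overlap, each chunk shares its last 20 tokens with the start of the next, which is what keeps context from being lost at chunk boundaries.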

2. Semantic Chunking


Best for: Documents without clear boundaries (legal, technical, academic)
How it works: Uses sentence similarity to group related content
Parameters:
  • maxTokens: 20-8192 tokens (typically 300-500)
  • bufferSize: Number of neighboring sentences (default: 1)
  • breakpointPercentileThreshold: Similarity threshold (recommended: 95%)
Configuration:
python
vectorIngestionConfiguration={
    'chunkingConfiguration': {
        'chunkingStrategy': 'SEMANTIC',
        'semanticChunkingConfiguration': {
            'maxTokens': 300,
            'bufferSize': 1,
            'breakpointPercentileThreshold': 95
        }
    }
}
Use Cases:
  • Legal documents and contracts
  • Academic papers
  • Technical specifications
  • Medical records
  • Research reports
Pros:
  • Preserves semantic meaning
  • Better context preservation
  • Improved retrieval accuracy
Cons:
  • Additional cost (foundation model usage)
  • Slower ingestion
  • Less predictable chunk sizes
Cost Consideration: Semantic chunking uses foundation models for similarity analysis, incurring additional costs beyond storage and retrieval.
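The breakpoint-percentile idea can be illustrated with a toy sketch: compute similarity between each pair of adjacent sentences and start a new chunk wherever the similarity falls into the lowest (100 − threshold) percentile. Here `similarity(a, b)` stands in for the embedding-based similarity the service actually computes:

```python
def semantic_chunks(sentences, similarity, breakpoint_percentile=95):
    """Group consecutive sentences; start a new chunk at the sharpest
    similarity drops (toy model of breakpointPercentileThreshold)."""
    if not sentences:
        return []
    if len(sentences) == 1:
        return [sentences[0]]
    # Similarity between each adjacent sentence pair.
    sims = [similarity(a, b) for a, b in zip(sentences, sentences[1:])]
    # Cut value: similarities at or below it are chunk boundaries.
    cut_index = max(0, int(len(sims) * (100 - breakpoint_percentile) / 100))
    cut = sorted(sims)[cut_index]
    chunks, current = [], [sentences[0]]
    for sent, sim in zip(sentences[1:], sims):
        if sim <= cut:
            chunks.append(' '.join(current))
            current = [sent]
        else:
            current.append(sent)
    chunks.append(' '.join(current))
    return chunks
```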

3. Hierarchical Chunking


Best for: Complex documents with nested structure
How it works: Creates parent and child chunks; retrieves child, returns parent for context
Parameters:
  • levelConfigurations: Array of chunk sizes (parent → child)
  • overlapTokens: Overlap between chunks
Configuration:
python
vectorIngestionConfiguration={
    'chunkingConfiguration': {
        'chunkingStrategy': 'HIERARCHICAL',
        'hierarchicalChunkingConfiguration': {
            'levelConfigurations': [
                {
                    'maxTokens': 1500  # Parent chunk (comprehensive context)
                },
                {
                    'maxTokens': 300   # Child chunk (focused retrieval)
                }
            ],
            'overlapTokens': 60
        }
    }
}
Use Cases:
  • Technical manuals with sections and subsections
  • Academic papers with abstract, sections, and subsections
  • Legal documents with articles and clauses
  • Product documentation with categories and details
How Retrieval Works:
  1. Query matches against child chunks (fast, focused)
  2. Returns parent chunks (comprehensive context)
  3. Best of both: precision retrieval + complete context
Pros:
  • Optimal balance of precision and context
  • Excellent for nested documents
  • Better accuracy for complex queries
Cons:
  • More complex configuration
  • Larger storage footprint
  • Requires understanding of document structure
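The child-match, parent-return behavior can be sketched in a few lines (word tokens stand in for real tokens, overlap is omitted, and substring matching stands in for vector search):

```python
def build_hierarchy(tokens, parent_tokens=1500, child_tokens=300):
    """Split into parent windows, then split each parent into child
    windows; each child records its parent's index."""
    parents, children = [], []
    for p_start in range(0, len(tokens), parent_tokens):
        parent = tokens[p_start:p_start + parent_tokens]
        p_idx = len(parents)
        parents.append(' '.join(parent))
        for c_start in range(0, len(parent), child_tokens):
            children.append({
                'text': ' '.join(parent[c_start:c_start + child_tokens]),
                'parent': p_idx,
            })
    return parents, children

def retrieve_parent(query_match, parents, children):
    """Match against small child chunks, but return the large parent
    chunk so the generator sees comprehensive context."""
    best = next(c for c in children if query_match in c['text'])
    return parents[best['parent']]
```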

4. Custom Chunking (Lambda)


Best for: Specialized domain logic, custom parsing requirements
How it works: Invoke Lambda function for custom chunking logic
Configuration:
python
vectorIngestionConfiguration={
    'chunkingConfiguration': {
        'chunkingStrategy': 'NONE'  # Custom via Lambda
    },
    'customTransformationConfiguration': {
        'intermediateStorage': {
            's3Location': {
                'uri': 's3://my-kb-bucket/intermediate/'
            }
        },
        'transformations': [
            {
                'stepToApply': 'POST_CHUNKING',
                'transformationFunction': {
                    'transformationLambdaConfiguration': {
                        'lambdaArn': 'arn:aws:lambda:us-east-1:123456789012:function:custom-chunker'
                    }
                }
            }
        ]
    }
}
Example Lambda Handler:

python
# Lambda function for custom chunking
def lambda_handler(event, context):
    """
    Custom chunking logic for specialized documents.

    Input: event contains document content and metadata.
    Output: array of chunks with text and metadata.
    """
    # Extract document content
    document = event['document']
    content = document['content']
    metadata = document.get('metadata', {})

    # Custom chunking logic (example: split by custom delimiter)
    chunks = []
    sections = content.split('---SECTION---')

    for idx, section in enumerate(sections):
        if section.strip():
            chunks.append({
                'text': section.strip(),
                'metadata': {
                    **metadata,
                    'chunk_id': f'section_{idx}',
                    'chunk_type': 'custom_section'
                }
            })

    return {
        'chunks': chunks
    }

**Use Cases**:
- Medical records with structured sections (SOAP notes)
- Financial documents with tables and calculations
- Code documentation with code blocks and explanations
- Domain-specific formats (HL7, FHIR, etc.)

**Pros**:
- Complete control over chunking logic
- Can handle any document format
- Integrate domain expertise

**Cons**:
- Requires Lambda development and maintenance
- Additional operational complexity
- Harder to debug and iterate
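Before deploying, the handler can be exercised locally with a stubbed event. A minimal sketch, assuming the simplified event shape used in the example above (the real ingestion pipeline's event payload may carry additional fields):

```python
# Minimal local harness for the custom chunking handler (simplified event shape)
def lambda_handler(event, context):
    document = event['document']
    metadata = document.get('metadata', {})
    chunks = []
    for idx, section in enumerate(document['content'].split('---SECTION---')):
        if section.strip():
            chunks.append({
                'text': section.strip(),
                'metadata': {**metadata,
                             'chunk_id': f'section_{idx}',
                             'chunk_type': 'custom_section'}
            })
    return {'chunks': chunks}


# Stubbed event mimicking what the ingestion pipeline would pass
event = {'document': {'content': 'Intro---SECTION---Body---SECTION---',
                      'metadata': {'source': 'test.txt'}}}
result = lambda_handler(event, None)
print(len(result['chunks']))  # 2
```

Running the handler against a handful of representative documents like this catches delimiter and metadata bugs before an ingestion job does.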
```python
# Lambda function for custom chunking
import json


def lambda_handler(event, context):
    """
    Custom chunking logic for specialized documents.

    Input: event contains document content and metadata
    Output: array of chunks with text and metadata
    """
    # Extract document content
    document = event['document']
    content = document['content']
    metadata = document.get('metadata', {})

    # Custom chunking logic (example: split by custom delimiter)
    chunks = []
    sections = content.split('---SECTION---')

    for idx, section in enumerate(sections):
        if section.strip():
            chunks.append({
                'text': section.strip(),
                'metadata': {
                    **metadata,
                    'chunk_id': f'section_{idx}',
                    'chunk_type': 'custom_section'
                }
            })

    return {
        'chunks': chunks
    }
```

**适用场景**:
- 带结构化章节的医疗记录(SOAP病历)
- 带表格与计算的财务文档
- 带代码块与解释的代码文档
- 领域特定格式(HL7、FHIR等)

**优点**:
- 完全控制分块逻辑
- 可处理任意文档格式
- 集成领域专业知识

**缺点**:
- 需要Lambda开发与维护
- 额外的运维复杂度
- 调试与迭代难度更高

Chunking Strategy Selection Guide

分块策略选择指南

| Document Type | Recommended Strategy | Rationale |
| --- | --- | --- |
| Blog posts, articles | Fixed-size | Simple, uniform structure |
| Legal documents | Semantic | Preserve legal reasoning flow |
| Technical manuals | Hierarchical | Nested sections and subsections |
| Academic papers | Hierarchical | Abstract, sections, subsections |
| FAQs | Fixed-size | Independent Q&A pairs |
| Medical records | Custom Lambda | Structured sections (SOAP, HL7) |
| Code documentation | Custom Lambda | Code blocks + explanations |
| Product catalogs | Fixed-size | Uniform product descriptions |
| Research reports | Semantic | Preserve research narrative |

| 文档类型 | 推荐策略 | 理由 |
| --- | --- | --- |
| 博客文章、普通文章 | 固定大小分块 | 结构简单统一 |
| 法律文档 | 语义分块 | 保留法律推理逻辑 |
| 技术手册 | 分层分块 | 带嵌套章节与小节 |
| 学术论文 | 分层分块 | 含摘要、章节与小节 |
| FAQ | 固定大小分块 | 独立的问答对 |
| 医疗记录 | 自定义Lambda分块 | 结构化章节(SOAP、HL7) |
| 代码文档 | 自定义Lambda分块 | 代码块+解释内容 |
| 产品目录 | 固定大小分块 | 统一的产品描述 |
| 研究报告 | 语义分块 | 保留研究叙事逻辑 |
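The selection guide above can be encoded as a simple lookup that defaults to fixed-size chunking for unknown document types. A sketch; the key names are illustrative, not an official taxonomy:

```python
# Illustrative lookup derived from the selection guide above
STRATEGY = {
    'blog_post': 'FIXED_SIZE',
    'legal_document': 'SEMANTIC',
    'technical_manual': 'HIERARCHICAL',
    'academic_paper': 'HIERARCHICAL',
    'faq': 'FIXED_SIZE',
    'medical_record': 'CUSTOM_LAMBDA',
    'code_documentation': 'CUSTOM_LAMBDA',
    'product_catalog': 'FIXED_SIZE',
    'research_report': 'SEMANTIC',
}


def recommended_strategy(doc_type):
    # Fixed-size is the safe, cheap default for unrecognized types
    return STRATEGY.get(doc_type, 'FIXED_SIZE')


print(recommended_strategy('legal_document'))  # SEMANTIC
```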

Retrieval Operations

检索操作

1. Retrieve API (Retrieval Only)

1. Retrieve API(仅检索)

Returns raw retrieved chunks without generation.
Use Cases:
  • Custom generation logic
  • Debugging retrieval quality
  • Building custom RAG pipelines
  • Integrating with non-Bedrock models
python
bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name='us-east-1')

response = bedrock_agent_runtime.retrieve(
    knowledgeBaseId='KB123456',
    retrievalQuery={
        'text': 'What are the benefits of hierarchical chunking?'
    },
    retrievalConfiguration={
        'vectorSearchConfiguration': {
            'numberOfResults': 5,
            'overrideSearchType': 'HYBRID',  # SEMANTIC, HYBRID
            'filter': {
                'andAll': [
                    {
                        'equals': {
                            'key': 'document_type',
                            'value': 'technical_guide'
                        }
                    },
                    {
                        'greaterThan': {
                            'key': 'publish_year',
                            'value': 2024
                        }
                    }
                ]
            }
        }
    }
)
返回原始检索分块,不包含生成内容。
适用场景:
  • 自定义生成逻辑
  • 调试检索质量
  • 构建自定义RAG流水线
  • 与非Bedrock模型集成
python
bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name='us-east-1')

response = bedrock_agent_runtime.retrieve(
    knowledgeBaseId='KB123456',
    retrievalQuery={
        'text': 'What are the benefits of hierarchical chunking?'
    },
    retrievalConfiguration={
        'vectorSearchConfiguration': {
            'numberOfResults': 5,
            'overrideSearchType': 'HYBRID',  # SEMANTIC, HYBRID
            'filter': {
                'andAll': [
                    {
                        'equals': {
                            'key': 'document_type',
                            'value': 'technical_guide'
                        }
                    },
                    {
                        'greaterThan': {
                            'key': 'publish_year',
                            'value': 2024
                        }
                    }
                ]
            }
        }
    }
)

Process retrieved chunks

```python
for result in response['retrievalResults']:
    print(f"Score: {result['score']}")
    print(f"Content: {result['content']['text']}")
    print(f"Location: {result['location']}")
    print(f"Metadata: {result.get('metadata', {})}")
    print("---")
```

2. Retrieve and Generate API (RAG)

2. Retrieve and Generate API(RAG)

Returns generated response with source attribution.
Use Cases:
  • Complete RAG workflows
  • Question answering
  • Document summarization
  • Chatbots with knowledge bases
python
response = bedrock_agent_runtime.retrieve_and_generate(
    input={
        'text': 'Explain semantic chunking benefits and when to use it'
    },
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': 'KB123456',
            'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0',
            'retrievalConfiguration': {
                'vectorSearchConfiguration': {
                    'numberOfResults': 5,
                    'overrideSearchType': 'HYBRID'
                }
            },
            'generationConfiguration': {
                'inferenceConfig': {
                    'textInferenceConfig': {
                        'temperature': 0.7,
                        'maxTokens': 2048,
                        'topP': 0.9
                    }
                },
                'promptTemplate': {
                    'textPromptTemplate': '''You are a helpful assistant. Answer the user's question based on the provided context.

Context: $search_results$

Question: $query$

Answer:'''
                }
            }
        }
    }
)

print(f"Generated Response: {response['output']['text']}")
print(f"\nSources:")
for citation in response['citations']:
    for reference in citation['retrievedReferences']:
        print(f"  - {reference['location']}")
        print(f"    Relevance Score: {reference.get('score', 'N/A')}")
返回带来源归因的生成响应。
适用场景:
  • 完整RAG工作流
  • 问答场景
  • 文档摘要
  • 集成知识库的聊天机器人
python
response = bedrock_agent_runtime.retrieve_and_generate(
    input={
        'text': 'Explain semantic chunking benefits and when to use it'
    },
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': 'KB123456',
            'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0',
            'retrievalConfiguration': {
                'vectorSearchConfiguration': {
                    'numberOfResults': 5,
                    'overrideSearchType': 'HYBRID'
                }
            },
            'generationConfiguration': {
                'inferenceConfig': {
                    'textInferenceConfig': {
                        'temperature': 0.7,
                        'maxTokens': 2048,
                        'topP': 0.9
                    }
                },
                'promptTemplate': {
                    'textPromptTemplate': '''You are a helpful assistant. Answer the user's question based on the provided context.

Context: $search_results$

Question: $query$

Answer:'''
                }
            }
        }
    }
)

print(f"Generated Response: {response['output']['text']}")
print(f"\nSources:")
for citation in response['citations']:
    for reference in citation['retrievedReferences']:
        print(f"  - {reference['location']}")
        print(f"    Relevance Score: {reference.get('score', 'N/A')}")
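The citation loop above can be folded into a small formatter that deduplicates source URIs. A sketch, assuming the response shape shown in the example (the `s3Location` key and sample URI here are illustrative stand-ins):

```python
# Illustrative: format a retrieve_and_generate-style response with its sources
def format_with_sources(output_text, citations):
    lines = [output_text, '', 'Sources:']
    seen = set()
    for citation in citations:
        for ref in citation.get('retrievedReferences', []):
            uri = ref.get('location', {}).get('s3Location', {}).get('uri', 'unknown')
            if uri not in seen:  # list each source document once
                seen.add(uri)
                lines.append(f'- {uri}')
    return '\n'.join(lines)


# Stubbed response mirroring the citations structure above
resp = {
    'output': {'text': 'Semantic chunking preserves meaning.'},
    'citations': [{'retrievedReferences': [
        {'location': {'s3Location': {'uri': 's3://kb/docs/chunking.md'}}}
    ]}]
}
formatted = format_with_sources(resp['output']['text'], resp['citations'])
print(formatted)
```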

3. Multi-Turn Conversations with Session Management

3. 带会话管理的多轮对话

Bedrock automatically manages conversation context across turns.
Bedrock会自动管理多轮对话的上下文。

First turn - creates session automatically

第一轮 - 自动创建会话

```python
# First turn - creates session automatically
response1 = bedrock_agent_runtime.retrieve_and_generate(
    input={'text': 'What is Amazon Bedrock Knowledge Bases?'},
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': 'KB123456',
            'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0'
        }
    }
)

session_id = response1['sessionId']
print(f"Session ID: {session_id}")
print(f"Response: {response1['output']['text']}\n")
```

Follow-up turn - reuse session for context

后续轮次 - 复用会话以保留上下文

```python
response2 = bedrock_agent_runtime.retrieve_and_generate(
    input={'text': 'What chunking strategies does it support?'},
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': 'KB123456',
            'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0'
        }
    },
    sessionId=session_id  # Continue conversation with context
)

print(f"Follow-up Response: {response2['output']['text']}")
```

Third turn

第三轮

```python
response3 = bedrock_agent_runtime.retrieve_and_generate(
    input={'text': 'Which strategy would you recommend for legal documents?'},
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': 'KB123456',
            'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0'
        }
    },
    sessionId=session_id
)

print(f"Third Response: {response3['output']['text']}")
```

4. Advanced Metadata Filtering

4. 高级元数据过滤

Filter retrieval by metadata attributes for precision.
python
response = bedrock_agent_runtime.retrieve(
    knowledgeBaseId='KB123456',
    retrievalQuery={
        'text': 'Security best practices for production deployments'
    },
    retrievalConfiguration={
        'vectorSearchConfiguration': {
            'numberOfResults': 10,
            'overrideSearchType': 'HYBRID',
            'filter': {
                'andAll': [
                    {
                        'equals': {
                            'key': 'document_type',
                            'value': 'security_guide'
                        }
                    },
                    {
                        'greaterThanOrEquals': {
                            'key': 'publish_year',
                            'value': 2024
                        }
                    },
                    {
                        'in': {
                            'key': 'category',
                            'value': ['production', 'security', 'compliance']
                        }
                    }
                ]
            }
        }
    }
)
Supported Filter Operators:
  • `equals`: Exact match
  • `notEquals`: Not equal
  • `greaterThan`, `greaterThanOrEquals`: Numeric comparison
  • `lessThan`, `lessThanOrEquals`: Numeric comparison
  • `in`: Match any value in array
  • `notIn`: Not match any value in array
  • `startsWith`: String prefix match
  • `andAll`: Combine filters with AND
  • `orAll`: Combine filters with OR
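Deeply nested filter dictionaries are easy to mistype. A few helper constructors, hypothetical and not part of any SDK, can compose the same shapes shown in the example above:

```python
# Hypothetical helpers for composing Bedrock KB retrieval filters
def eq(key, value):
    return {'equals': {'key': key, 'value': value}}


def gte(key, value):
    return {'greaterThanOrEquals': {'key': key, 'value': value}}


def in_any(key, values):
    return {'in': {'key': key, 'value': values}}


def and_all(*filters):
    return {'andAll': list(filters)}


# Reproduces the filter from the example above
retrieval_filter = and_all(
    eq('document_type', 'security_guide'),
    gte('publish_year', 2024),
    in_any('category', ['production', 'security', 'compliance']),
)
print(retrieval_filter['andAll'][0])
```

The resulting dict can be passed as the `filter` value inside `vectorSearchConfiguration`.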

通过元数据属性过滤检索结果以提升精准度。
python
response = bedrock_agent_runtime.retrieve(
    knowledgeBaseId='KB123456',
    retrievalQuery={
        'text': 'Security best practices for production deployments'
    },
    retrievalConfiguration={
        'vectorSearchConfiguration': {
            'numberOfResults': 10,
            'overrideSearchType': 'HYBRID',
            'filter': {
                'andAll': [
                    {
                        'equals': {
                            'key': 'document_type',
                            'value': 'security_guide'
                        }
                    },
                    {
                        'greaterThanOrEquals': {
                            'key': 'publish_year',
                            'value': 2024
                        }
                    },
                    {
                        'in': {
                            'key': 'category',
                            'value': ['production', 'security', 'compliance']
                        }
                    }
                ]
            }
        }
    }
)
支持的过滤操作符:
  • `equals`: 精确匹配
  • `notEquals`: 不匹配
  • `greaterThan`、`greaterThanOrEquals`: 数值比较
  • `lessThan`、`lessThanOrEquals`: 数值比较
  • `in`: 匹配数组中的任意值
  • `notIn`: 不匹配数组中的任意值
  • `startsWith`: 字符串前缀匹配
  • `andAll`: 用AND组合多个过滤条件
  • `orAll`: 用OR组合多个过滤条件

Ingestion Management

导入管理

1. Start Ingestion Job

1. 启动导入任务

python
ingestion_response = bedrock_agent.start_ingestion_job(
    knowledgeBaseId=knowledge_base_id,
    dataSourceId=data_source_id,
    description='Monthly document sync',
    clientToken='unique-idempotency-token-123'
)

job_id = ingestion_response['ingestionJob']['ingestionJobId']
print(f"Ingestion Job ID: {job_id}")
python
ingestion_response = bedrock_agent.start_ingestion_job(
    knowledgeBaseId=knowledge_base_id,
    dataSourceId=data_source_id,
    description='Monthly document sync',
    clientToken='unique-idempotency-token-123'
)

job_id = ingestion_response['ingestionJob']['ingestionJobId']
print(f"Ingestion Job ID: {job_id}")

2. Monitor Ingestion Job

2. 监控导入任务


Get job status

获取任务状态

```python
# Get job status
job_status = bedrock_agent.get_ingestion_job(
    knowledgeBaseId=knowledge_base_id,
    dataSourceId=data_source_id,
    ingestionJobId=job_id
)

print(f"Status: {job_status['ingestionJob']['status']}")
print(f"Started: {job_status['ingestionJob']['startedAt']}")
print(f"Updated: {job_status['ingestionJob']['updatedAt']}")

if 'statistics' in job_status['ingestionJob']:
    stats = job_status['ingestionJob']['statistics']
    print(f"Documents Scanned: {stats['numberOfDocumentsScanned']}")
    print(f"Documents Indexed: {stats['numberOfDocumentsIndexed']}")
    print(f"Documents Failed: {stats['numberOfDocumentsFailed']}")
```

Wait for completion

等待任务完成

```python
import time

# Poll until the ingestion job reaches a terminal state
while True:
    status = bedrock_agent.get_ingestion_job(
        knowledgeBaseId=knowledge_base_id,
        dataSourceId=data_source_id,
        ingestionJobId=job_id
    )
    current_status = status['ingestionJob']['status']

    if current_status in ['COMPLETE', 'FAILED']:
        print(f"Ingestion job {current_status}")
        break

    print(f"Status: {current_status}, waiting...")
    time.sleep(30)
```
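The polling loop above generalizes to a reusable helper. A sketch with the status lookup injected as a callable, so the logic can be tested without AWS calls (in real use the callable would wrap `get_ingestion_job`):

```python
import time


def wait_for_ingestion(get_status, poll_seconds=30, max_polls=120):
    """Poll a status callable until it reports COMPLETE or FAILED (sketch)."""
    for _ in range(max_polls):
        status = get_status()
        if status in ('COMPLETE', 'FAILED'):
            return status
        time.sleep(poll_seconds)
    raise TimeoutError('ingestion job did not finish in time')


# Stubbed demo: a sequence of statuses an ingestion job might report
statuses = iter(['STARTING', 'IN_PROGRESS', 'COMPLETE'])
result_status = wait_for_ingestion(lambda: next(statuses), poll_seconds=0)
print(result_status)  # COMPLETE
```

Bounding the number of polls avoids a hung loop if a job stalls in a non-terminal state.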

3. List Ingestion Jobs

3. 列出导入任务

python
list_response = bedrock_agent.list_ingestion_jobs(
    knowledgeBaseId=knowledge_base_id,
    dataSourceId=data_source_id,
    maxResults=50
)

for job in list_response['ingestionJobSummaries']:
    print(f"Job ID: {job['ingestionJobId']}")
    print(f"Status: {job['status']}")
    print(f"Started: {job['startedAt']}")
    print(f"Updated: {job['updatedAt']}")
    print("---")

python
list_response = bedrock_agent.list_ingestion_jobs(
    knowledgeBaseId=knowledge_base_id,
    dataSourceId=data_source_id,
    maxResults=50
)

for job in list_response['ingestionJobSummaries']:
    print(f"Job ID: {job['ingestionJobId']}")
    print(f"Status: {job['status']}")
    print(f"Started: {job['startedAt']}")
    print(f"Updated: {job['updatedAt']}")
    print("---")

Integration with Bedrock Agents

与Bedrock Agent集成

1. Agent with Knowledge Base Action

1. 集成知识库的Agent

```python
bedrock_agent = boto3.client('bedrock-agent', region_name='us-east-1')
```

Create agent with knowledge base

创建集成知识库的Agent

```python
# Create agent with knowledge base access
agent_response = bedrock_agent.create_agent(
    agentName='customer-support-agent',
    description='Customer support agent with knowledge base access',
    instruction='''You are a customer support agent. When answering questions:
1. Search the knowledge base for relevant information
2. Provide accurate answers based on retrieved context
3. Cite your sources
4. Admit when you don't know something''',
    foundationModel='anthropic.claude-3-sonnet-20240229-v1:0',
    agentResourceRoleArn='arn:aws:iam::123456789012:role/BedrockAgentRole'
)

agent_id = agent_response['agent']['agentId']
```

Associate knowledge base with agent

关联知识库与Agent

```python
# Associate the knowledge base with the agent
kb_association = bedrock_agent.associate_agent_knowledge_base(
    agentId=agent_id,
    agentVersion='DRAFT',
    knowledgeBaseId='KB123456',
    description='Company documentation knowledge base',
    knowledgeBaseState='ENABLED'
)
```

Prepare and create alias

准备并创建别名

```python
# Prepare the agent and create a production alias
bedrock_agent.prepare_agent(agentId=agent_id)

alias_response = bedrock_agent.create_agent_alias(
    agentId=agent_id,
    agentAliasName='production',
    description='Production alias'
)
agent_alias_id = alias_response['agentAlias']['agentAliasId']
```

Invoke agent (automatically queries knowledge base)

调用Agent(自动查询知识库)

```python
bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name='us-east-1')

response = bedrock_agent_runtime.invoke_agent(
    agentId=agent_id,
    agentAliasId=agent_alias_id,
    sessionId='session-123',
    inputText='What is our return policy for defective products?'
)

# Stream the agent's response
for event in response['completion']:
    if 'chunk' in event:
        chunk = event['chunk']
        print(chunk['bytes'].decode())
```

2. Agent with Multiple Knowledge Bases

2. 集成多知识库的Agent


Associate multiple knowledge bases

关联多个知识库

```python
# Associate multiple knowledge bases
bedrock_agent.associate_agent_knowledge_base(
    agentId=agent_id,
    agentVersion='DRAFT',
    knowledgeBaseId='KB-PRODUCT-DOCS',
    description='Product documentation'
)

bedrock_agent.associate_agent_knowledge_base(
    agentId=agent_id,
    agentVersion='DRAFT',
    knowledgeBaseId='KB-SUPPORT-ARTICLES',
    description='Support knowledge articles'
)

bedrock_agent.associate_agent_knowledge_base(
    agentId=agent_id,
    agentVersion='DRAFT',
    knowledgeBaseId='KB-COMPANY-POLICIES',
    description='Company policies and procedures'
)
```

Agent automatically searches all knowledge bases and combines results

Agent会自动搜索所有知识库并合并结果


---

---

Best Practices

最佳实践

1. Chunking Strategy Selection

1. 分块策略选择

Decision Framework:
  1. Simple, uniform documents → Fixed-size chunking
    • Blog posts, articles, simple FAQs
    • Fast, predictable, cost-effective
  2. Documents without clear boundaries → Semantic chunking
    • Legal documents, contracts, academic papers
    • Preserves semantic meaning, better accuracy
    • Consider additional cost
  3. Nested, hierarchical documents → Hierarchical chunking
    • Technical manuals, product docs, research papers
    • Best balance of precision and context
    • Optimal for complex structures
  4. Specialized formats → Custom Lambda chunking
    • Medical records (HL7, FHIR), code docs, custom formats
    • Complete control, domain expertise
    • Higher operational complexity
Tuning Guidelines:
  • Fixed-size: Start with 512 tokens, 20% overlap
  • Semantic: Start with 300 tokens, bufferSize=1, threshold=95%
  • Hierarchical: Parent 1500 tokens, child 300 tokens, overlap 60 tokens
  • Custom: Test extensively with domain experts
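To build intuition for the fixed-size defaults (512 tokens, 20% overlap), here is an illustrative chunker that treats words as tokens; Bedrock's actual tokenization and splitting differ:

```python
# Illustrative fixed-size chunking with overlap (words stand in for tokens)
def fixed_size_chunks(words, max_tokens=512, overlap_pct=20):
    # Each chunk advances by (max_tokens - overlap): 512 - 102 = 410 here
    step = max_tokens - int(max_tokens * overlap_pct / 100)
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + max_tokens])
        if start + max_tokens >= len(words):
            break
    return chunks


words = [f'w{i}' for i in range(1000)]
chunks = fixed_size_chunks(words)
print(len(chunks), chunks[1][0])  # 3 w410
```

The overlap means the last ~100 words of one chunk reappear at the start of the next, so sentences near chunk boundaries are still retrievable in context.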
决策框架:
  1. 简单、统一结构文档 → 固定大小分块
    • 博客文章、普通文章、简单FAQ
    • 快速、可预测、成本低
  2. 无明确边界的文档 → 语义分块
    • 法律文档、合同、学术论文
    • 保留语义含义,准确性更高
    • 需考虑额外成本
  3. 嵌套、分层结构文档 → 分层分块
    • 技术手册、产品文档、研究论文
    • 精准度与上下文的最优平衡
    • 适用于复杂结构
  4. 专业格式文档 → 自定义Lambda分块
    • 医疗记录(HL7、FHIR)、代码文档、自定义格式
    • 完全可控,集成领域知识
    • 运维复杂度更高
调优指南:
  • 固定大小分块:从512 token、20%重叠开始
  • 语义分块:从300 token、bufferSize=1、阈值95%开始
  • 分层分块:父分块1500 token、子分块300 token、重叠60 token
  • 自定义分块:与领域专家协作进行充分测试

2. Retrieval Optimization

2. 检索优化

Number of Results:
  • Start with 5-10 results
  • Increase if answers lack detail
  • Decrease if too much noise
Search Type:
  • SEMANTIC: Pure vector similarity (faster, good for conceptual queries)
  • HYBRID: Vector + keyword (better recall, recommended for production)
Use Hybrid Search when:
  • Queries contain specific terms or names
  • Need to match exact keywords
  • Domain has specialized vocabulary
Use Semantic Search when:
  • Purely conceptual queries
  • Prioritizing speed over perfect recall
  • Well-embedded domain knowledge
Metadata Filters:
  • Always use when applicable
  • Dramatically improves precision
  • Reduces retrieval latency
  • Examples: document_type, publish_date, category, author
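Conceptually, hybrid search blends vector similarity with keyword relevance. An illustrative weighted combination; Bedrock's internal fusion method is not published, so the formula and `alpha` here are assumptions for intuition only:

```python
# Illustrative score fusion; Bedrock's actual hybrid ranking is internal
def hybrid_score(vector_score, keyword_score, alpha=0.7):
    """Weighted blend of semantic (vector) and keyword relevance."""
    return alpha * vector_score + (1 - alpha) * keyword_score


# A doc that is semantically close but weak on exact keywords
print(round(hybrid_score(0.9, 0.5), 2))  # 0.78
```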
结果数量:
  • 从5-10个结果开始
  • 若答案缺乏细节则增加数量
  • 若结果噪音过多则减少数量
搜索类型:
  • SEMANTIC:纯向量相似度(更快,适用于概念类查询)
  • HYBRID:向量+关键词(召回率更高,生产环境推荐)
使用混合搜索的场景:
  • 查询包含特定术语或名称
  • 需要匹配精确关键词
  • 领域含专业词汇
使用语义搜索的场景:
  • 纯概念类查询
  • 优先考虑速度而非完美召回率
  • 领域知识已充分嵌入
元数据过滤:
  • 适用时务必使用
  • 大幅提升精准度
  • 降低检索延迟
  • 示例:document_type、publish_date、category、author

3. Cost Optimization

3. 成本优化

S3 Vectors:
  • Use for large-scale knowledge bases (millions of chunks)
  • Up to 90% cost savings vs. OpenSearch
  • Ideal for cost-sensitive applications
  • Trade-off: Slightly higher latency
Semantic Chunking:
  • Incurs foundation model costs during ingestion
  • Consider cost vs. accuracy benefit
  • May not be worth it for simple documents
  • Best for complex, high-value content
Ingestion Frequency:
  • Schedule ingestion during off-peak hours
  • Use incremental updates when possible
  • Don't re-ingest unchanged documents
Model Selection:
  • Use smaller embedding models when accuracy permits
  • Titan Embed Text v2 is cost-effective
  • Consider Cohere Embed for multilingual
Token Usage:
  • Monitor generation token usage
  • Set appropriate maxTokens limits
  • Use prompt templates to control verbosity
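Embedding cost at ingestion scales with chunk count and chunk size. A back-of-envelope estimator; the price below is a placeholder, not a real rate, so check current Bedrock pricing:

```python
# Parameterized ingestion-cost sketch (price is a placeholder, not real pricing)
def embedding_cost(num_chunks, avg_tokens_per_chunk, price_per_1k_tokens):
    total_tokens = num_chunks * avg_tokens_per_chunk
    return total_tokens / 1000 * price_per_1k_tokens


# 1M chunks of ~512 tokens at a hypothetical $0.00002 per 1K tokens
print(round(embedding_cost(1_000_000, 512, 0.00002), 2))  # 10.24
```

The same arithmetic shows why re-ingesting unchanged documents is wasteful: cost is linear in total tokens embedded.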
S3 Vectors:
  • 适用于大规模知识库(数百万分块)
  • 相比OpenSearch节省高达90%的成本
  • 适用于对成本敏感的应用
  • 权衡点:延迟略高
语义分块:
  • 导入阶段会产生基础模型费用
  • 需权衡成本与准确性收益
  • 简单文档可能不值得额外成本
  • 适用于复杂、高价值内容
导入频率:
  • 在非高峰时段调度导入任务
  • 尽可能使用增量更新
  • 不要重新导入未修改的文档
模型选择:
  • 在准确性允许的情况下使用更小的嵌入模型
  • Titan Embed Text v2性价比高
  • 多语言场景可考虑Cohere Embed
Token使用:
  • 监控生成token的使用量
  • 设置合理的maxTokens限制
  • 使用提示模板控制输出冗长程度

4. Session Management

4. 会话管理

Always Reuse Sessions:
  • Pass `sessionId` for follow-up turns
  • Bedrock handles context automatically
  • No manual conversation history needed
Session Lifecycle:
  • Sessions expire after inactivity (default: 60 minutes)
  • Create new session for unrelated conversations
  • Use unique sessionId per user/conversation
Context Limits:
  • Monitor conversation length
  • Long sessions may hit context limits
  • Consider summarization for very long conversations
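A per-user session registry keeps `sessionId` reuse consistent with the 60-minute inactivity window. A minimal in-memory sketch; production code would persist this and handle concurrency:

```python
import time
import uuid


class SessionStore:
    """Illustrative per-user session tracking with inactivity expiry."""

    def __init__(self, ttl_seconds=3600):  # mirrors the 60-minute default
        self.ttl = ttl_seconds
        self.sessions = {}  # user_id -> (session_id, last_used)

    def get(self, user_id, now=None):
        now = time.time() if now is None else now
        entry = self.sessions.get(user_id)
        if entry and now - entry[1] < self.ttl:
            # Still active: refresh the timestamp and reuse the session
            self.sessions[user_id] = (entry[0], now)
            return entry[0]
        # Expired or unknown: mint a fresh session id
        session_id = str(uuid.uuid4())
        self.sessions[user_id] = (session_id, now)
        return session_id


store = SessionStore()
s1 = store.get('alice', now=0)
s2 = store.get('alice', now=100)    # within TTL: same session
s3 = store.get('alice', now=5000)   # expired: new session
print(s1 == s2, s1 == s3)  # True False
```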
务必复用会话:
  • 后续轮次传递 `sessionId`
  • Bedrock自动处理上下文
  • 无需手动维护对话历史
会话生命周期:
  • 会话在闲置后过期(默认:60分钟)
  • 无关对话创建新会话
  • 为每个用户/对话使用唯一的sessionId
上下文限制:
  • 监控对话长度
  • 长会话可能触发上下文限制
  • 超长对话可考虑摘要处理

5. GraphRAG with Neptune

5. 基于Neptune的GraphRAG

When to Use:
  • Interconnected knowledge domains
  • Relationship-aware queries
  • Need for explainability
  • Complex knowledge graphs
Benefits:
  • Automatic graph creation
  • Improved accuracy through relationships
  • Comprehensive answers
  • Explainable results
Considerations:
  • Higher setup complexity
  • Neptune Analytics costs
  • Best for domains with rich relationships
适用场景:
  • 互联知识领域
  • 关系感知类查询
  • 需要可解释性
  • 复杂知识图谱
优势:
  • 自动创建图谱
  • 通过关系提升准确性
  • 生成全面回答
  • 结果可解释
注意事项:
  • 配置复杂度更高
  • Neptune Analytics有额外成本
  • 适用于关系丰富的领域

6. Data Source Management

6. 数据源管理

S3 Best Practices:
  • Organize with clear prefixes
  • Use inclusion/exclusion filters
  • Maintain consistent metadata
  • Version documents when updating
Web Crawler:
  • Set appropriate rate limits
  • Use robots.txt for guidance
  • Monitor for broken links
  • Schedule regular re-crawls
Confluence/SharePoint:
  • Filter by spaces/sites
  • Exclude archived content
  • Use fine-grained permissions
  • Schedule incremental syncs
Metadata Enrichment:
  • Add custom metadata to documents
  • Include: document_type, publish_date, category, author, version
  • Enables powerful filtering
  • Improves retrieval precision
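For S3 data sources, custom metadata is supplied via a sidecar JSON file stored alongside the document, named `<object-key>.metadata.json`. A sketch of building one; the attribute names follow the filter examples in this document, and the object key is illustrative:

```python
import json

# Sidecar metadata file for an S3 document: "<object-key>.metadata.json"
metadata = {
    'metadataAttributes': {
        'document_type': 'technical_guide',
        'publish_year': 2024,
        'category': 'security',
        'author': 'docs-team'
    }
}

sidecar_name = 'security-guide.pdf.metadata.json'  # assumed object key
payload = json.dumps(metadata, indent=2)
print(sidecar_name)
```

After ingestion, these attributes become filterable via the `filter` block in retrieval configuration.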
S3最佳实践:
  • 用清晰的前缀组织内容
  • 使用包含/排除过滤
  • 保持元数据一致性
  • 更新文档时使用版本控制
网页爬虫:
  • 设置合理的速率限制
  • 参考robots.txt
  • 监控失效链接
  • 定期调度重新抓取
Confluence/SharePoint:
  • 按空间/站点过滤
  • 排除归档内容
  • 使用细粒度权限
  • 调度增量同步
元数据增强:
  • 为文档添加自定义元数据
  • 包含:document_type、publish_date、category、author、version
  • 实现强大的过滤能力
  • 提升检索精准度

7. Monitoring and Debugging

7. 监控与调试

Enable CloudWatch Logs:
启用CloudWatch日志:

Monitor retrieval quality

监控检索质量

Track: query latency, retrieval scores, generation quality

跟踪:查询延迟、检索分数、生成质量

Set alarms for: high latency, low scores, high error rates

设置告警:高延迟、低分数、高错误率


**Test Retrieval Quality**:

**测试检索质量**:

```python
# Use retrieve API to debug retrieval quality
response = bedrock_agent_runtime.retrieve(
    knowledgeBaseId='KB123456',
    retrievalQuery={'text': 'test query'}
)

# Analyze retrieval scores
for result in response['retrievalResults']:
    print(f"Score: {result['score']}")
    print(f"Content preview: {result['content']['text'][:200]}")
```

**Common Issues**:

1. **Low Retrieval Scores**:
   - Check chunking strategy
   - Verify embedding model
   - Ensure documents are properly ingested
   - Consider semantic or hierarchical chunking

2. **Irrelevant Results**:
   - Add metadata filters
   - Use hybrid search
   - Refine chunking strategy
   - Increase numberOfResults

3. **Missing Information**:
   - Verify data source configuration
   - Check ingestion job status
   - Ensure documents are not excluded by filters
   - Increase numberOfResults

4. **Slow Retrieval**:
   - Use metadata filters to narrow scope
   - Optimize vector database configuration
   - Consider S3 Vectors for cost over latency
   - Reduce numberOfResults

**常见问题**:

1. **检索分数低**:
   - 检查分块策略
   - 验证嵌入模型
   - 确保文档已正确导入
   - 考虑使用语义或分层分块

2. **结果不相关**:
   - 添加元数据过滤
   - 使用混合搜索
   - 优化分块策略
   - 增加结果数量

3. **信息缺失**:
   - 验证数据源配置
   - 检查导入任务状态
   - 确保文档未被过滤排除
   - 增加结果数量

4. **检索缓慢**:
   - 使用元数据过滤缩小范围
   - 优化向量数据库配置
   - 考虑用S3 Vectors平衡成本与延迟
   - 减少结果数量

8. Security Best Practices

8. 安全最佳实践

IAM Permissions:
  • Use least privilege for Knowledge Base role
  • Separate roles for data sources, ingestion, retrieval
  • Enable VPC endpoints for private connectivity
Data Encryption:
  • All data encrypted at rest (AWS KMS)
  • Data encrypted in transit (TLS)
  • Use customer-managed KMS keys for compliance
Access Control:
  • Use IAM policies to control who can query
  • Implement fine-grained access control
  • Monitor access with CloudTrail
PII Handling:
  • Use Bedrock Guardrails for PII redaction
  • Implement data masking for sensitive fields
  • Consider custom Lambda for advanced PII handling
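As a complement to Guardrails, simple pattern-based masking can run in a custom Lambda before ingestion. An illustrative sketch; the patterns are examples, not a complete PII detector:

```python
import re

# Illustrative pre-ingestion PII masking (not a replacement for Guardrails)
PATTERNS = {
    'EMAIL': re.compile(r'[\w.+-]+@[\w-]+\.[\w.]+'),
    'SSN': re.compile(r'\b\d{3}-\d{2}-\d{4}\b'),
}


def mask_pii(text):
    # Replace each match with a labeled placeholder
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f'[{label}]', text)
    return text


print(mask_pii('Contact jane@example.com, SSN 123-45-6789.'))
# Contact [EMAIL], SSN [SSN].
```

Masking before embedding ensures sensitive values never enter the vector store, which is stronger than filtering only at query time.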

IAM权限:
  • 为知识库角色使用最小权限原则
  • 为数据源、导入、检索分别设置独立角色
  • 启用VPC端点实现私有连接
数据加密:
  • 所有数据静态加密(AWS KMS)
  • 传输中数据加密(TLS)
  • 为合规性使用客户管理的KMS密钥
访问控制:
  • 使用IAM策略控制查询权限
  • 实现细粒度访问控制
  • 用CloudTrail监控访问
PII处理:
  • 使用Bedrock Guardrails进行PII脱敏
  • 对敏感字段实现数据掩码
  • 考虑用自定义Lambda实现高级PII处理

Complete Production Example

完整生产示例

End-to-End RAG Application

端到端RAG应用

python
import boto3
import json
from typing import List, Dict, Optional

class BedrockKnowledgeBaseRAG:
    """Production RAG application with Amazon Bedrock Knowledge Bases"""

    def __init__(self, region_name: str = 'us-east-1'):
        self.bedrock_agent = boto3.client('bedrock-agent', region_name=region_name)
        self.bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name=region_name)

    def create_knowledge_base(
        self,
        name: str,
        description: str,
        role_arn: str,
        vector_store_config: Dict,
        embedding_model: str = 'amazon.titan-embed-text-v2:0'
    ) -> str:
        """Create knowledge base with vector store"""

        # Build the embedding model ARN from the client's configured region
        region = self.bedrock_agent.meta.region_name
        response = self.bedrock_agent.create_knowledge_base(
            name=name,
            description=description,
            roleArn=role_arn,
            knowledgeBaseConfiguration={
                'type': 'VECTOR',
                'vectorKnowledgeBaseConfiguration': {
                    'embeddingModelArn': f'arn:aws:bedrock:{region}::foundation-model/{embedding_model}'
                }
            },
            storageConfiguration=vector_store_config
        )

        return response['knowledgeBase']['knowledgeBaseId']

    def add_s3_data_source(
        self,
        knowledge_base_id: str,
        name: str,
        bucket_arn: str,
        inclusion_prefixes: List[str],
        chunking_strategy: str = 'FIXED_SIZE',
        chunking_config: Optional[Dict] = None
    ) -> str:
        """Add S3 data source with chunking configuration"""

        if chunking_config is None:
            chunking_config = {
                'maxTokens': 512,
                'overlapPercentage': 20
            }

        vector_ingestion_config = {
            'chunkingConfiguration': {
                'chunkingStrategy': chunking_strategy
            }
        }

        if chunking_strategy == 'FIXED_SIZE':
            vector_ingestion_config['chunkingConfiguration']['fixedSizeChunkingConfiguration'] = chunking_config
        elif chunking_strategy == 'SEMANTIC':
            vector_ingestion_config['chunkingConfiguration']['semanticChunkingConfiguration'] = chunking_config
        elif chunking_strategy == 'HIERARCHICAL':
            vector_ingestion_config['chunkingConfiguration']['hierarchicalChunkingConfiguration'] = chunking_config

        response = self.bedrock_agent.create_data_source(
            knowledgeBaseId=knowledge_base_id,
            name=name,
            description=f'S3 data source: {name}',
            dataSourceConfiguration={
                'type': 'S3',
                's3Configuration': {
                    'bucketArn': bucket_arn,
                    'inclusionPrefixes': inclusion_prefixes
                }
            },
            vectorIngestionConfiguration=vector_ingestion_config
        )

        return response['dataSource']['dataSourceId']

    def ingest_data(self, knowledge_base_id: str, data_source_id: str) -> str:
        """Start ingestion job and wait for completion"""

        import time

        # Start ingestion
        response = self.bedrock_agent.start_ingestion_job(
            knowledgeBaseId=knowledge_base_id,
            dataSourceId=data_source_id,
            description='Automated ingestion'
        )

        job_id = response['ingestionJob']['ingestionJobId']

        # Wait for completion
        while True:
            status_response = self.bedrock_agent.get_ingestion_job(
                knowledgeBaseId=knowledge_base_id,
                dataSourceId=data_source_id,
                ingestionJobId=job_id
            )

            status = status_response['ingestionJob']['status']

            if status == 'COMPLETE':
                print("Ingestion completed successfully")
                if 'statistics' in status_response['ingestionJob']:
                    stats = status_response['ingestionJob']['statistics']
                    print(f"Documents scanned: {stats.get('numberOfDocumentsScanned', 0)}")
                break
            elif status == 'FAILED':
                reasons = status_response['ingestionJob'].get('failureReasons', [])
                print(f"Ingestion failed: {reasons}")
                break

            print(f"Ingestion status: {status}")
            time.sleep(30)

        return job_id

    def query(
        self,
        knowledge_base_id: str,
        query: str,
        model_arn: str = 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0',
        num_results: int = 5,
        search_type: str = 'HYBRID',
        metadata_filter: Optional[Dict] = None,
        session_id: Optional[str] = None
    ) -> Dict:
        """Query knowledge base with retrieve and generate"""

        retrieval_config = {
            'type': 'KNOWLEDGE_BASE',
            'knowledgeBaseConfiguration': {
                'knowledgeBaseId': knowledge_base_id,
                'modelArn': model_arn,
                'retrievalConfiguration': {
                    'vectorSearchConfiguration': {
                        'numberOfResults': num_results,
                        'overrideSearchType': search_type
                    }
                },
                'generationConfiguration': {
                    'inferenceConfig': {
                        'textInferenceConfig': {
                            'temperature': 0.7,
                            'maxTokens': 2048
                        }
                    }
                }
            }
        }

        # Add metadata filter if provided
        if metadata_filter:
            retrieval_config['knowledgeBaseConfiguration']['retrievalConfiguration']['vectorSearchConfiguration']['filter'] = metadata_filter

        # Build request
        request = {
            'input': {'text': query},
            'retrieveAndGenerateConfiguration': retrieval_config
        }

        # Add session if provided
        if session_id:
            request['sessionId'] = session_id

        response = self.bedrock_agent_runtime.retrieve_and_generate(**request)

        return {
            'answer': response['output']['text'],
            'citations': response.get('citations', []),
            'session_id': response['sessionId']
        }

    def multi_turn_conversation(
        self,
        knowledge_base_id: str,
        queries: List[str],
        model_arn: str = 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0'
    ) -> List[Dict]:
        """Execute multi-turn conversation with context"""

        session_id = None
        conversation = []

        for query in queries:
            result = self.query(
                knowledge_base_id=knowledge_base_id,
                query=query,
                model_arn=model_arn,
                session_id=session_id
            )

            session_id = result['session_id']

            conversation.append({
                'query': query,
                'answer': result['answer'],
                'citations': result['citations']
            })

        return conversation
python
import boto3
import json
from typing import List, Dict, Optional

class BedrockKnowledgeBaseRAG:
    """基于Amazon Bedrock知识库的生产级RAG应用"""

    def __init__(self, region_name: str = 'us-east-1'):
        self.bedrock_agent = boto3.client('bedrock-agent', region_name=region_name)
        self.bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name=region_name)

    def create_knowledge_base(
        self,
        name: str,
        description: str,
        role_arn: str,
        vector_store_config: Dict,
        embedding_model: str = 'amazon.titan-embed-text-v2:0'
    ) -> str:
        """创建带向量存储的知识库"""

        # 根据客户端配置的区域构建嵌入模型ARN
        region = self.bedrock_agent.meta.region_name
        response = self.bedrock_agent.create_knowledge_base(
            name=name,
            description=description,
            roleArn=role_arn,
            knowledgeBaseConfiguration={
                'type': 'VECTOR',
                'vectorKnowledgeBaseConfiguration': {
                    'embeddingModelArn': f'arn:aws:bedrock:{region}::foundation-model/{embedding_model}'
                }
            },
            storageConfiguration=vector_store_config
        )

        return response['knowledgeBase']['knowledgeBaseId']

    def add_s3_data_source(
        self,
        knowledge_base_id: str,
        name: str,
        bucket_arn: str,
        inclusion_prefixes: List[str],
        chunking_strategy: str = 'FIXED_SIZE',
        chunking_config: Optional[Dict] = None
    ) -> str:
        """添加带分块配置的S3数据源"""

        if chunking_config is None:
            chunking_config = {
                'maxTokens': 512,
                'overlapPercentage': 20
            }

        vector_ingestion_config = {
            'chunkingConfiguration': {
                'chunkingStrategy': chunking_strategy
            }
        }

        if chunking_strategy == 'FIXED_SIZE':
            vector_ingestion_config['chunkingConfiguration']['fixedSizeChunkingConfiguration'] = chunking_config
        elif chunking_strategy == 'SEMANTIC':
            vector_ingestion_config['chunkingConfiguration']['semanticChunkingConfiguration'] = chunking_config
        elif chunking_strategy == 'HIERARCHICAL':
            vector_ingestion_config['chunkingConfiguration']['hierarchicalChunkingConfiguration'] = chunking_config

        response = self.bedrock_agent.create_data_source(
            knowledgeBaseId=knowledge_base_id,
            name=name,
            description=f'S3 data source: {name}',
            dataSourceConfiguration={
                'type': 'S3',
                's3Configuration': {
                    'bucketArn': bucket_arn,
                    'inclusionPrefixes': inclusion_prefixes
                }
            },
            vectorIngestionConfiguration=vector_ingestion_config
        )

        return response['dataSource']['dataSourceId']

    def ingest_data(self, knowledge_base_id: str, data_source_id: str) -> str:
        """启动导入任务并等待完成"""

        import time

        # 启动导入
        response = self.bedrock_agent.start_ingestion_job(
            knowledgeBaseId=knowledge_base_id,
            dataSourceId=data_source_id,
            description='Automated ingestion'
        )

        job_id = response['ingestionJob']['ingestionJobId']

        # 等待完成
        while True:
            status_response = self.bedrock_agent.get_ingestion_job(
                knowledgeBaseId=knowledge_base_id,
                dataSourceId=data_source_id,
                ingestionJobId=job_id
            )

            status = status_response['ingestionJob']['status']

            if status == 'COMPLETE':
                print("导入任务成功完成")
                if 'statistics' in status_response['ingestionJob']:
                    stats = status_response['ingestionJob']['statistics']
                    print(f"已扫描文档数: {stats.get('numberOfDocumentsScanned', 0)}")
                break
            elif status == 'FAILED':
                reasons = status_response['ingestionJob'].get('failureReasons', [])
                print(f"导入任务失败: {reasons}")
                break

            print(f"导入任务状态: {status},等待中...")
            time.sleep(30)

        return job_id

    def query(
        self,
        knowledge_base_id: str,
        query: str,
        model_arn: str = 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0',
        num_results: int = 5,
        search_type: str = 'HYBRID',
        metadata_filter: Optional[Dict] = None,
        session_id: Optional[str] = None
    ) -> Dict:
        """通过检索与生成API查询知识库"""

        retrieval_config = {
            'type': 'KNOWLEDGE_BASE',
            'knowledgeBaseConfiguration': {
                'knowledgeBaseId': knowledge_base_id,
                'modelArn': model_arn,
                'retrievalConfiguration': {
                    'vectorSearchConfiguration': {
                        'numberOfResults': num_results,
                        'overrideSearchType': search_type
                    }
                },
                'generationConfiguration': {
                    'inferenceConfig': {
                        'textInferenceConfig': {
                            'temperature': 0.7,
                            'maxTokens': 2048
                        }
                    }
                }
            }
        }

        # 添加元数据过滤(如果提供)
        if metadata_filter:
            retrieval_config['knowledgeBaseConfiguration']['retrievalConfiguration']['vectorSearchConfiguration']['filter'] = metadata_filter

        # 构建请求
        request = {
            'input': {'text': query},
            'retrieveAndGenerateConfiguration': retrieval_config
        }

        # 添加会话(如果提供)
        if session_id:
            request['sessionId'] = session_id

        response = self.bedrock_agent_runtime.retrieve_and_generate(**request)

        return {
            'answer': response['output']['text'],
            'citations': response.get('citations', []),
            'session_id': response['sessionId']
        }

    def multi_turn_conversation(
        self,
        knowledge_base_id: str,
        queries: List[str],
        model_arn: str = 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0'
    ) -> List[Dict]:
        """执行带上下文的多轮对话"""

        session_id = None
        conversation = []

        for query in queries:
            result = self.query(
                knowledge_base_id=knowledge_base_id,
                query=query,
                model_arn=model_arn,
                session_id=session_id
            )

            session_id = result['session_id']

            conversation.append({
                'query': query,
                'answer': result['answer'],
                'citations': result['citations']
            })

        return conversation

Example Usage

示例用法

python
rag = BedrockKnowledgeBaseRAG(region_name='us-east-1')
# Create knowledge base
kb_id = rag.create_knowledge_base(
    name='production-docs-kb',
    description='Production documentation knowledge base',
    role_arn='arn:aws:iam::123456789012:role/BedrockKBRole',
    vector_store_config={
        'type': 'OPENSEARCH_SERVERLESS',
        'opensearchServerlessConfiguration': {
            'collectionArn': 'arn:aws:aoss:us-east-1:123456789012:collection/kb-collection',
            'vectorIndexName': 'bedrock-kb-index',
            'fieldMapping': {
                'vectorField': 'bedrock-knowledge-base-default-vector',
                'textField': 'AMAZON_BEDROCK_TEXT_CHUNK',
                'metadataField': 'AMAZON_BEDROCK_METADATA'
            }
        }
    }
)

# Add data source
ds_id = rag.add_s3_data_source(
    knowledge_base_id=kb_id,
    name='technical-docs',
    bucket_arn='arn:aws:s3:::my-docs-bucket',
    inclusion_prefixes=['docs/'],
    chunking_strategy='HIERARCHICAL',
    chunking_config={
        'levelConfigurations': [
            {'maxTokens': 1500},
            {'maxTokens': 300}
        ],
        'overlapTokens': 60
    }
)

# Ingest data
rag.ingest_data(kb_id, ds_id)

# Single query
result = rag.query(
    knowledge_base_id=kb_id,
    query='What are the best practices for RAG applications?',
    metadata_filter={
        'equals': {
            'key': 'document_type',
            'value': 'best_practices'
        }
    }
)

print(f"Answer: {result['answer']}")
print(f"\nSources:")
for citation in result['citations']:
    for ref in citation['retrievedReferences']:
        # For S3 sources the document URI lives under location['s3Location']['uri']
        print(f"  - {ref['location'].get('s3Location', {}).get('uri', ref['location'])}")

# Multi-turn conversation
conversation = rag.multi_turn_conversation(
    knowledge_base_id=kb_id,
    queries=[
        'What is hierarchical chunking?',
        'When should I use it?',
        'What are the configuration parameters?'
    ]
)

for turn in conversation:
    print(f"\nQ: {turn['query']}")
    print(f"A: {turn['answer']}")

---
python
rag = BedrockKnowledgeBaseRAG(region_name='us-east-1')
# 创建知识库
kb_id = rag.create_knowledge_base(
    name='production-docs-kb',
    description='生产文档知识库',
    role_arn='arn:aws:iam::123456789012:role/BedrockKBRole',
    vector_store_config={
        'type': 'OPENSEARCH_SERVERLESS',
        'opensearchServerlessConfiguration': {
            'collectionArn': 'arn:aws:aoss:us-east-1:123456789012:collection/kb-collection',
            'vectorIndexName': 'bedrock-kb-index',
            'fieldMapping': {
                'vectorField': 'bedrock-knowledge-base-default-vector',
                'textField': 'AMAZON_BEDROCK_TEXT_CHUNK',
                'metadataField': 'AMAZON_BEDROCK_METADATA'
            }
        }
    }
)

# 添加数据源
ds_id = rag.add_s3_data_source(
    knowledge_base_id=kb_id,
    name='technical-docs',
    bucket_arn='arn:aws:s3:::my-docs-bucket',
    inclusion_prefixes=['docs/'],
    chunking_strategy='HIERARCHICAL',
    chunking_config={
        'levelConfigurations': [
            {'maxTokens': 1500},
            {'maxTokens': 300}
        ],
        'overlapTokens': 60
    }
)

# 导入数据
rag.ingest_data(kb_id, ds_id)

# 单查询
result = rag.query(
    knowledge_base_id=kb_id,
    query='RAG应用的最佳实践有哪些?',
    metadata_filter={
        'equals': {
            'key': 'document_type',
            'value': 'best_practices'
        }
    }
)

print(f"答案: {result['answer']}")
print(f"\n来源:")
for citation in result['citations']:
    for ref in citation['retrievedReferences']:
        # S3数据源的文档URI位于 location['s3Location']['uri']
        print(f"  - {ref['location'].get('s3Location', {}).get('uri', ref['location'])}")

# 多轮对话
conversation = rag.multi_turn_conversation(
    knowledge_base_id=kb_id,
    queries=[
        '什么是分层分块?',
        '什么时候应该使用它?',
        '配置参数有哪些?'
    ]
)

for turn in conversation:
    print(f"\n问: {turn['query']}")
    print(f"答: {turn['answer']}")

---

Related Skills

相关能力

Amazon Bedrock Core Skills

Amazon Bedrock核心能力

  • bedrock-guardrails: Content safety, PII redaction, hallucination detection
  • bedrock-agents: Agentic workflows with tool use and knowledge bases
  • bedrock-flows: Visual workflow builder for generative AI
  • bedrock-model-customization: Fine-tuning, reinforcement fine-tuning, distillation
  • bedrock-prompt-management: Prompt versioning and deployment
  • bedrock-guardrails:内容安全、PII脱敏、幻觉检测
  • bedrock-agents:集成工具与知识库的Agent工作流
  • bedrock-flows:生成式AI可视化工作流构建器
  • bedrock-model-customization:微调、强化微调、模型蒸馏
  • bedrock-prompt-management:提示词版本管理与部署

AWS Infrastructure Skills

AWS基础设施能力

  • opensearch-serverless: Vector database configuration and management
  • neptune-analytics: GraphRAG configuration and queries
  • s3-management: S3 bucket configuration for data sources and vectors
  • iam-bedrock: IAM roles and policies for Knowledge Bases
  • opensearch-serverless:向量数据库配置与管理
  • neptune-analytics:GraphRAG配置与查询
  • s3-management:用于数据源与向量的S3桶配置
  • iam-bedrock:知识库相关的IAM角色与策略

Observability Skills

可观测性能力

  • cloudwatch-bedrock-monitoring: Monitor Knowledge Bases metrics and logs
  • bedrock-cost-optimization: Track and optimize Knowledge Bases costs

  • cloudwatch-bedrock-monitoring:监控知识库指标与日志
  • bedrock-cost-optimization:跟踪与优化知识库成本

Additional Resources

额外资源

Official Documentation

官方文档

Best Practices

最佳实践

Research Document

研究文档

  • /mnt/c/data/github/skrillz/AMAZON-BEDROCK-COMPREHENSIVE-RESEARCH-2025.md
    - Section 2 (Complete Knowledge Bases research)
  • /mnt/c/data/github/skrillz/AMAZON-BEDROCK-COMPREHENSIVE-RESEARCH-2025.md
    - 第2部分(完整知识库研究)