bedrock-fine-tuning
# Amazon Bedrock Model Customization
Complete guide to customizing Amazon Bedrock foundation models through fine-tuning, continued pre-training, reinforcement fine-tuning, and distillation.
## Overview
Amazon Bedrock Model Customization allows you to adapt foundation models to your specific use cases without managing infrastructure. Four customization approaches are available:
### 1. Fine-Tuning (Supervised Learning)
Adapt models to specific tasks using labeled examples (input-output pairs). Best for:
- Task-specific optimization (classification, extraction, generation)
- Improving responses for domain terminology
- Teaching specific output formats
- Typical gains: 20-40% accuracy improvement
### 2. Continued Pre-Training (Domain Adaptation)
Continue training on unlabeled domain-specific text to build domain knowledge. Best for:
- Medical, legal, financial, technical domains
- Proprietary knowledge bases
- Industry-specific language
- Typical gains: 15-30% domain accuracy improvement
### 3. Reinforcement Fine-Tuning (NEW 2025)
Use reinforcement learning with human feedback (RLHF) or AI feedback (RLAIF) for alignment. Best for:
- Improving response quality and safety
- Aligning to brand voice and values
- Reducing hallucinations
- Typical gains: 40-66% accuracy improvement (the 66% figure comes from AWS's 2025 announcement)
### 4. Distillation (Teacher-Student)
Transfer knowledge from larger models to smaller, faster models. Best for:
- Cost optimization (smaller models are cheaper)
- Latency reduction (faster inference)
- Maintaining quality while reducing size
- Typical gains: 80-90% of teacher model quality at 50-70% cost reduction
## Supported Models
| Model | Fine-Tuning | Continued Pre-Training | Reinforcement | Distillation |
|---|---|---|---|---|
| Claude 3.5 Sonnet | ✅ | ✅ | ✅ (2025) | ✅ (teacher) |
| Claude 3 Haiku | ✅ | ✅ | ✅ (2025) | ✅ (student) |
| Claude 3 Opus | ✅ | ✅ | ✅ (2025) | ✅ (teacher) |
| Titan Text G1 | ✅ | ✅ | ❌ | ✅ |
| Titan Text Lite | ✅ | ✅ | ❌ | ✅ (student) |
| Titan Embeddings | ✅ | ✅ | ❌ | ❌ |
| Cohere Command | ✅ | ✅ | ✅ | ✅ |
| AI21 Jurassic-2 | ✅ | ✅ | ❌ | ✅ |
Note: Availability varies by region; check the AWS Console for the latest model support.
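Because the table above can drift out of date, it also helps to query support programmatically. The sketch below is illustrative: `list_foundation_models` and the `customizationsSupported` field are part of the Bedrock API, but the `customizable_models` helper and the live-call wrapper are assumptions, not an official tool.

```python
def customizable_models(summaries, method):
    """Filter Bedrock model summaries down to those advertising a given
    customization type ('FINE_TUNING', 'CONTINUED_PRE_TRAINING', ...)."""
    return [s["modelId"] for s in summaries
            if method in s.get("customizationsSupported", [])]

def fetch_summaries(region="us-east-1"):
    # Live call: requires AWS credentials and Bedrock access in the region
    import boto3
    bedrock = boto3.client("bedrock", region_name=region)
    return bedrock.list_foundation_models()["modelSummaries"]
```

For example, `customizable_models(fetch_summaries(), 'FINE_TUNING')` would list the fine-tunable model IDs available in your region.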
## Training Data Formats
### Fine-Tuning Format (JSONL)

```jsonl
{"prompt": "Classify the medical condition: Patient presents with fever, cough, and fatigue.", "completion": "Likely viral infection. Recommend rest, hydration, and symptomatic treatment."}
{"prompt": "Classify the medical condition: Patient has chest pain, shortness of breath, and dizziness.", "completion": "Potential cardiac event. Immediate emergency evaluation required."}
{"prompt": "Classify the medical condition: Patient reports persistent headache and light sensitivity.", "completion": "Possible migraine. Consider neurological consultation if symptoms persist."}
```

Requirements:
- Minimum 32 examples (recommended: 1000+)
- Maximum 10,000 examples per job
- Each example: prompt + completion
- JSONL format (one JSON object per line)
- Max 32K tokens per example
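Jobs that violate these limits fail only after upload, so a local pre-flight check is cheap insurance. This is a sketch, not an official tool: the helper name is made up, and the ~4-characters-per-token estimate only approximates the tokenizer Bedrock actually applies.

```python
import json

def validate_finetuning_jsonl(lines, min_examples=32, max_examples=10_000,
                              max_tokens=32_000):
    """Return a list of problems found in fine-tuning JSONL records."""
    errors = []
    count = 0
    for i, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue
        count += 1
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            errors.append(f"line {i}: invalid JSON")
            continue
        missing = {"prompt", "completion"} - record.keys()
        if missing:
            errors.append(f"line {i}: missing {sorted(missing)}")
            continue
        # Rough estimate (~4 chars/token); swap in a real tokenizer if you have one
        approx_tokens = len(record["prompt"] + record["completion"]) // 4
        if approx_tokens > max_tokens:
            errors.append(f"line {i}: ~{approx_tokens} tokens exceeds {max_tokens}")
    if count < min_examples:
        errors.append(f"only {count} examples; minimum is {min_examples}")
    if count > max_examples:
        errors.append(f"{count} examples; maximum is {max_examples}")
    return errors
```

Run it over `open('training_data.jsonl')` before uploading; an empty list means the file passes these basic checks.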
### Continued Pre-Training Format (JSONL)

```jsonl
{"text": "The HIPAA Privacy Rule establishes national standards for protecting individuals' medical records and personal health information. Covered entities must implement safeguards to ensure confidentiality."}
{"text": "Electronic health records (EHR) systems integrate patient data from multiple sources, enabling comprehensive care coordination. Interoperability standards like HL7 FHIR facilitate data exchange."}
{"text": "Clinical decision support systems (CDSS) analyze patient data to provide evidence-based recommendations. Integration with EHR workflows improves diagnostic accuracy and treatment outcomes."}
```

Requirements:
- Minimum 1000 examples (recommended: 10,000+)
- Maximum 100,000 examples per job
- Unlabeled text only
- JSONL format
- Max 32K tokens per document
### Reinforcement Fine-Tuning Format (JSONL)

```jsonl
{"prompt": "Explain type 2 diabetes to a patient.", "chosen": "Type 2 diabetes is a condition where your body doesn't use insulin properly. This causes high blood sugar. Managing it involves healthy eating, exercise, and sometimes medication.", "rejected": "Type 2 diabetes mellitus is characterized by insulin resistance and relative insulin deficiency leading to hyperglycemia."}
{"prompt": "What should I do if I miss a dose?", "chosen": "If you miss a dose, take it as soon as you remember. If it's almost time for your next dose, skip the missed one. Don't double up. Call your doctor if you have questions.", "rejected": "Consult the prescribing information or contact your healthcare provider immediately."}
```

Requirements:
- Minimum 100 preference pairs (recommended: 1000+)
- Each example: prompt + chosen response + rejected response
- JSONL format
- Max 32K tokens per example
- Ranking score optional (0.0-1.0)
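A similar local sanity check applies to preference data. In this sketch the `check_preference_pair` helper and the `score` field name are assumptions (the requirements above only say a 0.0-1.0 ranking score is optional):

```python
import json

def check_preference_pair(line):
    """Return None if a preference record looks well-formed, else an error string."""
    try:
        rec = json.loads(line)
    except json.JSONDecodeError:
        return "invalid JSON"
    missing = {"prompt", "chosen", "rejected"} - rec.keys()
    if missing:
        return f"missing fields: {sorted(missing)}"
    if rec["chosen"] == rec["rejected"]:
        return "chosen and rejected responses are identical"
    score = rec.get("score")  # assumed field name for the optional ranking score
    if score is not None and not (0.0 <= score <= 1.0):
        return "score outside 0.0-1.0"
    return None
```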
### Distillation Format (No Training Data Required)
Distillation uses the teacher model's outputs automatically:
```python
# Configuration only - no training data needed
distillation_config = {
'teacherModelId': 'anthropic.claude-3-5-sonnet-20241022-v2:0',
'studentModelId': 'anthropic.claude-3-haiku-20240307-v1:0',
'distillationDataSource': {
'promptDataset': {
's3Uri': 's3://bucket/prompts.jsonl' # Just prompts, no completions
}
}
}
```

**Prompt Dataset Format**:
```jsonl
{"prompt": "Explain the water cycle."}
{"prompt": "What are the symptoms of the flu?"}
{"prompt": "Describe photosynthesis."}
```

Requirements:
- Minimum 1000 prompts (recommended: 10,000+)
- Teacher model generates completions automatically
- Student model trained to match teacher outputs
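Since only prompts are needed, building the dataset takes a few lines. A minimal sketch of writing the prompt file shown above:

```python
import json

prompts = [
    "Explain the water cycle.",
    "What are the symptoms of the flu?",
    "Describe photosynthesis.",
]

# One JSON object per line, prompt only - no completions
with open("prompts.jsonl", "w") as f:
    for p in prompts:
        f.write(json.dumps({"prompt": p}) + "\n")
```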
## Quick Start
### 1. Prepare Training Data
```python
import json

# Fine-tuning examples
training_data = [
{
"prompt": "Classify sentiment: This product exceeded my expectations!",
"completion": "Positive"
},
{
"prompt": "Classify sentiment: Terrible customer service, very disappointed.",
"completion": "Negative"
},
{
"prompt": "Classify sentiment: The item was okay, nothing special.",
"completion": "Neutral"
}
]

# Save as JSONL
with open('training_data.jsonl', 'w') as f:
for example in training_data:
f.write(json.dumps(example) + '\n')
```

### 2. Upload to S3
```python
import boto3
s3 = boto3.client('s3')
bucket_name = 'my-bedrock-training-bucket'

# Upload training data
s3.upload_file('training_data.jsonl', bucket_name, 'fine-tuning/training_data.jsonl')

# Upload validation data (optional but recommended)
s3.upload_file('validation_data.jsonl', bucket_name, 'fine-tuning/validation_data.jsonl')
```

### 3. Create Customization Job

```python
bedrock = boto3.client('bedrock')
response = bedrock.create_model_customization_job(
jobName='sentiment-classifier-v1',
customModelName='sentiment-classifier',
roleArn='arn:aws:iam::123456789012:role/BedrockCustomizationRole',
baseModelIdentifier='anthropic.claude-3-haiku-20240307-v1:0',
trainingDataConfig={
's3Uri': f's3://{bucket_name}/fine-tuning/training_data.jsonl'
},
validationDataConfig={
's3Uri': f's3://{bucket_name}/fine-tuning/validation_data.jsonl'
},
outputDataConfig={
's3Uri': f's3://{bucket_name}/fine-tuning/output/'
},
hyperParameters={
'epochCount': '3',
'batchSize': '8',
'learningRate': '0.00001'
}
)
job_arn = response['jobArn']
print(f"Customization job created: {job_arn}")
```

### 4. Monitor Training
```python
# Check job status
response = bedrock.get_model_customization_job(jobIdentifier=job_arn)
status = response['status'] # InProgress, Completed, Failed, Stopped
print(f"Job status: {status}")
if status == 'Completed':
custom_model_arn = response['outputModelArn']
print(f"Custom model ARN: {custom_model_arn}")
```

### 5. Deploy and Test

```python
bedrock_runtime = boto3.client('bedrock-runtime')

# Use custom model
response = bedrock_runtime.invoke_model(
modelId=custom_model_arn,
body=json.dumps({
"prompt": "Classify sentiment: I love this product!",
"max_tokens": 50
})
)
result = json.loads(response['body'].read())
print(f"Prediction: {result['completion']}")
```

## Operations
### create-fine-tuning-job
Create a supervised fine-tuning job with labeled examples.
```python
import boto3
import json
def create_fine_tuning_job(
job_name: str,
model_name: str,
base_model_id: str,
training_s3_uri: str,
output_s3_uri: str,
role_arn: str,
validation_s3_uri: str = None,
hyper_params: dict = None
) -> str:
"""
Create fine-tuning job for task-specific adaptation.
Args:
job_name: Unique job identifier
model_name: Name for custom model
base_model_id: Base model ARN (e.g., Claude 3 Haiku)
training_s3_uri: S3 path to training JSONL
output_s3_uri: S3 path for outputs
role_arn: IAM role with Bedrock + S3 permissions
validation_s3_uri: Optional validation dataset
hyper_params: Training hyperparameters
Returns:
Job ARN for monitoring
"""
bedrock = boto3.client('bedrock')
# Default hyperparameters
if hyper_params is None:
hyper_params = {
'epochCount': '3', # Number of training epochs
'batchSize': '8', # Batch size (4, 8, 16, 32)
'learningRate': '0.00001', # Learning rate (0.00001 - 0.0001)
'learningRateWarmupSteps': '0'
}
# Build configuration
config = {
'jobName': job_name,
'customModelName': model_name,
'roleArn': role_arn,
'baseModelIdentifier': base_model_id,
'trainingDataConfig': {
's3Uri': training_s3_uri
},
'outputDataConfig': {
's3Uri': output_s3_uri
},
'hyperParameters': hyper_params,
'customizationType': 'FINE_TUNING'
}
# Add validation data if provided
if validation_s3_uri:
config['validationDataConfig'] = {
's3Uri': validation_s3_uri
}
# Create job
response = bedrock.create_model_customization_job(**config)
print(f"Fine-tuning job created: {response['jobArn']}")
    return response['jobArn']

# Example: Fine-tune Claude 3 Haiku for medical classification
job_arn = create_fine_tuning_job(
job_name='medical-classifier-v1',
model_name='medical-classifier',
base_model_id='anthropic.claude-3-haiku-20240307-v1:0',
training_s3_uri='s3://my-bucket/medical/training.jsonl',
output_s3_uri='s3://my-bucket/medical/output/',
role_arn='arn:aws:iam::123456789012:role/BedrockCustomizationRole',
validation_s3_uri='s3://my-bucket/medical/validation.jsonl',
hyper_params={
'epochCount': '5',
'batchSize': '16',
'learningRate': '0.00002'
}
)
```

### create-continued-pretraining-job
Create continued pre-training job for domain adaptation.
```python
def create_continued_pretraining_job(
job_name: str,
model_name: str,
base_model_id: str,
training_s3_uri: str,
output_s3_uri: str,
role_arn: str,
validation_s3_uri: str = None
) -> str:
"""
Create continued pre-training job for domain knowledge.
Args:
job_name: Unique job identifier
model_name: Name for custom model
base_model_id: Base model ARN
training_s3_uri: S3 path to unlabeled text JSONL
output_s3_uri: S3 path for outputs
role_arn: IAM role ARN
validation_s3_uri: Optional validation dataset
Returns:
Job ARN for monitoring
"""
bedrock = boto3.client('bedrock')
config = {
'jobName': job_name,
'customModelName': model_name,
'roleArn': role_arn,
'baseModelIdentifier': base_model_id,
'trainingDataConfig': {
's3Uri': training_s3_uri
},
'outputDataConfig': {
's3Uri': output_s3_uri
},
'hyperParameters': {
'epochCount': '1', # Usually 1 epoch for continued pre-training
'batchSize': '16',
'learningRate': '0.000005' # Lower LR for stability
},
'customizationType': 'CONTINUED_PRE_TRAINING'
}
if validation_s3_uri:
config['validationDataConfig'] = {
's3Uri': validation_s3_uri
}
response = bedrock.create_model_customization_job(**config)
print(f"Continued pre-training job created: {response['jobArn']}")
    return response['jobArn']

# Example: Adapt Claude for medical domain
job_arn = create_continued_pretraining_job(
job_name='medical-domain-adapter-v1',
model_name='claude-medical',
base_model_id='anthropic.claude-3-5-sonnet-20241022-v2:0',
training_s3_uri='s3://my-bucket/medical-corpus/documents.jsonl',
output_s3_uri='s3://my-bucket/medical-corpus/output/',
role_arn='arn:aws:iam::123456789012:role/BedrockCustomizationRole'
)
```

### create-reinforcement-finetuning-job
Create reinforcement fine-tuning job with preference data (NEW 2025).
```python
def create_reinforcement_finetuning_job(
job_name: str,
model_name: str,
base_model_id: str,
preference_s3_uri: str,
output_s3_uri: str,
role_arn: str,
algorithm: str = 'DPO' # DPO, PPO, or RLAIF
) -> str:
"""
Create reinforcement fine-tuning job for alignment (NEW 2025).
Args:
job_name: Unique job identifier
model_name: Name for custom model
base_model_id: Base model ARN
preference_s3_uri: S3 path to preference pairs JSONL
output_s3_uri: S3 path for outputs
role_arn: IAM role ARN
algorithm: RL algorithm (DPO, PPO, RLAIF)
Returns:
Job ARN for monitoring
"""
bedrock = boto3.client('bedrock')
config = {
'jobName': job_name,
'customModelName': model_name,
'roleArn': role_arn,
'baseModelIdentifier': base_model_id,
'trainingDataConfig': {
's3Uri': preference_s3_uri
},
'outputDataConfig': {
's3Uri': output_s3_uri
},
'hyperParameters': {
'epochCount': '3',
'batchSize': '8',
'learningRate': '0.00001',
'rlAlgorithm': algorithm,
'beta': '0.1' # KL divergence coefficient
},
'customizationType': 'REINFORCEMENT_FINE_TUNING'
}
response = bedrock.create_model_customization_job(**config)
print(f"Reinforcement fine-tuning job created: {response['jobArn']}")
print(f"Expected accuracy gains: 40-66% improvement")
    return response['jobArn']

# Example: Improve response quality with preference learning
job_arn = create_reinforcement_finetuning_job(
job_name='claude-aligned-v1',
model_name='claude-aligned',
base_model_id='anthropic.claude-3-5-sonnet-20241022-v2:0',
preference_s3_uri='s3://my-bucket/preferences/pairs.jsonl',
output_s3_uri='s3://my-bucket/preferences/output/',
role_arn='arn:aws:iam::123456789012:role/BedrockCustomizationRole',
algorithm='DPO' # Direct Preference Optimization
)
```

### create-distillation-job
Create distillation job to transfer knowledge from large to small model.
```python
def create_distillation_job(
job_name: str,
model_name: str,
teacher_model_id: str,
student_model_id: str,
prompts_s3_uri: str,
output_s3_uri: str,
role_arn: str
) -> str:
"""
Create distillation job to compress large model knowledge.
Args:
job_name: Unique job identifier
model_name: Name for distilled model
teacher_model_id: Large model to learn from
student_model_id: Small model to train
prompts_s3_uri: S3 path to prompts JSONL
output_s3_uri: S3 path for outputs
role_arn: IAM role ARN
Returns:
Job ARN for monitoring
"""
bedrock = boto3.client('bedrock')
config = {
'jobName': job_name,
'customModelName': model_name,
'roleArn': role_arn,
'baseModelIdentifier': student_model_id,
'trainingDataConfig': {
's3Uri': prompts_s3_uri,
'teacherModelIdentifier': teacher_model_id
},
'outputDataConfig': {
's3Uri': output_s3_uri
},
'hyperParameters': {
'epochCount': '3',
'batchSize': '16',
'learningRate': '0.00002',
'temperature': '1.0', # Softmax temperature for distillation
'alpha': '0.5' # Balance between hard and soft targets
},
'customizationType': 'DISTILLATION'
}
response = bedrock.create_model_customization_job(**config)
print(f"Distillation job created: {response['jobArn']}")
print(f"Teacher: {teacher_model_id}")
print(f"Student: {student_model_id}")
print(f"Expected: 80-90% teacher quality at 50-70% cost")
    return response['jobArn']

# Example: Distill Claude 3.5 Sonnet to Haiku
job_arn = create_distillation_job(
job_name='claude-haiku-distilled-v1',
model_name='claude-haiku-distilled',
teacher_model_id='anthropic.claude-3-5-sonnet-20241022-v2:0',
student_model_id='anthropic.claude-3-haiku-20240307-v1:0',
prompts_s3_uri='s3://my-bucket/distillation/prompts.jsonl',
output_s3_uri='s3://my-bucket/distillation/output/',
role_arn='arn:aws:iam::123456789012:role/BedrockCustomizationRole'
)
```

### monitor-job
Track training progress and retrieve metrics.
```python
import time
from typing import Dict, Any
def monitor_job(job_arn: str, poll_interval: int = 60) -> Dict[str, Any]:
"""
Monitor customization job until completion.
Args:
job_arn: Job ARN to monitor
poll_interval: Seconds between status checks
Returns:
Final job details with metrics
"""
bedrock = boto3.client('bedrock')
print(f"Monitoring job: {job_arn}")
while True:
response = bedrock.get_model_customization_job(
jobIdentifier=job_arn
)
status = response['status']
print(f"Status: {status}", end='')
# Show metrics if available
if 'trainingMetrics' in response:
metrics = response['trainingMetrics']
if 'trainingLoss' in metrics:
print(f" | Loss: {metrics['trainingLoss']:.4f}", end='')
print() # Newline
# Check terminal states
if status == 'Completed':
print(f"Job completed successfully!")
print(f"Custom model ARN: {response['outputModelArn']}")
return response
elif status == 'Failed':
print(f"Job failed: {response.get('failureMessage', 'Unknown error')}")
return response
elif status == 'Stopped':
print(f"Job was stopped")
return response
# Wait before next check
        time.sleep(poll_interval)

# Example: Monitor with automatic polling
job_details = monitor_job(job_arn, poll_interval=60)
if job_details['status'] == 'Completed':
custom_model_arn = job_details['outputModelArn']
# Download metrics from S3
output_uri = job_details['outputDataConfig']['s3Uri']
    print(f"Metrics available at: {output_uri}")
```

### deploy-custom-model
Provision custom model for inference.
python
def deploy_custom_model(
    model_arn: str,
    provisioned_model_name: str,
    model_units: int = 1
) -> str:
    """
    Deploy custom model with provisioned throughput.

    Args:
        model_arn: Custom model ARN from training job
        provisioned_model_name: Name for provisioned model
        model_units: Throughput units (1-10)

    Returns:
        Provisioned model ARN for inference
    """
    bedrock = boto3.client('bedrock')

    response = bedrock.create_provisioned_model_throughput(
        provisionedModelName=provisioned_model_name,
        modelId=model_arn,
        modelUnits=model_units
    )

    provisioned_arn = response['provisionedModelArn']
    print(f"Provisioned model created: {provisioned_arn}")
    print(f"Throughput: {model_units} units")
    print("Allow 5-10 minutes for provisioning")

    return provisioned_arn
Example: Deploy with standard throughput
provisioned_arn = deploy_custom_model(
    model_arn='arn:aws:bedrock:us-east-1:123456789012:custom-model/medical-classifier-v1',
    provisioned_model_name='medical-classifier-prod',
    model_units=2
)
Wait for provisioning
time.sleep(300) # 5 minutes
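A fixed sleep works, but polling until the endpoint leaves the creating state is more robust. A sketch, assuming `get_provisioned_model_throughput` returns a `status` key that stays `'Creating'` until the endpoint is usable (check the boto3 docs for the exact status values in your SDK version); `wait_for_provisioning` is a hypothetical helper name:

```python
import time

def wait_for_provisioning(provisioned_arn: str, poll_interval: int = 30,
                          timeout: int = 1800) -> str:
    """Poll provisioned-throughput status instead of sleeping blindly.

    Assumes the response's 'status' key stays 'Creating' until the
    endpoint is usable (expected to become 'InService' on success).
    """
    import boto3  # imported here so the helper stands alone
    bedrock = boto3.client('bedrock')
    deadline = time.time() + timeout
    while time.time() < deadline:
        status = bedrock.get_provisioned_model_throughput(
            provisionedModelId=provisioned_arn
        )['status']
        if status != 'Creating':
            return status
        time.sleep(poll_interval)
    raise TimeoutError(f"Provisioning did not finish within {timeout}s")
```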
Use provisioned model
bedrock_runtime = boto3.client('bedrock-runtime')

response = bedrock_runtime.invoke_model(
    modelId=provisioned_arn,
    body=json.dumps({
        "prompt": "Classify: Patient has fever and cough.",
        "max_tokens": 100
    })
)

result = json.loads(response['body'].read())
print(f"Prediction: {result['completion']}")
evaluate-model
Test custom model performance with evaluation dataset.
python
import pandas as pd
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def evaluate_model(
    model_id: str,
    test_data_path: str,
    output_path: str = None
) -> Dict[str, float]:
    """
    Evaluate custom model on test dataset.

    Args:
        model_id: Custom model ARN
        test_data_path: Path to test JSONL file
        output_path: Optional path to save predictions

    Returns:
        Evaluation metrics dictionary
    """
    bedrock_runtime = boto3.client('bedrock-runtime')

    # Load test data
    test_data = []
    with open(test_data_path, 'r') as f:
        for line in f:
            test_data.append(json.loads(line))

    # Run predictions
    predictions = []
    ground_truth = []

    print(f"Evaluating {len(test_data)} examples...")

    for i, example in enumerate(test_data):
        if i % 10 == 0:
            print(f"Progress: {i}/{len(test_data)}")

        # Invoke model
        response = bedrock_runtime.invoke_model(
            modelId=model_id,
            body=json.dumps({
                "prompt": example['prompt'],
                "max_tokens": 200
            })
        )

        result = json.loads(response['body'].read())
        prediction = result['completion'].strip()

        predictions.append(prediction)
        ground_truth.append(example['completion'].strip())

    # Calculate metrics
    accuracy = accuracy_score(ground_truth, predictions)
    precision, recall, f1, _ = precision_recall_fscore_support(
        ground_truth, predictions, average='weighted', zero_division=0
    )

    metrics = {
        'accuracy': accuracy,
        'precision': precision,
        'recall': recall,
        'f1_score': f1,
        'total_examples': len(test_data)
    }

    print("\n=== Evaluation Results ===")
    print(f"Accuracy: {accuracy:.4f}")
    print(f"Precision: {precision:.4f}")
    print(f"Recall: {recall:.4f}")
    print(f"F1 Score: {f1:.4f}")

    # Save predictions if requested
    if output_path:
        results_df = pd.DataFrame({
            'prompt': [ex['prompt'] for ex in test_data],
            'ground_truth': ground_truth,
            'prediction': predictions
        })
        results_df.to_csv(output_path, index=False)
        print(f"Predictions saved to: {output_path}")

    return metrics
Example: Evaluate medical classifier
metrics = evaluate_model(
    model_id='arn:aws:bedrock:us-east-1:123456789012:provisioned-model/medical-classifier-prod',
    test_data_path='test_data.jsonl',
    output_path='evaluation_results.csv'
)
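Weighted averages can mask one weak class. Since `evaluate_model` already builds parallel `ground_truth` and `predictions` lists, a per-label breakdown takes a few lines of pure Python (a sketch; `per_class_accuracy` is a hypothetical helper, not part of any SDK):

```python
from collections import Counter

def per_class_accuracy(ground_truth: list, predictions: list) -> dict:
    """Accuracy broken down by ground-truth label."""
    correct = Counter()
    total = Counter()
    for truth, pred in zip(ground_truth, predictions):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    # One accuracy value per label seen in the ground truth
    return {label: correct[label] / total[label] for label in total}
```

For example, `per_class_accuracy(['a', 'a', 'b'], ['a', 'b', 'b'])` returns `{'a': 0.5, 'b': 1.0}`, flagging that label `a` is the weak spot.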
Hyperparameter Tuning
Fine-Tuning Parameters
| Parameter | Range | Default | Description |
|---|---|---|---|
| epochCount | 1-10 | 3 | Training passes over dataset |
| batchSize | 4-32 | 8 | Examples per training step |
| learningRate | 0.00001-0.0001 | 0.00001 | Step size for weight updates |
| learningRateWarmupSteps | 0-100 | 0 | Gradual LR increase steps |
Tuning Guidelines:
- Small dataset (<100 examples): Lower epochs (1-2), smaller batch (4-8)
- Medium dataset (100-1000): Standard settings (3 epochs, batch 8-16)
- Large dataset (>1000): Higher epochs (5-10), larger batch (16-32)
- Overfitting signs: Reduce epochs or increase batch size
- Underfitting signs: Increase epochs or increase the learning rate
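The guidelines above can be folded into a small starting-point helper. This is only a sketch: `suggest_hyperparameters` is a hypothetical name, the thresholds simply mirror the bullets, and the output is a first guess rather than tuned values.

```python
def suggest_hyperparameters(num_examples: int) -> dict:
    """Starting hyperparameters by dataset size (values as strings,
    matching the string-typed hyperparameter convention in this guide)."""
    if num_examples < 100:
        # Small dataset: fewer passes, small batches to limit overfitting
        return {'epochCount': '2', 'batchSize': '4', 'learningRate': '0.00002'}
    if num_examples <= 1000:
        # Medium dataset: the documented defaults
        return {'epochCount': '3', 'batchSize': '8', 'learningRate': '0.00001'}
    # Large dataset: more epochs, larger batches, gentler learning rate
    return {'epochCount': '5', 'batchSize': '16', 'learningRate': '0.000005'}
```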
Example Configurations
python
# Configuration 1: Small dataset, quick iteration
small_dataset_params = {
    'epochCount': '2',
    'batchSize': '4',
    'learningRate': '0.00002',
    'learningRateWarmupSteps': '10'
}
# Configuration 2: Balanced, general purpose
balanced_params = {
    'epochCount': '3',
    'batchSize': '8',
    'learningRate': '0.00001',
    'learningRateWarmupSteps': '0'
}
# Configuration 3: Large dataset, high quality
large_dataset_params = {
    'epochCount': '5',
    'batchSize': '16',
    'learningRate': '0.000005',
    'learningRateWarmupSteps': '20'
}
# Configuration 4: Continued pre-training
pretraining_params = {
    'epochCount': '1',
    'batchSize': '16',
    'learningRate': '0.000005',
    'learningRateWarmupSteps': '0'
}
Data Preparation Best Practices
1. Data Quality
python
def validate_training_data(data_path: str) -> bool:
    """
    Validate training data quality.

    Checks:
    - JSONL format validity
    - Required fields present
    - Token length within limits
    - Data distribution balance
    """
    import json
    from collections import Counter

    issues = []
    completion_distribution = Counter()

    with open(data_path, 'r') as f:
        for i, line in enumerate(f, 1):
            try:
                example = json.loads(line)
            except json.JSONDecodeError:
                issues.append(f"Line {i}: Invalid JSON")
                continue

            # Check required fields
            if 'prompt' not in example:
                issues.append(f"Line {i}: Missing 'prompt' field")
            if 'completion' not in example:
                issues.append(f"Line {i}: Missing 'completion' field")

            # Track completion distribution
            if 'completion' in example:
                completion_distribution[example['completion']] += 1

            # Check token length (approximate)
            prompt_tokens = len(example.get('prompt', '').split())
            completion_tokens = len(example.get('completion', '').split())
            total_tokens = prompt_tokens + completion_tokens

            if total_tokens > 8000:  # Conservative estimate
                issues.append(f"Line {i}: Likely exceeds 32K token limit")

    # Report issues
    if issues:
        print("Data Validation Issues:")
        for issue in issues[:10]:  # Show first 10
            print(f" - {issue}")
        if len(issues) > 10:
            print(f" ... and {len(issues) - 10} more issues")
        return False

    # Check distribution balance
    print("\nCompletion Distribution:")
    for completion, count in completion_distribution.most_common():
        print(f" {completion}: {count}")

    # Warn about imbalance (guard against an empty file)
    counts = list(completion_distribution.values())
    if counts and max(counts) > 3 * min(counts):
        print("\nWarning: Imbalanced dataset detected")
        print("Consider balancing or stratified sampling")

    print("\nValidation passed!")
    return True
Example usage
validate_training_data('training_data.jsonl')
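The validator's whitespace split undercounts tokens for long or code-like text. A common rule of thumb is roughly 4 characters per token for English; taking the larger of the two estimates gives a safer bound. A sketch (the 4:1 ratio is a heuristic, not a documented Bedrock constant):

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: the larger of the word count and len(text)/4."""
    word_estimate = len(text.split())
    char_estimate = len(text) // 4
    return max(word_estimate, char_estimate)
```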
2. Data Augmentation
python
def augment_training_data(
    input_path: str,
    output_path: str,
    augmentation_factor: int = 2
):
    """
    Augment training data with paraphrasing and variations.

    Args:
        input_path: Original training data
        output_path: Augmented output file
        augmentation_factor: Multiplier for dataset size
    """
    import json
    import random

    # Load original data
    original_data = []
    with open(input_path, 'r') as f:
        for line in f:
            original_data.append(json.loads(line))

    # Augmentation strategies
    prompt_prefixes = [
        "",
        "Please ",
        "Could you ",
        "I need you to "
    ]

    augmented_data = []

    for example in original_data:
        # Include original
        augmented_data.append(example)

        # Create variations
        for _ in range(augmentation_factor - 1):
            prefix = random.choice(prompt_prefixes)
            augmented_example = {
                'prompt': prefix + example['prompt'],
                'completion': example['completion']
            }
            augmented_data.append(augmented_example)

    # Save augmented data
    with open(output_path, 'w') as f:
        for example in augmented_data:
            f.write(json.dumps(example) + '\n')

    print(f"Augmented {len(original_data)} → {len(augmented_data)} examples")
Example usage
augment_training_data('training_data.jsonl', 'training_data_augmented.jsonl')
3. Train/Validation Split
python
def split_dataset(
    input_path: str,
    train_path: str,
    val_path: str,
    val_split: float = 0.2
):
    """
    Split dataset into training and validation sets.

    Args:
        input_path: Full dataset JSONL
        train_path: Output training JSONL
        val_path: Output validation JSONL
        val_split: Fraction for validation (0.1-0.3)
    """
    import json
    import random

    # Load data
    data = []
    with open(input_path, 'r') as f:
        for line in f:
            data.append(json.loads(line))

    # Shuffle
    random.shuffle(data)

    # Split
    val_size = int(len(data) * val_split)
    train_data = data[val_size:]
    val_data = data[:val_size]

    # Save
    with open(train_path, 'w') as f:
        for example in train_data:
            f.write(json.dumps(example) + '\n')
    with open(val_path, 'w') as f:
        for example in val_data:
            f.write(json.dumps(example) + '\n')

    print(f"Split: {len(train_data)} training, {len(val_data)} validation")
Example usage
split_dataset('full_dataset.jsonl', 'training.jsonl', 'validation.jsonl', val_split=0.2)
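If validation warned about imbalance, a stratified variant keeps each label's proportion identical in both halves. A sketch assuming, as in the classification examples above, that the `completion` field acts as the class label (`stratified_split` is a hypothetical helper):

```python
import random
from collections import defaultdict

def stratified_split(examples: list, val_split: float = 0.2, seed: int = 42):
    """Split {'prompt', 'completion'} dicts while preserving label balance."""
    rng = random.Random(seed)

    # Group examples by their completion label
    by_label = defaultdict(list)
    for ex in examples:
        by_label[ex['completion']].append(ex)

    # Take val_split of each label group, so ratios match across splits
    train, val = [], []
    for label_examples in by_label.values():
        rng.shuffle(label_examples)
        val_size = int(len(label_examples) * val_split)
        val.extend(label_examples[:val_size])
        train.extend(label_examples[val_size:])
    return train, val
```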
Cost Considerations
Training Costs
Cost Structure:
- Fine-tuning: $0.01-0.05 per 1000 tokens processed
- Continued pre-training: $0.02-0.08 per 1000 tokens processed
- Reinforcement fine-tuning: $0.03-0.10 per 1000 tokens processed
- Distillation: $0.02-0.06 per 1000 tokens processed
Example Calculations:
python
def estimate_training_cost(
    num_examples: int,
    avg_tokens_per_example: int,
    num_epochs: int,
    cost_per_1k_tokens: float = 0.03
) -> float:
    """
    Estimate training cost.

    Args:
        num_examples: Number of training examples
        avg_tokens_per_example: Average tokens (prompt + completion)
        num_epochs: Training epochs
        cost_per_1k_tokens: Cost rate

    Returns:
        Estimated cost in USD
    """
    total_tokens = num_examples * avg_tokens_per_example * num_epochs
    cost = (total_tokens / 1000) * cost_per_1k_tokens

    print(f"Training Examples: {num_examples:,}")
    print(f"Avg Tokens/Example: {avg_tokens_per_example}")
    print(f"Epochs: {num_epochs}")
    print(f"Total Tokens: {total_tokens:,}")
    print(f"Estimated Cost: ${cost:.2f}")

    return cost
Example: Fine-tune with 1000 examples
estimate_training_cost(
    num_examples=1000,
    avg_tokens_per_example=500,
    num_epochs=3,
    cost_per_1k_tokens=0.03
)
Output: ~$45
Inference Costs
Provisioned Throughput Pricing:
- Model Units: $X per hour per unit
- Cost varies by base model
- Minimum commitment: 1 month or 6 months
Cost Optimization:
python
def compare_model_costs(
    requests_per_day: int,
    avg_tokens_per_request: int
):
    """
    Compare on-demand vs provisioned vs distilled model costs.
    """
    # Base Claude 3.5 Sonnet on-demand: $3/$15 per 1M tokens
    base_cost_input = (requests_per_day * avg_tokens_per_request * 30) / 1_000_000 * 3
    base_cost_output = (requests_per_day * avg_tokens_per_request * 0.5 * 30) / 1_000_000 * 15
    base_monthly = base_cost_input + base_cost_output

    # Provisioned throughput: ~$2500/month per unit
    provisioned_monthly = 2500

    # Distilled to Haiku: 50% cost reduction
    distilled_monthly = base_monthly * 0.5

    print(f"Monthly Cost Comparison ({requests_per_day:,} requests/day):")
    print(f" Base Model On-Demand: ${base_monthly:.2f}")
    print(f" Provisioned (1 unit): ${provisioned_monthly:.2f}")
    print(f" Distilled Model: ${distilled_monthly:.2f}")

    # Breakeven analysis
    if base_monthly > provisioned_monthly:
        print(f"\nProvisioned throughput recommended (saves ${base_monthly - provisioned_monthly:.2f}/mo)")
    else:
        print(f"\nOn-demand recommended (saves ${provisioned_monthly - base_monthly:.2f}/mo)")
Example comparison
compare_model_costs(requests_per_day=10000, avg_tokens_per_request=1000)
Related Skills
- bedrock-inference: Invoke foundation models and custom models
- bedrock-knowledge-bases: RAG with custom models
- bedrock-guardrails: Apply safety policies to custom models
- bedrock-agentcore: Build agents with custom models
- claude-cost-optimization: Optimize model selection and costs
- claude-context-management: Manage context for custom models
- boto3-ecs: Deploy custom model inference on ECS
- boto3-eks: Deploy custom model inference on EKS