bedrock-fine-tuning

Amazon Bedrock Model Customization


Complete guide to customizing Amazon Bedrock foundation models through fine-tuning, continued pre-training, reinforcement fine-tuning, and distillation.

Overview


Amazon Bedrock Model Customization allows you to adapt foundation models to your specific use cases without managing infrastructure. Four customization approaches are available:

1. Fine-Tuning (Supervised Learning)


Adapt models to specific tasks using labeled examples (input-output pairs). Best for:
  • Task-specific optimization (classification, extraction, generation)
  • Improving responses for domain terminology
  • Teaching specific output formats
  • Typical gains: 20-40% accuracy improvement

2. Continued Pre-Training (Domain Adaptation)


Continue training on unlabeled domain-specific text to build domain knowledge. Best for:
  • Medical, legal, financial, technical domains
  • Proprietary knowledge bases
  • Industry-specific language
  • Typical gains: 15-30% domain accuracy improvement

3. Reinforcement Fine-Tuning (NEW 2025)


Use reinforcement learning with human feedback (RLHF) or AI feedback (RLAIF) for alignment. Best for:
  • Improving response quality and safety
  • Aligning to brand voice and values
  • Reducing hallucinations
  • Typical gains: 40-66% accuracy improvement (AWS announced 66% gains in 2025)

4. Distillation (Teacher-Student)


Transfer knowledge from larger models to smaller, faster models. Best for:
  • Cost optimization (smaller models are cheaper)
  • Latency reduction (faster inference)
  • Maintaining quality while reducing size
  • Typical gains: 80-90% of teacher model quality at 50-70% cost reduction

Supported Models

支持的模型

| Model | Fine-Tuning | Continued Pre-Training | Reinforcement | Distillation |
| --- | --- | --- | --- | --- |
| Claude 3.5 Sonnet | | | ✅ (2025) | ✅ (teacher) |
| Claude 3 Haiku | | | ✅ (2025) | ✅ (student) |
| Claude 3 Opus | | | ✅ (2025) | ✅ (teacher) |
| Titan Text G1 | | | | |
| Titan Text Lite | | | | ✅ (student) |
| Titan Embeddings | | | | |
| Cohere Command | | | | |
| AI21 Jurassic-2 | | | | |

Note: Availability varies by region. Check AWS Console for latest model support.

Training Data Formats


Fine-Tuning Format (JSONL)


```jsonl
{"prompt": "Classify the medical condition: Patient presents with fever, cough, and fatigue.", "completion": "Likely viral infection. Recommend rest, hydration, and symptomatic treatment."}
{"prompt": "Classify the medical condition: Patient has chest pain, shortness of breath, and dizziness.", "completion": "Potential cardiac event. Immediate emergency evaluation required."}
{"prompt": "Classify the medical condition: Patient reports persistent headache and light sensitivity.", "completion": "Possible migraine. Consider neurological consultation if symptoms persist."}
```
Requirements:
  • Minimum 32 examples (recommended: 1000+)
  • Maximum 10,000 examples per job
  • Each example: prompt + completion
  • JSONL format (one JSON object per line)
  • Max 32K tokens per example
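These limits can be checked locally before uploading, which avoids a failed job later. A minimal validation sketch (the function name is illustrative, and the ~4-characters-per-token estimate is a rough heuristic, not Bedrock's tokenizer):

```python
import json

def validate_fine_tuning_jsonl(lines, min_examples=32, max_examples=10_000,
                               max_tokens=32_000):
    """Check fine-tuning JSONL records against the limits listed above."""
    errors = []
    for i, line in enumerate(lines, start=1):
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            errors.append(f"line {i}: not valid JSON")
            continue
        if set(record) != {"prompt", "completion"}:
            errors.append(f"line {i}: expected exactly 'prompt' and 'completion' keys")
        # Rough token estimate: ~4 characters per token (heuristic only)
        elif (len(record["prompt"]) + len(record["completion"])) / 4 > max_tokens:
            errors.append(f"line {i}: likely exceeds {max_tokens} tokens")
    if len(lines) < min_examples:
        errors.append(f"only {len(lines)} examples; minimum is {min_examples}")
    if len(lines) > max_examples:
        errors.append(f"{len(lines)} examples; maximum is {max_examples}")
    return errors
```

An empty return value means the dataset passed all checks.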

Continued Pre-Training Format (JSONL)


```jsonl
{"text": "The HIPAA Privacy Rule establishes national standards for protecting individuals' medical records and personal health information. Covered entities must implement safeguards to ensure confidentiality."}
{"text": "Electronic health records (EHR) systems integrate patient data from multiple sources, enabling comprehensive care coordination. Interoperability standards like HL7 FHIR facilitate data exchange."}
{"text": "Clinical decision support systems (CDSS) analyze patient data to provide evidence-based recommendations. Integration with EHR workflows improves diagnostic accuracy and treatment outcomes."}
```
Requirements:
  • Minimum 1000 examples (recommended: 10,000+)
  • Maximum 100,000 examples per job
  • Unlabeled text only
  • JSONL format
  • Max 32K tokens per document
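Raw domain documents usually need to be converted into this shape first. A minimal sketch that splits plain-text documents on blank lines into `{"text": ...}` records (paragraph splitting and the function name are illustrative choices; any chunking that respects the token limit works):

```python
import json

def to_pretraining_jsonl(documents, max_chars=128_000):
    """Convert raw document strings into continued pre-training JSONL lines.

    Splits on blank lines and drops empty chunks; max_chars roughly tracks
    the 32K-token limit at ~4 characters per token (heuristic only).
    """
    lines = []
    for doc in documents:
        for chunk in doc.split("\n\n"):
            chunk = chunk.strip()
            if chunk and len(chunk) <= max_chars:
                lines.append(json.dumps({"text": chunk}))
    return lines
```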

Reinforcement Fine-Tuning Format (JSONL)


```jsonl
{"prompt": "Explain type 2 diabetes to a patient.", "chosen": "Type 2 diabetes is a condition where your body doesn't use insulin properly. This causes high blood sugar. Managing it involves healthy eating, exercise, and sometimes medication.", "rejected": "Type 2 diabetes mellitus is characterized by insulin resistance and relative insulin deficiency leading to hyperglycemia."}
{"prompt": "What should I do if I miss a dose?", "chosen": "If you miss a dose, take it as soon as you remember. If it's almost time for your next dose, skip the missed one. Don't double up. Call your doctor if you have questions.", "rejected": "Consult the prescribing information or contact your healthcare provider immediately."}
```
Requirements:
  • Minimum 100 preference pairs (recommended: 1000+)
  • Each example: prompt + chosen response + rejected response
  • JSONL format
  • Max 32K tokens per example
  • Ranking score optional (0.0-1.0)
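Preference pairs are often assembled from rated candidate responses. A minimal sketch that emits one pair per prompt from scored candidates (the input format and function name are assumptions for illustration, not a Bedrock schema):

```python
import json

def build_preference_pairs(rated):
    """rated: list of (prompt, [(response, score 0.0-1.0), ...]) tuples.

    Emits one JSONL line per prompt: the highest-scored response becomes
    'chosen' and the lowest-scored becomes 'rejected'.
    """
    lines = []
    for prompt, candidates in rated:
        if len(candidates) < 2:
            continue  # need at least two responses to form a pair
        ranked = sorted(candidates, key=lambda rs: rs[1], reverse=True)
        lines.append(json.dumps({
            "prompt": prompt,
            "chosen": ranked[0][0],
            "rejected": ranked[-1][0],
        }))
    return lines
```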

Distillation Format (No Training Data Required)


Distillation uses the teacher model's outputs automatically:

```python
# Configuration only - no training data needed
distillation_config = {
    'teacherModelId': 'anthropic.claude-3-5-sonnet-20241022-v2:0',
    'studentModelId': 'anthropic.claude-3-haiku-20240307-v1:0',
    'distillationDataSource': {
        'promptDataset': {
            's3Uri': 's3://bucket/prompts.jsonl'  # Just prompts, no completions
        }
    }
}
```

**Prompt Dataset Format**:
```jsonl
{"prompt": "Explain the water cycle."}
{"prompt": "What are the symptoms of the flu?"}
{"prompt": "Describe photosynthesis."}
```
Requirements:
  • Minimum 1000 prompts (recommended: 10,000+)
  • Teacher model generates completions automatically
  • Student model trained to match teacher outputs
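Because the teacher's completions are what the student learns from, duplicate prompts waste generation budget. A quick dedupe-and-count pass over the prompt dataset (the helper name is illustrative; the 1000-prompt minimum mirrors the list above):

```python
import json

def dedupe_prompts(lines, min_prompts=1000):
    """Drop duplicate prompts and check the minimum-count requirement."""
    seen, unique = set(), []
    for line in lines:
        prompt = json.loads(line)["prompt"]
        if prompt not in seen:
            seen.add(prompt)
            unique.append(line)
    return unique, len(unique) >= min_prompts
```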

Quick Start


1. Prepare Training Data


```python
import json

# Fine-tuning examples
training_data = [
    {"prompt": "Classify sentiment: This product exceeded my expectations!", "completion": "Positive"},
    {"prompt": "Classify sentiment: Terrible customer service, very disappointed.", "completion": "Negative"},
    {"prompt": "Classify sentiment: The item was okay, nothing special.", "completion": "Neutral"}
]

# Save as JSONL
with open('training_data.jsonl', 'w') as f:
    for example in training_data:
        f.write(json.dumps(example) + '\n')
```

2. Upload to S3


```python
import boto3

s3 = boto3.client('s3')
bucket_name = 'my-bedrock-training-bucket'

# Upload training data
s3.upload_file('training_data.jsonl', bucket_name, 'fine-tuning/training_data.jsonl')

# Upload validation data (optional but recommended)
s3.upload_file('validation_data.jsonl', bucket_name, 'fine-tuning/validation_data.jsonl')
```

3. Create Customization Job


```python
bedrock = boto3.client('bedrock')

response = bedrock.create_model_customization_job(
    jobName='sentiment-classifier-v1',
    customModelName='sentiment-classifier',
    roleArn='arn:aws:iam::123456789012:role/BedrockCustomizationRole',
    baseModelIdentifier='anthropic.claude-3-haiku-20240307-v1:0',
    trainingDataConfig={
        's3Uri': f's3://{bucket_name}/fine-tuning/training_data.jsonl'
    },
    validationDataConfig={
        's3Uri': f's3://{bucket_name}/fine-tuning/validation_data.jsonl'
    },
    outputDataConfig={
        's3Uri': f's3://{bucket_name}/fine-tuning/output/'
    },
    hyperParameters={
        'epochCount': '3',
        'batchSize': '8',
        'learningRate': '0.00001'
    }
)

job_arn = response['jobArn']
print(f"Customization job created: {job_arn}")
```

4. Monitor Training


```python
# Check job status
response = bedrock.get_model_customization_job(jobIdentifier=job_arn)
status = response['status']  # InProgress, Completed, Failed, Stopped
print(f"Job status: {status}")

if status == 'Completed':
    custom_model_arn = response['outputModelArn']
    print(f"Custom model ARN: {custom_model_arn}")
```

5. Deploy and Test


```python
bedrock_runtime = boto3.client('bedrock-runtime')

# Use custom model
response = bedrock_runtime.invoke_model(
    modelId=custom_model_arn,
    body=json.dumps({
        "prompt": "Classify sentiment: I love this product!",
        "max_tokens": 50
    })
)

result = json.loads(response['body'].read())
print(f"Prediction: {result['completion']}")
```

Operations


create-fine-tuning-job


Create a supervised fine-tuning job with labeled examples.
```python
import boto3
import json

def create_fine_tuning_job(
    job_name: str,
    model_name: str,
    base_model_id: str,
    training_s3_uri: str,
    output_s3_uri: str,
    role_arn: str,
    validation_s3_uri: str = None,
    hyper_params: dict = None
) -> str:
    """
    Create fine-tuning job for task-specific adaptation.

    Args:
        job_name: Unique job identifier
        model_name: Name for custom model
        base_model_id: Base model ARN (e.g., Claude 3 Haiku)
        training_s3_uri: S3 path to training JSONL
        output_s3_uri: S3 path for outputs
        role_arn: IAM role with Bedrock + S3 permissions
        validation_s3_uri: Optional validation dataset
        hyper_params: Training hyperparameters

    Returns:
        Job ARN for monitoring
    """
    bedrock = boto3.client('bedrock')

    # Default hyperparameters
    if hyper_params is None:
        hyper_params = {
            'epochCount': '3',           # Number of training epochs
            'batchSize': '8',            # Batch size (4, 8, 16, 32)
            'learningRate': '0.00001',   # Learning rate (0.00001 - 0.0001)
            'learningRateWarmupSteps': '0'
        }

    # Build configuration
    config = {
        'jobName': job_name,
        'customModelName': model_name,
        'roleArn': role_arn,
        'baseModelIdentifier': base_model_id,
        'trainingDataConfig': {
            's3Uri': training_s3_uri
        },
        'outputDataConfig': {
            's3Uri': output_s3_uri
        },
        'hyperParameters': hyper_params,
        'customizationType': 'FINE_TUNING'
    }

    # Add validation data if provided
    if validation_s3_uri:
        config['validationDataConfig'] = {
            's3Uri': validation_s3_uri
        }

    # Create job
    response = bedrock.create_model_customization_job(**config)

    print(f"Fine-tuning job created: {response['jobArn']}")
    return response['jobArn']
```

```python
# Example: Fine-tune Claude 3 Haiku for medical classification
job_arn = create_fine_tuning_job(
    job_name='medical-classifier-v1',
    model_name='medical-classifier',
    base_model_id='anthropic.claude-3-haiku-20240307-v1:0',
    training_s3_uri='s3://my-bucket/medical/training.jsonl',
    output_s3_uri='s3://my-bucket/medical/output/',
    role_arn='arn:aws:iam::123456789012:role/BedrockCustomizationRole',
    validation_s3_uri='s3://my-bucket/medical/validation.jsonl',
    hyper_params={
        'epochCount': '5',
        'batchSize': '16',
        'learningRate': '0.00002'
    }
)
```

create-continued-pretraining-job


Create a continued pre-training job for domain adaptation.
```python
def create_continued_pretraining_job(
    job_name: str,
    model_name: str,
    base_model_id: str,
    training_s3_uri: str,
    output_s3_uri: str,
    role_arn: str,
    validation_s3_uri: str = None
) -> str:
    """
    Create continued pre-training job for domain knowledge.

    Args:
        job_name: Unique job identifier
        model_name: Name for custom model
        base_model_id: Base model ARN
        training_s3_uri: S3 path to unlabeled text JSONL
        output_s3_uri: S3 path for outputs
        role_arn: IAM role ARN
        validation_s3_uri: Optional validation dataset

    Returns:
        Job ARN for monitoring
    """
    bedrock = boto3.client('bedrock')

    config = {
        'jobName': job_name,
        'customModelName': model_name,
        'roleArn': role_arn,
        'baseModelIdentifier': base_model_id,
        'trainingDataConfig': {
            's3Uri': training_s3_uri
        },
        'outputDataConfig': {
            's3Uri': output_s3_uri
        },
        'hyperParameters': {
            'epochCount': '1',  # Usually 1 epoch for continued pre-training
            'batchSize': '16',
            'learningRate': '0.000005'  # Lower LR for stability
        },
        'customizationType': 'CONTINUED_PRE_TRAINING'
    }

    if validation_s3_uri:
        config['validationDataConfig'] = {
            's3Uri': validation_s3_uri
        }

    response = bedrock.create_model_customization_job(**config)

    print(f"Continued pre-training job created: {response['jobArn']}")
    return response['jobArn']
```

```python
# Example: Adapt Claude for medical domain
job_arn = create_continued_pretraining_job(
    job_name='medical-domain-adapter-v1',
    model_name='claude-medical',
    base_model_id='anthropic.claude-3-5-sonnet-20241022-v2:0',
    training_s3_uri='s3://my-bucket/medical-corpus/documents.jsonl',
    output_s3_uri='s3://my-bucket/medical-corpus/output/',
    role_arn='arn:aws:iam::123456789012:role/BedrockCustomizationRole'
)
```

create-reinforcement-finetuning-job


Create a reinforcement fine-tuning job with preference data (NEW 2025).
```python
def create_reinforcement_finetuning_job(
    job_name: str,
    model_name: str,
    base_model_id: str,
    preference_s3_uri: str,
    output_s3_uri: str,
    role_arn: str,
    algorithm: str = 'DPO'  # DPO, PPO, or RLAIF
) -> str:
    """
    Create reinforcement fine-tuning job for alignment (NEW 2025).

    Args:
        job_name: Unique job identifier
        model_name: Name for custom model
        base_model_id: Base model ARN
        preference_s3_uri: S3 path to preference pairs JSONL
        output_s3_uri: S3 path for outputs
        role_arn: IAM role ARN
        algorithm: RL algorithm (DPO, PPO, RLAIF)

    Returns:
        Job ARN for monitoring
    """
    bedrock = boto3.client('bedrock')

    config = {
        'jobName': job_name,
        'customModelName': model_name,
        'roleArn': role_arn,
        'baseModelIdentifier': base_model_id,
        'trainingDataConfig': {
            's3Uri': preference_s3_uri
        },
        'outputDataConfig': {
            's3Uri': output_s3_uri
        },
        'hyperParameters': {
            'epochCount': '3',
            'batchSize': '8',
            'learningRate': '0.00001',
            'rlAlgorithm': algorithm,
            'beta': '0.1'  # KL divergence coefficient
        },
        'customizationType': 'REINFORCEMENT_FINE_TUNING'
    }

    response = bedrock.create_model_customization_job(**config)

    print(f"Reinforcement fine-tuning job created: {response['jobArn']}")
    print(f"Expected accuracy gains: 40-66% improvement")
    return response['jobArn']
```

```python
# Example: Improve response quality with preference learning
job_arn = create_reinforcement_finetuning_job(
    job_name='claude-aligned-v1',
    model_name='claude-aligned',
    base_model_id='anthropic.claude-3-5-sonnet-20241022-v2:0',
    preference_s3_uri='s3://my-bucket/preferences/pairs.jsonl',
    output_s3_uri='s3://my-bucket/preferences/output/',
    role_arn='arn:aws:iam::123456789012:role/BedrockCustomizationRole',
    algorithm='DPO'  # Direct Preference Optimization
)
```

create-distillation-job


Create a distillation job to transfer knowledge from a large model to a smaller one.
```python
def create_distillation_job(
    job_name: str,
    model_name: str,
    teacher_model_id: str,
    student_model_id: str,
    prompts_s3_uri: str,
    output_s3_uri: str,
    role_arn: str
) -> str:
    """
    Create distillation job to compress large model knowledge.

    Args:
        job_name: Unique job identifier
        model_name: Name for distilled model
        teacher_model_id: Large model to learn from
        student_model_id: Small model to train
        prompts_s3_uri: S3 path to prompts JSONL
        output_s3_uri: S3 path for outputs
        role_arn: IAM role ARN

    Returns:
        Job ARN for monitoring
    """
    bedrock = boto3.client('bedrock')

    config = {
        'jobName': job_name,
        'customModelName': model_name,
        'roleArn': role_arn,
        'baseModelIdentifier': student_model_id,
        'trainingDataConfig': {
            's3Uri': prompts_s3_uri,
            'teacherModelIdentifier': teacher_model_id
        },
        'outputDataConfig': {
            's3Uri': output_s3_uri
        },
        'hyperParameters': {
            'epochCount': '3',
            'batchSize': '16',
            'learningRate': '0.00002',
            'temperature': '1.0',  # Softmax temperature for distillation
            'alpha': '0.5'         # Balance between hard and soft targets
        },
        'customizationType': 'DISTILLATION'
    }

    response = bedrock.create_model_customization_job(**config)

    print(f"Distillation job created: {response['jobArn']}")
    print(f"Teacher: {teacher_model_id}")
    print(f"Student: {student_model_id}")
    print(f"Expected: 80-90% teacher quality at 50-70% cost")
    return response['jobArn']
```

```python
# Example: Distill Claude 3.5 Sonnet to Haiku
job_arn = create_distillation_job(
    job_name='claude-haiku-distilled-v1',
    model_name='claude-haiku-distilled',
    teacher_model_id='anthropic.claude-3-5-sonnet-20241022-v2:0',
    student_model_id='anthropic.claude-3-haiku-20240307-v1:0',
    prompts_s3_uri='s3://my-bucket/distillation/prompts.jsonl',
    output_s3_uri='s3://my-bucket/distillation/output/',
    role_arn='arn:aws:iam::123456789012:role/BedrockCustomizationRole'
)
```

monitor-job

Track training progress and retrieve metrics.
python
import time
from typing import Dict, Any

import boto3

def monitor_job(job_arn: str, poll_interval: int = 60) -> Dict[str, Any]:
    """
    Monitor customization job until completion.

    Args:
        job_arn: Job ARN to monitor
        poll_interval: Seconds between status checks

    Returns:
        Final job details with metrics
    """
    bedrock = boto3.client('bedrock')

    print(f"Monitoring job: {job_arn}")

    while True:
        response = bedrock.get_model_customization_job(
            jobIdentifier=job_arn
        )

        status = response['status']

        print(f"Status: {status}", end='')

        # Show metrics if available
        if 'trainingMetrics' in response:
            metrics = response['trainingMetrics']
            if 'trainingLoss' in metrics:
                print(f" | Loss: {metrics['trainingLoss']:.4f}", end='')

        print()  # Newline

        # Check terminal states
        if status == 'Completed':
            print(f"Job completed successfully!")
            print(f"Custom model ARN: {response['outputModelArn']}")
            return response

        elif status == 'Failed':
            print(f"Job failed: {response.get('failureMessage', 'Unknown error')}")
            return response

        elif status == 'Stopped':
            print(f"Job was stopped")
            return response

        # Wait before next check
        time.sleep(poll_interval)

Example: Monitor with automatic polling

job_details = monitor_job(job_arn, poll_interval=60)

if job_details['status'] == 'Completed':
    custom_model_arn = job_details['outputModelArn']

    # Download metrics from S3
    output_uri = job_details['outputDataConfig']['s3Uri']
    print(f"Metrics available at: {output_uri}")

deploy-custom-model

Provision custom model for inference.
python
def deploy_custom_model(
    model_arn: str,
    provisioned_model_name: str,
    model_units: int = 1
) -> str:
    """
    Deploy custom model with provisioned throughput.

    Args:
        model_arn: Custom model ARN from training job
        provisioned_model_name: Name for provisioned model
        model_units: Throughput units (1-10)

    Returns:
        Provisioned model ARN for inference
    """
    bedrock = boto3.client('bedrock')

    response = bedrock.create_provisioned_model_throughput(
        provisionedModelName=provisioned_model_name,
        modelId=model_arn,
        modelUnits=model_units
    )

    provisioned_arn = response['provisionedModelArn']

    print(f"Provisioned model created: {provisioned_arn}")
    print(f"Throughput: {model_units} units")
    print(f"Allow 5-10 minutes for provisioning")

    return provisioned_arn

Example: Deploy with standard throughput

provisioned_arn = deploy_custom_model(
    model_arn='arn:aws:bedrock:us-east-1:123456789012:custom-model/medical-classifier-v1',
    provisioned_model_name='medical-classifier-prod',
    model_units=2
)

# Wait for provisioning
time.sleep(300)  # 5 minutes

# Use provisioned model
bedrock_runtime = boto3.client('bedrock-runtime')
response = bedrock_runtime.invoke_model(
    modelId=provisioned_arn,
    body=json.dumps({
        "prompt": "Classify: Patient has fever and cough.",
        "max_tokens": 100
    })
)
result = json.loads(response['body'].read())
print(f"Prediction: {result['completion']}")
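Provisioned throughput bills hourly until it is explicitly deleted, so tear it down once the model is no longer needed. A sketch (the injectable `bedrock` parameter is our own testing convenience; the underlying call is the standard `delete_provisioned_model_throughput` API):

```python
def teardown_provisioned_model(provisioned_arn: str, bedrock=None) -> str:
    """Delete a provisioned model so it stops accruing hourly charges."""
    if bedrock is None:
        import boto3  # deferred so the function can be exercised with a stub client
        bedrock = boto3.client('bedrock')
    bedrock.delete_provisioned_model_throughput(provisionedModelId=provisioned_arn)
    print(f"Deleted provisioned throughput: {provisioned_arn}")
    return provisioned_arn
```

Note that deletion is blocked while a commitment term (1 or 6 months) is still active.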

evaluate-model

Test custom model performance with evaluation dataset.
python
import pandas as pd
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def evaluate_model(
    model_id: str,
    test_data_path: str,
    output_path: str = None
) -> Dict[str, float]:
    """
    Evaluate custom model on test dataset.

    Args:
        model_id: Custom model ARN
        test_data_path: Path to test JSONL file
        output_path: Optional path to save predictions

    Returns:
        Evaluation metrics dictionary
    """
    bedrock_runtime = boto3.client('bedrock-runtime')

    # Load test data
    test_data = []
    with open(test_data_path, 'r') as f:
        for line in f:
            test_data.append(json.loads(line))

    # Run predictions
    predictions = []
    ground_truth = []

    print(f"Evaluating {len(test_data)} examples...")

    for i, example in enumerate(test_data):
        if i % 10 == 0:
            print(f"Progress: {i}/{len(test_data)}")

        # Invoke model
        response = bedrock_runtime.invoke_model(
            modelId=model_id,
            body=json.dumps({
                "prompt": example['prompt'],
                "max_tokens": 200
            })
        )

        result = json.loads(response['body'].read())
        prediction = result['completion'].strip()

        predictions.append(prediction)
        ground_truth.append(example['completion'].strip())

    # Calculate metrics
    accuracy = accuracy_score(ground_truth, predictions)
    precision, recall, f1, _ = precision_recall_fscore_support(
        ground_truth, predictions, average='weighted', zero_division=0
    )

    metrics = {
        'accuracy': accuracy,
        'precision': precision,
        'recall': recall,
        'f1_score': f1,
        'total_examples': len(test_data)
    }

    print("\n=== Evaluation Results ===")
    print(f"Accuracy:  {accuracy:.4f}")
    print(f"Precision: {precision:.4f}")
    print(f"Recall:    {recall:.4f}")
    print(f"F1 Score:  {f1:.4f}")

    # Save predictions if requested
    if output_path:
        results_df = pd.DataFrame({
            'prompt': [ex['prompt'] for ex in test_data],
            'ground_truth': ground_truth,
            'prediction': predictions
        })
        results_df.to_csv(output_path, index=False)
        print(f"Predictions saved to: {output_path}")

    return metrics

Example: Evaluate medical classifier

metrics = evaluate_model(
    model_id='arn:aws:bedrock:us-east-1:123456789012:provisioned-model/medical-classifier-prod',
    test_data_path='test_data.jsonl',
    output_path='evaluation_results.csv'
)
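To quantify the customization gain, run `evaluate_model` against both the base model and the custom model on the same test set, then diff the two metric dictionaries. A small helper (our own, not part of any Bedrock API):

```python
def compare_metric_runs(base: dict, custom: dict) -> dict:
    """Return per-metric deltas (custom minus base) for the shared numeric metrics."""
    shared = ('accuracy', 'precision', 'recall', 'f1_score')
    return {name: round(custom[name] - base[name], 4)
            for name in shared if name in base and name in custom}
```

A positive delta on every metric is the expected outcome of a successful fine-tune; a gain on recall paired with a precision regression often points back to the dataset imbalance warning raised during data validation.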

Hyperparameter Tuning

Fine-Tuning Parameters

| Parameter | Range | Default | Description |
|---|---|---|---|
| epochCount | 1-10 | 3 | Training passes over dataset |
| batchSize | 4-32 | 8 | Examples per training step |
| learningRate | 0.00001-0.0001 | 0.00001 | Step size for weight updates |
| learningRateWarmupSteps | 0-100 | 0 | Gradual LR increase steps |
Tuning Guidelines:
  • Small dataset (<100 examples): Lower epochs (1-2), smaller batch (4-8)
  • Medium dataset (100-1000): Standard settings (3 epochs, batch 8-16)
  • Large dataset (>1000): Higher epochs (5-10), larger batch (16-32)
  • Overfitting signs: Reduce epochs or increase batch size
  • Underfitting signs: Increase epochs or decrease learning rate
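The guidelines above can be folded into a small helper that picks a starting configuration from dataset size (a sketch of our own; the thresholds simply mirror the bullet points, so treat the output as a starting point for experimentation, not a tuned result):

```python
def suggest_hyperparameters(num_examples: int) -> dict:
    """Map dataset size to a starting fine-tuning config (string values, as the API expects)."""
    if num_examples < 100:        # small: fewer passes, smaller batches
        return {'epochCount': '2', 'batchSize': '4', 'learningRate': '0.00002'}
    if num_examples <= 1000:      # medium: standard settings
        return {'epochCount': '3', 'batchSize': '8', 'learningRate': '0.00001'}
    # large: more passes, bigger batches, gentler learning rate
    return {'epochCount': '5', 'batchSize': '16', 'learningRate': '0.000005'}
```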

Example Configurations

python

Configuration 1: Small dataset, quick iteration

small_dataset_params = {
    'epochCount': '2',
    'batchSize': '4',
    'learningRate': '0.00002',
    'learningRateWarmupSteps': '10'
}

Configuration 2: Balanced, general purpose

balanced_params = {
    'epochCount': '3',
    'batchSize': '8',
    'learningRate': '0.00001',
    'learningRateWarmupSteps': '0'
}

Configuration 3: Large dataset, high quality

large_dataset_params = {
    'epochCount': '5',
    'batchSize': '16',
    'learningRate': '0.000005',
    'learningRateWarmupSteps': '20'
}

Configuration 4: Continued pre-training

pretraining_params = {
    'epochCount': '1',
    'batchSize': '16',
    'learningRate': '0.000005',
    'learningRateWarmupSteps': '0'
}

Data Preparation Best Practices

1. Data Quality

python
def validate_training_data(data_path: str) -> bool:
    """
    Validate training data quality.

    Checks:
    - JSONL format validity
    - Required fields present
    - Token length within limits
    - Data distribution balance
    """
    import json
    from collections import Counter

    issues = []
    completion_distribution = Counter()

    with open(data_path, 'r') as f:
        for i, line in enumerate(f, 1):
            try:
                example = json.loads(line)
            except json.JSONDecodeError:
                issues.append(f"Line {i}: Invalid JSON")
                continue

            # Check required fields
            if 'prompt' not in example:
                issues.append(f"Line {i}: Missing 'prompt' field")
            if 'completion' not in example:
                issues.append(f"Line {i}: Missing 'completion' field")

            # Track completion distribution
            if 'completion' in example:
                completion_distribution[example['completion']] += 1

            # Check token length (approximate)
            prompt_tokens = len(example.get('prompt', '').split())
            completion_tokens = len(example.get('completion', '').split())
            total_tokens = prompt_tokens + completion_tokens

            if total_tokens > 8000:  # Conservative estimate
                issues.append(f"Line {i}: Likely exceeds 32K token limit")

    # Report issues
    if issues:
        print("Data Validation Issues:")
        for issue in issues[:10]:  # Show first 10
            print(f"  - {issue}")
        if len(issues) > 10:
            print(f"  ... and {len(issues) - 10} more issues")
        return False

    # Check distribution balance
    print("\nCompletion Distribution:")
    for completion, count in completion_distribution.most_common():
        print(f"  {completion}: {count}")

    # Warn about imbalance (guard against an empty dataset)
    counts = list(completion_distribution.values())
    if counts and max(counts) > 3 * min(counts):
        print("\nWarning: Imbalanced dataset detected")
        print("Consider balancing or stratified sampling")

    print("\nValidation passed!")
    return True

Example usage

validate_training_data('training_data.jsonl')

2. Data Augmentation

python
def augment_training_data(
    input_path: str,
    output_path: str,
    augmentation_factor: int = 2
):
    """
    Augment training data with paraphrasing and variations.

    Args:
        input_path: Original training data
        output_path: Augmented output file
        augmentation_factor: Multiplier for dataset size
    """
    import random

    # Load original data
    original_data = []
    with open(input_path, 'r') as f:
        for line in f:
            original_data.append(json.loads(line))

    # Augmentation strategies
    prompt_prefixes = [
        "",
        "Please ",
        "Could you ",
        "I need you to "
    ]

    augmented_data = []

    for example in original_data:
        # Include original
        augmented_data.append(example)

        # Create variations
        for _ in range(augmentation_factor - 1):
            prefix = random.choice(prompt_prefixes)
            augmented_example = {
                'prompt': prefix + example['prompt'],
                'completion': example['completion']
            }
            augmented_data.append(augmented_example)

    # Save augmented data
    with open(output_path, 'w') as f:
        for example in augmented_data:
            f.write(json.dumps(example) + '\n')

    print(f"Augmented {len(original_data)} -> {len(augmented_data)} examples")

Example usage

augment_training_data('training_data.jsonl', 'training_data_augmented.jsonl')

3. Train/Validation Split

python
def split_dataset(
    input_path: str,
    train_path: str,
    val_path: str,
    val_split: float = 0.2
):
    """
    Split dataset into training and validation sets.

    Args:
        input_path: Full dataset JSONL
        train_path: Output training JSONL
        val_path: Output validation JSONL
        val_split: Fraction for validation (0.1-0.3)
    """
    import random

    # Load data
    data = []
    with open(input_path, 'r') as f:
        for line in f:
            data.append(json.loads(line))

    # Shuffle
    random.shuffle(data)

    # Split
    val_size = int(len(data) * val_split)
    train_data = data[val_size:]
    val_data = data[:val_size]

    # Save
    with open(train_path, 'w') as f:
        for example in train_data:
            f.write(json.dumps(example) + '\n')

    with open(val_path, 'w') as f:
        for example in val_data:
            f.write(json.dumps(example) + '\n')

    print(f"Split: {len(train_data)} training, {len(val_data)} validation")

Example usage

split_dataset('full_dataset.jsonl', 'training.jsonl', 'validation.jsonl', val_split=0.2)
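When the data-quality validation flags an imbalanced dataset, a plain random split can leave a rare label almost entirely out of the validation set. A stratified split keeps each completion label at the same ratio in both sets. A sketch that works on in-memory examples (our own helper; write the resulting lists back out with the same JSONL loop as `split_dataset`):

```python
import random
from collections import defaultdict

def stratified_split(examples: list, val_split: float = 0.2, seed: int = 42):
    """Split examples into (train, val), preserving the per-completion label ratio."""
    by_label = defaultdict(list)
    for example in examples:
        by_label[example['completion']].append(example)

    rng = random.Random(seed)  # seeded for reproducible splits
    train, val = [], []
    for group in by_label.values():
        rng.shuffle(group)
        cut = int(len(group) * val_split)
        val.extend(group[:cut])
        train.extend(group[cut:])
    return train, val
```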

Cost Considerations

Training Costs

Cost Structure:
  • Fine-tuning: $0.01-0.05 per 1000 tokens processed
  • Continued pre-training: $0.02-0.08 per 1000 tokens processed
  • Reinforcement fine-tuning: $0.03-0.10 per 1000 tokens processed
  • Distillation: $0.02-0.06 per 1000 tokens processed
Example Calculations:
python
def estimate_training_cost(
    num_examples: int,
    avg_tokens_per_example: int,
    num_epochs: int,
    cost_per_1k_tokens: float = 0.03
) -> float:
    """
    Estimate training cost.

    Args:
        num_examples: Number of training examples
        avg_tokens_per_example: Average tokens (prompt + completion)
        num_epochs: Training epochs
        cost_per_1k_tokens: Cost rate

    Returns:
        Estimated cost in USD
    """
    total_tokens = num_examples * avg_tokens_per_example * num_epochs
    cost = (total_tokens / 1000) * cost_per_1k_tokens

    print(f"Training Examples: {num_examples:,}")
    print(f"Avg Tokens/Example: {avg_tokens_per_example}")
    print(f"Epochs: {num_epochs}")
    print(f"Total Tokens: {total_tokens:,}")
    print(f"Estimated Cost: ${cost:.2f}")

    return cost

Example: Fine-tune with 1000 examples

estimate_training_cost(
    num_examples=1000,
    avg_tokens_per_example=500,
    num_epochs=3,
    cost_per_1k_tokens=0.03
)

Output: ~$45

Inference Costs

Provisioned Throughput Pricing:
  • Model Units: $X per hour per unit
  • Cost varies by base model
  • Minimum commitment: 1 month or 6 months
Cost Optimization:
python
def compare_model_costs(
    requests_per_day: int,
    avg_tokens_per_request: int
):
    """
    Compare on-demand vs provisioned vs distilled model costs.
    """
    # Base Claude 3.5 Sonnet on-demand: $3/$15 per 1M tokens
    base_cost_input = (requests_per_day * avg_tokens_per_request * 30) / 1_000_000 * 3
    base_cost_output = (requests_per_day * avg_tokens_per_request * 0.5 * 30) / 1_000_000 * 15
    base_monthly = base_cost_input + base_cost_output

    # Provisioned throughput: ~$2500/month per unit
    provisioned_monthly = 2500

    # Distilled to Haiku: 50% cost reduction
    distilled_monthly = base_monthly * 0.5

    print(f"Monthly Cost Comparison ({requests_per_day:,} requests/day):")
    print(f"  Base Model On-Demand:  ${base_monthly:.2f}")
    print(f"  Provisioned (1 unit):  ${provisioned_monthly:.2f}")
    print(f"  Distilled Model:       ${distilled_monthly:.2f}")

    # Breakeven analysis
    if base_monthly > provisioned_monthly:
        print(f"\nProvisioned throughput recommended (saves ${base_monthly - provisioned_monthly:.2f}/mo)")
    else:
        print(f"\nOn-demand recommended (saves ${provisioned_monthly - base_monthly:.2f}/mo)")

Example comparison

compare_model_costs(requests_per_day=10000, avg_tokens_per_request=1000)
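Under the same assumptions as `compare_model_costs` ($3/$15 per 1M input/output tokens, outputs at half the input length, ~$2,500/month per provisioned unit), the breakeven traffic level can be computed directly. A sketch using those illustrative rates, not current AWS pricing:

```python
def breakeven_requests_per_day(
    avg_tokens_per_request: int,
    provisioned_monthly: float = 2500.0,
) -> float:
    """Daily request volume at which on-demand spend matches one provisioned unit."""
    # Monthly on-demand cost contributed by each request/day, matching
    # compare_model_costs: input at $3/1M tokens plus output (half the
    # tokens) at $15/1M tokens, over a 30-day month.
    monthly_cost_per_daily_request = (
        avg_tokens_per_request * 30 / 1_000_000 * (3 + 0.5 * 15)
    )
    return provisioned_monthly / monthly_cost_per_daily_request
```

At 1,000 tokens per request this crosses over near ~7,900 requests/day; above that volume, provisioned throughput is the cheaper option under these assumed rates.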

Related Skills

  • bedrock-inference: Invoke foundation models and custom models
  • bedrock-knowledge-bases: RAG with custom models
  • bedrock-guardrails: Apply safety policies to custom models
  • bedrock-agentcore: Build agents with custom models
  • claude-cost-optimization: Optimize model selection and costs
  • claude-context-management: Manage context for custom models
  • boto3-ecs: Deploy custom model inference on ECS
  • boto3-eks: Deploy custom model inference on EKS

Additional Resources
