Record results systematically:
- Accuracy on validation set
- Model file size
- Training time
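One lightweight way to keep these records is an append-only CSV log. A sketch (the helper and column names are illustrative, not part of fastText):

```python
import csv
import os

FIELDS = ["run_id", "accuracy", "size_mb", "train_seconds"]

def log_result(path, run_id, accuracy, size_mb, train_seconds):
    """Append one experiment row, writing a header when the file is new."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="") as f:
        w = csv.writer(f)
        if is_new:
            w.writerow(FIELDS)
        w.writerow([run_id, accuracy, size_mb, train_seconds])
```

Each training run then adds one line, and the log can be diffed or plotted later.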
| Parameter | Low | Medium | High | Impact |
|---|---|---|---|---|
| dim | 50 | 100 | 200 | Size, accuracy |
| epoch | 5 | 15 | 25 | Training time, accuracy |
| lr | 0.1 | 0.5 | 1.0 | Convergence speed |
| wordNgrams | 1 | 2 | 3 | Accuracy, size |
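To cover the table systematically, the value ranges can be enumerated into training configurations with `itertools.product`; the grid below simply mirrors the table values (the training call itself is left out):

```python
from itertools import product

# Value ranges taken directly from the parameter table above
GRID = {
    "dim":        [50, 100, 200],
    "epoch":      [5, 15, 25],
    "lr":         [0.1, 0.5, 1.0],
    "wordNgrams": [1, 2, 3],
}

def grid_configs(grid=GRID):
    """Yield one kwargs dict per combination of table values."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))
```

Each yielded dict can be passed as `fasttext.train_supervised(input=train_file, **config)`; in practice you would subsample the 81 combinations rather than run them all.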
```python
model.quantize(input=train_file, retrain=True)
model.save_model("model.ftz")
```

```python
model = fasttext.train_supervised(
    input=train_file,
    autotuneValidationFile=valid_file,
    autotuneDuration=600,        # seconds
    autotuneModelSize="150M"     # target size constraint
)
```
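After quantizing, it is worth confirming what the smaller `.ftz` file actually bought you. A small sketch comparing on-disk sizes (the helper name is ours, not a fastText API):

```python
import os

def size_report(original_path, quantized_path):
    """Return (original_mb, quantized_mb, compression_ratio) for two model files."""
    orig = os.path.getsize(original_path) / 1e6
    quant = os.path.getsize(quantized_path) / 1e6
    return orig, quant, orig / quant
```

Compare this ratio against the accuracy drop measured on the validation set to decide whether quantization is an acceptable trade.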
```python
import os
import fasttext

# Try loading to verify integrity
model = fasttext.load_model(model_path)
print(f"Labels: {len(model.labels)}")
```
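Before `load_model`, a cheap check on the file itself catches missing or truncated artifacts with a clearer error. A sketch; the `min_bytes` threshold is an arbitrary assumption:

```python
import os

def check_model_file(path, min_bytes=1024):
    """Raise with a clear message if the model file is missing or suspiciously small."""
    if not os.path.exists(path):
        raise FileNotFoundError(f"model file not found: {path}")
    size = os.path.getsize(path)
    if size < min_bytes:
        raise ValueError(f"model file looks truncated ({size} bytes): {path}")
    return size
```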
```python
import time

start_time = time.time()
model = fasttext.train_supervised(input=train_file, epoch=25, verbose=2)
elapsed = time.time() - start_time
print(f"Training completed in {elapsed:.1f} seconds")
```
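When timing several runs, the pattern above can be wrapped in a context manager (our own helper, not part of fastText):

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(label, sink=print):
    """Measure wall-clock time of a block and report it via `sink`."""
    start = time.time()
    yield
    sink(f"{label} completed in {time.time() - start:.1f} seconds")
```

Usage would then read `with timed("Training"): model = fasttext.train_supervised(...)`.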
```python
for epoch in [5, 10, 15, 20, 25]:
    model = fasttext.train_supervised(input=train_file, epoch=epoch)
    acc = evaluate(model, valid_file)  # evaluate: your validation-accuracy helper
    model.save_model(f"model_epoch{epoch}.bin")
    print(f"Epoch {epoch}: accuracy={acc}")
```
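The sweep leaves one accuracy per epoch setting; choosing the winner is then a max over the recorded pairs. A sketch of that bookkeeping:

```python
def best_setting(results):
    """Given {epoch: accuracy}, return the (epoch, accuracy) pair with highest accuracy."""
    epoch = max(results, key=results.get)
    return epoch, results[epoch]
```

The matching `model_epoch{N}.bin` file saved in the loop is then the one to keep.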
```python
import datetime

timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
model.save_model(f"model_{timestamp}.bin")
```

```
START
│
▼
Run quick baseline (dim=50, epoch=5)
│
▼
Does baseline meet accuracy target?
│
├─ YES → Check size constraint
│         ├─ Meets size → DONE
│         └─ Exceeds size → Apply quantization or reduce dim
│
└─ NO → Increase model capacity
         │
         ▼
         Try: higher dim, more epochs, wordNgrams=2
         │
         ▼
         Does improved model meet accuracy?
         ├─ YES → Check size, apply compression if needed
         └─ NO → Try autotune with validation file
```
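The decision flow above can be encoded as a small function over measured metrics; the thresholds and returned labels here are illustrative only:

```python
def next_action(accuracy, size_mb, acc_target, size_limit_mb, tried_boost=False):
    """Map the tuning flowchart onto measured metrics; returns a suggested next step."""
    if accuracy >= acc_target:
        if size_mb <= size_limit_mb:
            return "done"
        return "quantize or reduce dim"
    if not tried_boost:
        return "increase capacity: higher dim, more epochs, wordNgrams=2"
    return "autotune with validation file"
```

Calling this after each evaluation keeps the loop honest: accuracy is checked before size, and autotune is the fallback only after a manual capacity boost has been tried.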