Validates dataset formatting and quality for SageMaker model fine-tuning (SFT, DPO, or RLVR). Use when the user says "is my dataset okay", "evaluate my data", "check my training data", "I have my own data", or before starting any fine-tuning job. Detects file format, checks schema compliance against the selected model and technique, and reports whether the data is ready for training or evaluation.
npx skill4agent add awslabs/agent-plugins dataset-evaluation
references/strategy_data_requirements.md
# With the file path argument identified in workflow step 1
python scripts/format_detector.py local_path/to/dataset
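
The format-detection step can be pictured with a minimal sketch. This is not the contents of scripts/format_detector.py, which are not shown here; the function names `detect_format` and `detect_file` and the returned labels are hypothetical, and the sketch assumes the dataset is UTF-8 text holding either a JSON document or JSON Lines.

```python
import json
from pathlib import Path


def detect_format(text: str) -> str:
    """Guess the layout of dataset text (hypothetical labels, illustration only)."""
    text = text.strip()
    if not text:
        return "empty"
    try:
        parsed = json.loads(text)
        # The whole text is a single JSON document.
        return "json_array" if isinstance(parsed, list) else "json_object"
    except json.JSONDecodeError:
        pass
    # Otherwise, check whether every non-blank line is its own JSON object.
    lines = [ln for ln in text.splitlines() if ln.strip()]
    try:
        if all(isinstance(json.loads(ln), dict) for ln in lines):
            return "jsonl"
    except json.JSONDecodeError:
        pass
    return "unknown"


def detect_file(path: str) -> str:
    """Read a local file and classify it with detect_format."""
    return detect_format(Path(path).read_text(encoding="utf-8"))
```

A file with exactly one JSON object on one line is reported as "json_object" in this sketch, because a one-record JSON Lines file and a single JSON document are indistinguishable by content alone. Schema checks against the selected model and technique would be a separate step and are not covered here.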