Caffe CIFAR-10 Build and Training
This skill provides procedural guidance for building the Caffe deep learning framework from source and training models on the CIFAR-10 dataset.
When to Use This Skill
- Building Caffe from source on Ubuntu/Debian systems
- Training CIFAR-10 or similar image classification models with Caffe
- Configuring Caffe for CPU-only execution
- Troubleshooting Caffe build and dependency issues
Critical Requirements Checklist
Before starting, identify ALL requirements from the task specification:
- Execution mode: CPU-only vs GPU (affects solver configuration)
- Iteration count: Specific number of training iterations required
- Output files: Where training logs and models should be saved
- Model checkpoints: Which iteration's model file is expected
Phase 1: Dependency Installation
System Dependencies
Install required packages before attempting to build:
bash
apt-get update && apt-get install -y \
build-essential cmake git \
libprotobuf-dev libleveldb-dev libsnappy-dev \
libhdf5-serial-dev protobuf-compiler \
libatlas-base-dev libgflags-dev libgoogle-glog-dev liblmdb-dev \
libopencv-dev libboost-all-dev \
python3-dev python3-numpy python3-pip
Verification Step
Confirm critical libraries are installed:
bash
dpkg -l | grep -E "libhdf5|libopencv|libboost"
Phase 2: Caffe Source Acquisition
Clone and Checkout
bash
git clone https://github.com/BVLC/caffe.git
cd caffe
git checkout 1.0 # Note: Tag is "1.0", not "1.0.0"
Common Mistake
The release tag is
, not
. Verify with
if uncertain.
Phase 3: Makefile.config Configuration
Create Configuration File
bash
cp Makefile.config.example Makefile.config
Essential Configuration Changes
Apply these modifications to
:
-
CPU-Only Mode (if no GPU available):
-
OpenCV Version (for OpenCV 3.x or 4.x):
Note: OpenCV 4 may require additional compatibility patches.
-
HDF5 Paths (Ubuntu-specific):
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib /usr/lib/x86_64-linux-gnu/hdf5/serial
-
Python Configuration (Python 3):
PYTHON_LIBRARIES := boost_python3 python3.8
PYTHON_INCLUDE := /usr/include/python3.8 /usr/lib/python3/dist-packages/numpy/core/include
Adjust version numbers based on installed Python version.
Configuration Verification
After editing, verify no duplicate definitions exist:
bash
grep -n "PYTHON_INCLUDE\|PYTHON_LIB\|CPU_ONLY" Makefile.config
Ensure each setting appears only once in an uncommented form.
Phase 4: Building Caffe
Memory-Aware Compilation
Avoid using all CPU cores on memory-constrained systems:
bash
# For systems with limited RAM (< 8GB)
make all -j2
# For systems with adequate RAM
make all -j$(nproc)
Build Failure Recovery
If the build fails or is killed (often due to memory):
-
Clean the build:
-
Rebuild with reduced parallelism:
Build Verification
Confirm the binary exists after build:
bash
ls -la .build_release/tools/caffe.bin
# or for CPU-only builds:
ls -la .build_release/tools/caffe
Phase 5: Dataset Preparation
Download CIFAR-10
bash
./data/cifar10/get_cifar10.sh
Convert to LMDB Format
bash
./examples/cifar10/create_cifar10.sh
Verification
Confirm LMDB directories exist:
bash
ls -la examples/cifar10/cifar10_train_lmdb
ls -la examples/cifar10/cifar10_test_lmdb
Phase 6: Solver Configuration
Modify Solver for Requirements
Edit
examples/cifar10/cifar10_quick_solver.prototxt
:
-
Set iteration count:
max_iter: 500 # Or as specified in task
-
Set execution mode:
solver_mode: CPU # Change from GPU if required
Verification
bash
grep -E "max_iter|solver_mode" examples/cifar10/cifar10_quick_solver.prototxt
Phase 7: Training Execution
Run Training with Output Capture
bash
./build/tools/caffe train \
--solver=examples/cifar10/cifar10_quick_solver.prototxt \
2>&1 | tee training_output.txt
Alternative Binary Paths
Depending on build configuration, the binary may be at:
.build_release/tools/caffe
.build_release/tools/caffe.bin
Phase 8: Verification
Required Outputs Checklist
-
Caffe binary exists:
bash
test -f .build_release/tools/caffe && echo "OK" || echo "MISSING"
-
Model file exists (iteration-specific):
bash
ls -la examples/cifar10/cifar10_quick_iter_*.caffemodel
-
Training output captured:
bash
test -f training_output.txt && echo "OK" || echo "MISSING"
-
Solver configured correctly:
bash
grep "solver_mode: CPU" examples/cifar10/cifar10_quick_solver.prototxt
Common Pitfalls
1. Premature Termination
Never stop after
or intermediate steps. Complete the full workflow:
Dependencies -> Build -> Dataset -> Configure -> Train -> Verify
2. Missing Solver Configuration
The solver file must be modified for:
- CPU vs GPU execution mode
- Specific iteration count requirements
3. Skipping Dataset Preparation
Training will fail without LMDB data. Always run both:
4. Build Parallelism Issues
High parallelism (
) can exhaust memory. Start with
on constrained systems.
5. Duplicate Configuration Entries
Multiple edits to
can create duplicate definitions. Always verify single definitions for each setting.
6. Wrong Git Tag
Use
not
for the stable release.
Decision Framework
When encountering issues:
- Build killed: Reduce parallelism, run , rebuild with
- Missing headers: Check HDF5 and OpenCV include paths in Makefile.config
- Python errors: Verify Python version matches configuration
- Training fails immediately: Check dataset preparation completed
- Wrong output location: Verify solver paths and output file redirection