Choose how and where to store football data. Use when the user asks about database choices, file formats, cloud storage, data pipelines, or how to organise their football data project. Also covers publishing and sharing outputs (Streamlit, Observable, GitHub Pages).
npx skill4agent add withqwerty/nutmeg nutmeg-store

docs/accuracy-guardrail.md search_docs .nutmeg.user.md /nutmeg

| Format | Best for | Tools |
|---|---|---|
| JSON | Raw event data, API responses | Any language |
| CSV | Tabular stats, easy to share | Spreadsheets, pandas, R |
| Parquet | Columnar analytics, fast queries | polars, pandas, DuckDB, Arrow |
| SQLite | Relational queries, multiple tables | Any language, DB browser tools |

| Format | Best for | Notes |
|---|---|---|
| Parquet files | Analytics workloads | 5-10x smaller than JSON, fast columnar reads |
| DuckDB | SQL analytics on local files | Queries Parquet/CSV directly, no server needed |
| SQLite | Relational data with joins | Single file, portable, ACID compliant |
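To make the trade-offs between these formats concrete, here is a minimal sketch that takes the same raw JSON events and writes them out as CSV text and as a SQLite table, using only the Python standard library. The event fields (`match_id`, `minute`, `player`, `type`, `xg`) and the helper names are hypothetical and chosen for illustration; real provider schemas will differ.

```python
import csv
import io
import json
import sqlite3

# Hypothetical raw events; real provider schemas (e.g. StatsBomb) differ.
RAW_JSON = """
[
  {"match_id": 1, "minute": 12, "player": "Player A", "type": "Shot", "xg": 0.31},
  {"match_id": 1, "minute": 47, "player": "Player B", "type": "Pass", "xg": null},
  {"match_id": 2, "minute": 5,  "player": "Player A", "type": "Shot", "xg": 0.08}
]
"""

FIELDS = ["match_id", "minute", "player", "type", "xg"]


def events_to_csv(events):
    """Flatten event dicts into CSV text (tabular, easy to share)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(events)  # None values are written as empty cells
    return buffer.getvalue()


def events_to_sqlite(events, path=":memory:"):
    """Load event dicts into a SQLite database so they can be queried with SQL."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE events (match_id INTEGER, minute INTEGER, "
        "player TEXT, type TEXT, xg REAL)"
    )
    conn.executemany(
        "INSERT INTO events VALUES (:match_id, :minute, :player, :type, :xg)",
        events,
    )
    conn.commit()
    return conn


events = json.loads(RAW_JSON)
csv_text = events_to_csv(events)
conn = events_to_sqlite(events)

# Total xG per player, counting shots only.
xg_by_player = conn.execute(
    "SELECT player, ROUND(SUM(xg), 2) FROM events "
    "WHERE type = 'Shot' GROUP BY player"
).fetchall()
print(csv_text)
print(xg_by_player)
```

Passing a file path instead of `:memory:` gives the single portable file described in the table. Writing Parquet or querying with DuckDB would need third-party packages, so those steps are left out of this sketch.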

| Solution | Best for | Cost |
|---|---|---|
| PostgreSQL | Production apps, complex queries | Free (self-hosted) or ~$7/mo (Railway, Supabase) |
| BigQuery | Massive analytical queries | Free tier: 1TB/mo queries |
| Cloudflare R2 | Object storage (raw files) | Free tier: 10GB storage |
| S3 / GCS | Object storage at scale | ~$0.023/GB/mo |
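The project layout that follows can be created with a short script. This is a minimal sketch using only the standard library; `scaffold` and `PROJECT_DIRS` are illustrative names rather than part of Nutmeg, and the `.gitignore` handling assumes the project is tracked with Git.

```python
from pathlib import Path
import tempfile

# Directories taken from the project layout in this document.
PROJECT_DIRS = [
    "data/raw/statsbomb/events",
    "data/raw/fbref/2024",
    "data/processed",
    "data/derived/passing_networks",
    "notebooks",
    "scripts",
    "outputs",
]


def scaffold(root):
    """Create the directory tree under root and make sure .env is gitignored."""
    root = Path(root)
    for rel in PROJECT_DIRS:
        (root / rel).mkdir(parents=True, exist_ok=True)

    # Append ".env" to .gitignore only if it is not already listed.
    gitignore = root / ".gitignore"
    existing = gitignore.read_text().splitlines() if gitignore.exists() else []
    if ".env" not in existing:
        gitignore.write_text("\n".join(existing + [".env"]) + "\n")

    # Return every directory that now exists, relative to root.
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_dir()
    )


# Demonstration in a throwaway directory so nothing is written to the real project.
with tempfile.TemporaryDirectory() as tmp:
    created = scaffold(Path(tmp) / "project")
    print("\n".join(created))
```

The function is safe to re-run: existing directories are left alone and `.env` is added to `.gitignore` only once.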

project/
  data/
    raw/                  # Untouched API/scrape responses
      statsbomb/
        events/
        matches.json
      fbref/
        2024/
    processed/            # Cleaned, transformed data
      events.parquet
      shots.parquet
      passes.parquet
    derived/              # Computed metrics
      xg_model.parquet
      passing_networks/
  notebooks/              # Analysis notebooks
  scripts/                # Data pipeline scripts
  outputs/                # Charts, reports, exports
  .env                    # API keys (gitignored)
  .nutmeg.user.md         # Nutmeg profile

| Platform | Language | Cost | Notes |
|---|---|---|---|
| Streamlit | Python | Free (community cloud) | Most popular for football analytics. Deploy from GitHub |
| Observable | JavaScript | Free tier | Great for D3.js visualisations. Notebooks + Framework |
| Shiny | R | Free (shinyapps.io, 25 hrs/mo) | R ecosystem integration |
| Gradio | Python | Free (HuggingFace Spaces) | Quick ML model demos |

| Platform | Notes |
|---|---|
| GitHub Pages | Free. Good for static charts (D3, matplotlib exports) |
| Cloudflare Pages | Free. Faster, more features than GH Pages |
| Vercel | Free tier. Good for Next.js/Astro sites |

| Method | Best for |
|---|---|
| GitHub repo | Small datasets (< 100MB), code + data together |
| GitHub Releases | Larger files (up to 2GB per file) |
| Kaggle Datasets | Community sharing, discoverable, free |
| HuggingFace Datasets | ML-focused, versioned, free |
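The size thresholds in the table above can be turned into a quick check before publishing. The sketch below is one possible way to do it: `suggest_host` and `plan_sharing` are hypothetical helpers, and sending anything above 2GB to an external dataset host is an assumption, not a rule stated in the table.

```python
from pathlib import Path
import tempfile

MB = 1000 * 1000
REPO_LIMIT = 100 * MB      # "< 100MB" threshold from the table above
RELEASE_LIMIT = 2000 * MB  # "up to 2GB" threshold from the table above


def suggest_host(size_bytes):
    """Suggest a sharing method for one file based on its size in bytes."""
    if size_bytes < REPO_LIMIT:
        return "GitHub repo"
    if size_bytes <= RELEASE_LIMIT:
        return "GitHub Releases"
    # Assumption: anything larger goes to a dedicated dataset host.
    return "External dataset host (e.g. Kaggle, HuggingFace)"


def plan_sharing(data_dir):
    """Map each file under data_dir to a suggested sharing method."""
    return {
        p.relative_to(data_dir).as_posix(): suggest_host(p.stat().st_size)
        for p in sorted(Path(data_dir).rglob("*"))
        if p.is_file()
    }


# Demonstration with a small placeholder file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "shots.parquet").write_bytes(b"\0" * 1024)
    plan = plan_sharing(tmp)
    print(plan)
```

Size is only one factor. Discoverability and versioning, which the table lists for Kaggle and HuggingFace, may matter more than the raw limit when choosing where to publish.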

| Output | Tool | Notes |
|---|---|---|
| Static charts | matplotlib, ggplot2, D3.js | Export as PNG/SVG |
| Animated charts | matplotlib.animation, D3 transitions | Export as GIF/MP4 |
| Twitter/X threads | Chart images + alt text | Accessibility matters |
| Blog posts | Markdown + embedded charts | GitHub Pages, Medium, Substack |
.nutmeg.user.md