nutmeg-store

Store

Help the user choose storage formats, locations, and publishing methods for their football data.

Accuracy

Read and follow `docs/accuracy-guardrail.md` before answering any question about provider-specific facts (IDs, endpoints, schemas, coordinates, rate limits). Always use `search_docs`; never guess from training data.

First: check profile

Read `.nutmeg.user.md`. If it doesn't exist, tell the user to run `/nutmeg` first.

Storage format decision tree

Small projects (< 100MB, single user)

| Format | Best for | Tools |
| --- | --- | --- |
| JSON | Raw event data, API responses | Any language |
| CSV | Tabular stats, easy to share | Spreadsheets, pandas, R |
| Parquet | Columnar analytics, fast queries | polars, pandas, DuckDB, Arrow |
| SQLite | Relational queries, multiple tables | Any language, DB browser tools |

Recommendation: Start with JSON for raw data, Parquet for processed data.
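
To make the SQLite option above concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table names, columns, and sample rows are hypothetical and chosen only for illustration; they do not reflect any provider's schema.

```python
import sqlite3


def build_demo_db(path: str = ":memory:") -> sqlite3.Connection:
    """Create a two-table football database (hypothetical schema)."""
    conn = sqlite3.connect(path)
    conn.executescript(
        """
        CREATE TABLE matches (
            match_id  INTEGER PRIMARY KEY,
            home_team TEXT NOT NULL,
            away_team TEXT NOT NULL
        );
        CREATE TABLE shots (
            shot_id  INTEGER PRIMARY KEY,
            match_id INTEGER NOT NULL REFERENCES matches(match_id),
            team     TEXT NOT NULL,
            xg       REAL NOT NULL
        );
        """
    )
    # Made-up sample rows, not real match data.
    conn.execute("INSERT INTO matches VALUES (?, ?, ?)", (1, "Home FC", "Away United"))
    conn.executemany(
        "INSERT INTO shots (match_id, team, xg) VALUES (?, ?, ?)",
        [(1, "Home FC", 0.25), (1, "Home FC", 0.5), (1, "Away United", 0.125)],
    )
    conn.commit()
    return conn


def xg_by_team(conn: sqlite3.Connection, match_id: int) -> dict:
    """Join shots to matches and sum xG per team for one match."""
    rows = conn.execute(
        """
        SELECT s.team, SUM(s.xg)
        FROM shots AS s
        JOIN matches AS m ON m.match_id = s.match_id
        WHERE m.match_id = ?
        GROUP BY s.team
        ORDER BY s.team
        """,
        (match_id,),
    ).fetchall()
    return dict(rows)
```

Passing a file path such as `data/processed/football.db` instead of `:memory:` would keep everything in a single portable file.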

Medium projects (100MB - 10GB)

| Format | Best for | Notes |
| --- | --- | --- |
| Parquet files | Analytics workloads | 5-10x smaller than JSON, fast columnar reads |
| DuckDB | SQL analytics on local files | Queries Parquet/CSV directly, no server needed |
| SQLite | Relational data with joins | Single file, portable, ACID compliant |

Recommendation: Parquet for storage, DuckDB for querying.

Large projects (> 10GB, multiple users)

| Solution | Best for | Cost |
| --- | --- | --- |
| PostgreSQL | Production apps, complex queries | Free (self-hosted) or ~$7/mo (Railway, Supabase) |
| BigQuery | Massive analytical queries | Free tier: 1TB/mo queries |
| Cloudflare R2 | Object storage (raw files) | Free tier: 10GB storage |
| S3 / GCS | Object storage at scale | ~$0.023/GB/mo |
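
The three size tiers above can be sketched as a small lookup function. This is only an illustration: the function and class names are hypothetical, decimal units (1 MB = 10^6 bytes) are assumed, and treating any multi-user project as "large" regardless of size is an assumption the tiers above do not spell out.

```python
from dataclasses import dataclass

MB = 1000 ** 2  # assumption: decimal megabytes
GB = 1000 ** 3  # assumption: decimal gigabytes


@dataclass(frozen=True)
class StorageAdvice:
    tier: str
    recommendation: str


def recommend_storage(size_bytes: int, multi_user: bool = False) -> StorageAdvice:
    """Map a dataset size (and whether it is shared) to a storage tier."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    # Assumption: multiple users push a project into the large tier.
    if multi_user or size_bytes > 10 * GB:
        return StorageAdvice(
            "large",
            "PostgreSQL or BigQuery for querying; "
            "Cloudflare R2, S3, or GCS for raw files",
        )
    if size_bytes >= 100 * MB:
        return StorageAdvice("medium", "Parquet for storage, DuckDB for querying")
    return StorageAdvice("small", "JSON for raw data, Parquet for processed data")
```

For example, `recommend_storage(2 * GB)` falls in the medium tier, while the same size with `multi_user=True` is treated as large.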

Directory structure

Recommend this structure for football data projects:
project/
  data/
    raw/                  # Untouched API/scrape responses
      statsbomb/
        events/
        matches.json
      fbref/
        2024/
    processed/            # Cleaned, transformed data
      events.parquet
      shots.parquet
      passes.parquet
    derived/              # Computed metrics
      xg_model.parquet
      passing_networks/
  notebooks/              # Analysis notebooks
  scripts/                # Data pipeline scripts
  outputs/                # Charts, reports, exports
  .env                    # API keys (gitignored)
  .nutmeg.user.md         # Nutmeg profile
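
One way to create the layout above is a short scaffolding script. The sketch below uses only the standard library. The function name `scaffold_project` is hypothetical, and adding `.env` to a `.gitignore` file is an assumption about how the "gitignored" note would be enforced.

```python
from __future__ import annotations

from pathlib import Path

# Directories taken from the structure above; files are left for the pipeline to create.
PROJECT_DIRS = [
    "data/raw/statsbomb/events",
    "data/raw/fbref/2024",
    "data/processed",
    "data/derived/passing_networks",
    "notebooks",
    "scripts",
    "outputs",
]


def scaffold_project(root: str | Path) -> list[Path]:
    """Create the project directories under root and gitignore .env."""
    root = Path(root)
    created = []
    for rel in PROJECT_DIRS:
        directory = root / rel
        directory.mkdir(parents=True, exist_ok=True)  # safe to re-run
        created.append(directory)

    # Keep API keys out of version control without duplicating the entry.
    gitignore = root / ".gitignore"
    existing = gitignore.read_text().splitlines() if gitignore.exists() else []
    if ".env" not in existing:
        gitignore.write_text("\n".join(existing + [".env"]) + "\n")
    return created
```

Calling `scaffold_project("project")` a second time leaves existing directories and the `.gitignore` entry unchanged.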

Publishing and sharing

Interactive dashboards

| Platform | Language | Cost | Notes |
| --- | --- | --- | --- |
| Streamlit | Python | Free (community cloud) | Most popular for football analytics. Deploy from GitHub |
| Observable | JavaScript | Free tier | Great for D3.js visualisations. Notebooks + Framework |
| Shiny | R | Free (shinyapps.io, 25 hrs/mo) | R ecosystem integration |
| Gradio | Python | Free (HuggingFace Spaces) | Quick ML model demos |

Static sites

| Platform | Notes |
| --- | --- |
| GitHub Pages | Free. Good for static charts (D3, matplotlib exports) |
| Cloudflare Pages | Free. Faster, more features than GH Pages |
| Vercel | Free tier. Good for Next.js/Astro sites |

Sharing data

| Method | Best for |
| --- | --- |
| GitHub repo | Small datasets (< 100MB), code + data together |
| GitHub Releases | Larger files (up to 2 GiB per file) |
| Kaggle Datasets | Community sharing, discoverable, free |
| HuggingFace Datasets | ML-focused, versioned, free |

Social media / content

| Output | Tool | Notes |
| --- | --- | --- |
| Static charts | matplotlib, ggplot2, D3.js | Export as PNG/SVG |
| Animated charts | matplotlib.animation, D3 transitions | Export as GIF/MP4 |
| Twitter/X threads | Chart images + alt text | Accessibility matters |
| Blog posts | Markdown + embedded charts | GitHub Pages, Medium, Substack |

Cost awareness

Based on the user's `.nutmeg.user.md` goals, flag costs:
  • Exploration/learning: Everything can be free. StatsBomb open data + Jupyter/Colab + GitHub Pages.
  • Content creation: Streamlit Community Cloud is free. Cloudflare Pages is free.
  • Professional: Budget for API access ($100-1000+/mo for Opta/StatsBomb commercial).
  • Product: Database hosting ($7-50/mo), consider data licensing costs separately.
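
The goal-to-cost mapping above could be held in a simple lookup, sketched below. The goal keys and the `flag_costs` name are hypothetical. How goals are actually recorded in `.nutmeg.user.md` is not specified here, so the function takes the goal as a plain string rather than parsing the profile.

```python
# Cost notes copied from the list above, keyed by a hypothetical goal label.
COST_FLAGS = {
    "exploration": "Everything can be free: StatsBomb open data + Jupyter/Colab + GitHub Pages.",
    "content": "Streamlit Community Cloud is free. Cloudflare Pages is free.",
    "professional": "Budget for API access ($100-1000+/mo for Opta/StatsBomb commercial).",
    "product": "Database hosting ($7-50/mo); consider data licensing costs separately.",
}


def flag_costs(goal: str) -> str:
    """Return the cost note for a goal label, ignoring case and surrounding spaces."""
    key = goal.strip().lower()
    if key not in COST_FLAGS:
        known = ", ".join(sorted(COST_FLAGS))
        raise ValueError(f"Unknown goal {goal!r}; expected one of: {known}")
    return COST_FLAGS[key]
```

Raising on an unknown goal, rather than returning a default, keeps a misspelled label from silently hiding a cost warning.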