# IPYNB Notebook (.ipynb)
## Overview
This skill guides you to operate `.ipynb` files and notebook projects in an engineered manner (not limited to Jupyter; also applicable to environments like Google Colab / VS Code notebooks):
- Clear file structure: the notebook serves as the interface, with logic sunk into reusable `scripts/` and `lib/`
- Token-efficient workflow: when AI reads or writes notebooks, read only structure and code where possible, not large outputs
- Presentable mode: structure and output conventions for demos, team sharing, and documentation
- Reproducible environment: prefer `uv`, or fall back to `venv`, to ensure repeatable execution
## Applicable Scenarios
Use this skill in the following scenarios:
- Creating a new notebook project or single notebook
- Reviewing / editing existing `.ipynb` files (especially large files with many outputs and unreadable diffs)
- Organizing notebook project structures, extracting "reusable logic" from notebooks into modules/scripts
- Organizing "runnable, reproducible, exportable" notebooks for demos, sharing, and archiving
- Improving long-term maintainability and version control experience of notebooks
## Core Principles
Notebook is an interface, not a library.
Notebooks are suitable for interactive exploration and narrative presentation; reusable, testable, automatable logic should be placed in:
- `scripts/`: directly runnable scripts (no dependency on the notebook UI)
- `lib/`: reusable modules (imported by both notebooks and scripts)
Benefits of this approach:
- Reuse the same logic across multiple notebooks
- Test key logic without running notebooks
- Easier automation in CI/CD (e.g., export, scheduled data processing)
- Cleaner diffs and more friendly version control
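As a minimal sketch of this split (the module and function names here are hypothetical), a notebook cell imports shared logic rather than defining it inline:

```python
# lib/cleaning.py -- reusable logic shared by notebooks and scripts
def normalize_columns(columns):
    """Lower-case and snake_case column names so every consumer agrees."""
    return [c.strip().lower().replace(" ", "_") for c in columns]


# In a notebook cell (or equally in scripts/build_report.py):
# from lib.cleaning import normalize_columns
print(normalize_columns(["Alarm ID", " Raised At "]))  # -> ['alarm_id', 'raised_at']
```

Because the function lives in `lib/`, it can be unit-tested and reused without ever opening the notebook.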
## Quick Start
### Create a new notebook project (uv recommended)
- Initialize project (uv)

```bash
# Create project directory
mkdir notebook-project && cd notebook-project

# Initialize uv project
uv init

# Add dependencies (pick what you need)
uv add jupyterlab pandas plotly
```
- Set up directory structure

```bash
mkdir -p scripts lib data/{raw,processed} reports docs .archive
touch data/.gitkeep data/raw/.gitkeep data/processed/.gitkeep reports/.gitkeep
```
- Configure `.gitignore`

```gitignore
# Virtual environments
.venv/

# Data and outputs (keep .gitkeep)
data/**
!data/**/
!data/**/.gitkeep
reports/**
!reports/**/
!reports/**/.gitkeep

# Jupyter
.ipynb_checkpoints/

# Python
__pycache__/
*.pyc

# Environment
.env
```
- Start the notebook environment: `uv run jupyter lab`
- Load reference documents when more detailed patterns are needed:
  - `references/file-structure.md`: directory structure and project organization
  - `references/presentation-patterns.md`: demonstration/sharing structure and output conventions
  - `references/token-efficiency.md`: token-efficiency strategies for AI reading/writing notebooks
### Review / compare an existing notebook (structure and code first)
Recommended workflow:
- Check structure first; don't read outputs

```bash
# Cell types and counts
jq '.cells | group_by(.cell_type) | map({type: .[0].cell_type, count: length})' notebook.ipynb

# Code cells with outputs
jq '[.cells[] | select(.cell_type == "code") | select(.outputs | length > 0)] | length' notebook.ipynb
```
- Compare only code cells

```bash
# Extract code sources to compare
jq '.cells[] | select(.cell_type == "code") | .source' notebook1.ipynb > /tmp/code1.json
jq '.cells[] | select(.cell_type == "code") | .source' notebook2.ipynb > /tmp/code2.json
diff /tmp/code1.json /tmp/code2.json
```
- Read notebook content only when necessary
  - Clarify which section or cell type to read before accessing
  - For large notebooks, prefer segmented reading by cell range/topic
  - Details are in `references/token-efficiency.md`
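Since an `.ipynb` file is plain JSON, segmented reading needs nothing beyond the standard library. This sketch (function name and demo notebook are illustrative) returns a range of code-cell sources while ignoring outputs entirely:

```python
import json

def code_sources(nb_json, start=0, stop=None):
    """Return only code-cell sources from notebook JSON, skipping outputs."""
    nb = json.loads(nb_json)
    cells = [c for c in nb["cells"] if c["cell_type"] == "code"]
    return ["".join(c["source"]) for c in cells[start:stop]]

# Minimal fabricated notebook for illustration:
demo = json.dumps({"cells": [
    {"cell_type": "markdown", "source": ["# Title"]},
    {"cell_type": "code", "source": ["x = 1\n"], "outputs": [{"text": "huge output"}]},
    {"cell_type": "code", "source": ["print(x)\n"], "outputs": []},
]})
print(code_sources(demo, 0, 1))  # -> ['x = 1\n']
```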
### Organize a notebook project (extract logic, control outputs, make it reproducible)
Directory organization suggestions are in `references/file-structure.md`. Here are minimal executable migration steps:
- Count root-directory files (e.g. `ls -1 | wc -l`)
- Move scripts to `scripts/`, documents to `docs/`, old notebooks to `.archive/`
- Update imports in notebooks: `from lib import module_name`
- Verify everything still runs (e.g. Restart & Run All)
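The verification step can be partly automated. As a small sketch (`lib.module_name` stands in for the hypothetical moved module from the step above), check that modules still resolve before re-running notebooks:

```python
import importlib.util

def importable(module_name):
    """Return True if the module can be found on the current sys.path."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        return False

print(importable("json"))             # stdlib sanity check -> True
print(importable("lib.module_name"))  # hypothetical module moved into lib/
```

Run this from the project root so `lib/` is on the import path.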
## Reproducible Environment (uv / venv)
### Why prefer uv?

uv is a good fit for:
- Fast, reproducible dependency management
- Running tools inside the project's dependency environment (e.g., `uv run jupyter lab`)
- No pollution of the global Python installation
- Better cross-platform consistency
### Common command patterns
Add dependencies:

```bash
uv add plotly pandas duckdb
```
Install tools (optional):

```bash
uv tool install jupyterlab
```
Run in the project environment: `uv run <command>` (e.g., `uv run jupyter lab`)
Single-file script dependency declaration (for `uv run script.py`):

```python
# /// script
# requires-python = ">=3.11"
# dependencies = [
#     "pandas",
#     "plotly",
# ]
# ///
import pandas as pd
import plotly.express as px

# Script code here
```
If you can't use uv, you can also use `venv` + `pip`, but ensure one-click reproducibility (e.g., with a pinned `requirements.txt` lockfile).
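A minimal venv fallback might look like this (assuming a POSIX shell; the lockfile name is conventional):

```shell
# Create and activate an isolated environment
python3 -m venv .venv
. .venv/bin/activate

# Pin the exact installed versions to a lockfile...
python -m pip freeze > requirements.txt

# ...so the environment can be recreated elsewhere in one step
python -m pip install -r requirements.txt
```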
## Token-Efficient Workflow (for AI and Version Control)
### Default strategy: clean outputs before committing
Recommended pre-commit hook:

```yaml
# .pre-commit-config.yaml
repos:
  - repo: https://github.com/kynan/nbstripout
    rev: 0.6.1
    hooks:
      - id: nbstripout
```
When outputs must be retained (not recommended):

```bash
SKIP=nbstripout git commit -m "Add notebook with visualization outputs"
```
A more common practice: save outputs to `reports/` and keep notebooks in a state where re-running reproduces the outputs (see `references/token-efficiency.md`).
### Query before reading (structure first)
Check structure first:

```bash
jq '.cells | group_by(.cell_type) | map({type: .[0].cell_type, count: length})' notebook.ipynb
```
View only code:

```bash
jq '.cells[] | select(.cell_type == "code") | .source' notebook.ipynb
```
### Outputs should be controllable and reproducible
Prefer output summaries; don't dump large objects directly:

```python
print(f"[OK] Loaded {len(df_alarms):,} rows")
print(f"Columns: {', '.join(df_alarms.columns)}")
print(f"Date range: {df_alarms['timestamp'].min()} to {df_alarms['timestamp'].max()}")
```
Save large outputs to files:

```python
fig.write_html(report_dir / "visualization.html")
print(f"[OK] Saved visualization to {report_dir}/visualization.html")
```
Complete strategies are in `references/token-efficiency.md`.
## Demonstration / Sharing Mode
### Recommended notebook structure
- Title & Overview - Background and objectives
- Preparation - Imports and configuration
- Data Loading - With feedback and error handling
- Summary - High-level statistics
- Visualization - With explanations and usage tips
- Conclusion - Key findings
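The "Data Loading - With feedback and error handling" step above can be sketched like this (the file path and stdlib `csv` usage are illustrative; a real notebook would likely use pandas):

```python
import csv
from pathlib import Path

DATA_PATH = Path("data/raw/alarms.csv")  # hypothetical input file

def load_rows(path):
    """Load CSV rows with explicit status feedback instead of a raw traceback."""
    if not path.exists():
        print(f"[ERR] Missing input: {path} -- generate it first, then re-run")
        return []
    with path.open(newline="") as f:
        rows = list(csv.DictReader(f))
    print(f"[OK] Loaded {len(rows):,} rows from {path}")
    return rows

rows = load_rows(DATA_PATH)
```

The explicit `[ERR]` branch tells the audience exactly what to fix, which matters far more in a live demo than in a private scratch notebook.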
### More "professional" output habits
Unified status output:

```python
print("[OK] Success")
print("[WARN] Warning")
print("[ERR] Error")
print("[INFO] Note")
```
Number formatting:

```python
print(f"Total: {count:,}")  # 2,055 instead of 2055
```
Save to reports by date:

```python
from datetime import datetime
from pathlib import Path

today = datetime.now().strftime('%Y-%m-%d')
report_dir = Path("reports") / today
report_dir.mkdir(parents=True, exist_ok=True)
fig.write_html(report_dir / "chart.html")

# Keep a stable "latest" pointer to the newest report
latest = Path("reports/latest")
if latest.is_symlink() or latest.exists():
    latest.unlink()
latest.symlink_to(today, target_is_directory=True)
```
Complete patterns and templates are in `references/presentation-patterns.md`.
## Resource Index
### `references/file-structure.md`

Includes:
- Recommended directory structure
- File organization rules and naming conventions
- Git-friendly practices (ignore rules, diffs, output cleaning)
- Migration steps for existing projects
- Example structures

Suitable for: loading when creating new projects, refactoring directories, or unifying conventions.
### `references/token-efficiency.md`

Includes:
- Output cleaning and version-control strategies
- Structured query methods that avoid reading outputs
- Segmented reading and diff approaches for large notebooks
- Common `jq` / CLI patterns
- Cell output management

Suitable for: loading when token saving is needed, reviewing large notebooks, or performing automated processing.
### `references/presentation-patterns.md`

Includes:
- Structure templates for demonstration notebooks
- Readability and narrative pacing
- Interactive elements and export strategies
- Error handling and reproducibility checkpoints
- Division of labor between Markdown and code cells
- Notes on exporting to HTML/PDF

Suitable for: loading before creating demos, team sharing, or publishing documentation.
## Best Practices Cheat Sheet
- Structure: notebook as interface, logic sunk into `lib/` / `scripts/`
- Dependencies: prefer uv to ensure one-click reproducibility
- Version control: clean outputs by default (pre-commit / nbstripout / nbconvert)
- Token saving: query structure before reading; save large outputs to files
- Presentation: clear narrative, restrained outputs, explicit error handling
- Reproducibility: ensure "Restart & Run All" works
- Data flow: raw → processed → reports
- Git-friendly: ignore data and artifacts, keep the directory skeleton (`.gitkeep`)
## Example Workflow
```bash
# 1. Create project
mkdir my-analysis && cd my-analysis
uv init
uv add jupyterlab pandas plotly

# 2. Set up structure
mkdir -p scripts lib data/{raw,processed} reports
touch data/.gitkeep data/raw/.gitkeep data/processed/.gitkeep reports/.gitkeep

# 3. Create notebook
uv run jupyter lab

# 4. As you work:
#    - Keep logic in lib/ and scripts/
#    - Save outputs to reports/ with dates
#    - Keep outputs minimal
#    - Strip outputs before committing

# 5. Before presenting:
#    - Run "Restart & Run All" to test
#    - Add context and documentation
#    - Consider exporting to HTML
jupyter nbconvert --to html --execute notebook.ipynb
```
## Cheat Sheet
Directory Organization:
- Notebooks: project root (or a `notebooks/` directory at larger scale)
- Scripts: `scripts/`
- Modules: `lib/`
- Data: `data/raw/`, `data/processed/`
- Reports: `reports/`
- Archive: `.archive/`
Common uv Commands:
- `uv init`: initialize project
- `uv add <pkg>`: add dependencies
- `uv run <cmd>`: run command in the project environment
- `uvx <tool>`: run a temporary tool (not written to project dependencies)
Token Saving:
- Clean outputs: pre-commit hook, or `jupyter nbconvert --ClearOutputPreprocessor.enabled=True --inplace notebook.ipynb`
- Query structure: `jq '.cells | group_by(.cell_type)'`
- Compare code: `jq '.cells[] | select(.cell_type == "code") | .source'`
Presentation:
- Number formatting: `f"{count:,}"`
- Save files by date: `reports/YYYY-MM-DD/`
- Execution verification: `jupyter nbconvert --execute`