sankey-diagram-creator

Original🇺🇸 English
Translated
1 scriptsChecked / no sensitive code detected

Create interactive Sankey diagrams for flow visualization from CSV, DataFrame, or dict data. Supports node/link styling and HTML/PNG/SVG export.

1installs
Added on

NPX Install

npx skill4agent add dkyazzentwatwa/chatgpt-skills sankey-diagram-creator

Sankey Diagram Creator

Create interactive Sankey diagrams to visualize flows, transfers, and relationships between categories. Perfect for budget flows, energy transfers, user journeys, and data pipelines.

Quick Start

python
from scripts.sankey_creator import SankeyCreator

# From dictionary
sankey = SankeyCreator()
sankey.from_dict({
    'source': ['A', 'A', 'B', 'B'],
    'target': ['C', 'D', 'C', 'D'],
    'value': [10, 20, 15, 25]
})
sankey.generate().save_html("flow.html")

# From CSV
sankey = SankeyCreator()
sankey.from_csv("flows.csv", source="from", target="to", value="amount")
sankey.title("Budget Flow").generate().save_html("budget.html")

Features

  • Multiple Input Sources: Dict, DataFrame, or CSV files
  • Interactive Output: Hover tooltips with flow details
  • Node Customization: Colors, labels, positions
  • Link Styling: Colors, opacity, labels
  • Export Formats: HTML (interactive), PNG, SVG
  • Auto-Coloring: Automatic color assignment or custom schemes

API Reference

Initialization

python
sankey = SankeyCreator()

Data Input Methods

python
# From dictionary
sankey.from_dict({
    'source': ['A', 'A', 'B'],
    'target': ['X', 'Y', 'X'],
    'value': [10, 20, 15]
})

# From CSV file
sankey.from_csv(
    filepath="data.csv",
    source="source_col",
    target="target_col",
    value="value_col",
    label=None  # Optional: column for link labels
)

# From pandas DataFrame
import pandas as pd
df = pd.DataFrame({
    'from': ['A', 'B'],
    'to': ['C', 'C'],
    'amount': [100, 200]
})
sankey.from_dataframe(df, source="from", target="to", value="amount")

# Add individual flows
sankey.add_flow("Source", "Target", 50)
sankey.add_flow("Source", "Other", 30, label="30 units")

Styling Methods

All methods return
self
for chaining.
python
# Node colors
sankey.node_colors({
    'Income': '#2ecc71',
    'Expenses': '#e74c3c',
    'Savings': '#3498db'
})

# Use colormap for automatic colors
sankey.node_colors(colormap='Pastel1')

# Link colors (by source, target, or custom)
sankey.link_colors(by='source')  # Color by source node
sankey.link_colors(by='target')  # Color by target node
sankey.link_colors(opacity=0.6)  # Set link opacity

# Custom link colors
sankey.link_colors(colors={
    ('Income', 'Expenses'): '#ff6b6b',
    ('Income', 'Savings'): '#4ecdc4'
})

# Title
sankey.title("My Flow Diagram")
sankey.title("Budget Overview", font_size=20)

# Layout
sankey.layout(orientation='h')  # Horizontal (default)
sankey.layout(orientation='v')  # Vertical
sankey.layout(pad=20)  # Node padding

Generation and Export

python
# Generate the diagram
sankey.generate()

# Save as interactive HTML
sankey.save_html("diagram.html")
sankey.save_html("diagram.html", auto_open=True)  # Open in browser

# Save as static image
sankey.save_image("diagram.png")
sankey.save_image("diagram.svg", format='svg')
sankey.save_image("diagram.png", width=1200, height=800)

# Get Plotly figure object for customization
fig = sankey.get_figure()
fig.update_layout(...)

# Show in notebook/browser
sankey.show()

Data Format

CSV Format

csv
source,target,value,label
Income,Housing,1500,Rent
Income,Food,600,Groceries
Income,Transport,400,Car
Income,Savings,500,Emergency Fund
Savings,Investments,300,Stocks

Dictionary Format

python
data = {
    'source': ['Income', 'Income', 'Income', 'Expenses'],
    'target': ['Rent', 'Food', 'Savings', 'Tax'],
    'value': [1500, 600, 500, 400],
    'label': ['Housing', 'Groceries', 'Emergency', 'Federal']  # Optional
}

DataFrame Format

python
import pandas as pd
df = pd.DataFrame({
    'from_node': ['A', 'A', 'B', 'C'],
    'to_node': ['B', 'C', 'D', 'D'],
    'flow_value': [100, 200, 150, 180]
})

Color Schemes

Available Colormaps

NameDescription
Pastel1
Soft pastel colors
Pastel2
Muted pastels
Set1
Bold distinct colors
Set2
Muted distinct colors
Set3
Light distinct colors
Paired
Paired color scheme
viridis
Blue-green-yellow
plasma
Purple-orange-yellow

Custom Colors

python
# Hex colors
sankey.node_colors({
    'Category A': '#1abc9c',
    'Category B': '#3498db',
    'Category C': '#9b59b6'
})

# Named colors
sankey.node_colors({
    'Revenue': 'green',
    'Costs': 'red',
    'Profit': 'blue'
})

CLI Usage

bash
# Basic usage
python sankey_creator.py --input flows.csv \
    --source from --target to --value amount \
    --output flow.html

# With title and colors
python sankey_creator.py --input budget.csv \
    --source category --target subcategory --value amount \
    --title "Budget Breakdown" \
    --colormap Pastel1 \
    --output budget.html

# PNG output
python sankey_creator.py --input data.csv \
    --source src --target dst --value val \
    --output diagram.png \
    --width 1200 --height 800

CLI Arguments

ArgumentDescriptionDefault
--input
Input CSV fileRequired
--source
Source column nameRequired
--target
Target column nameRequired
--value
Value column nameRequired
--output
Output file path
sankey.html
--title
Diagram titleNone
--colormap
Color scheme
Pastel1
--link-opacity
Link opacity (0-1)0.5
--orientation
'h' or 'v'
h
--width
Image width (PNG/SVG)1000
--height
Image height (PNG/SVG)600

Examples

Budget Flow

python
sankey = SankeyCreator()
sankey.from_dict({
    'source': ['Salary', 'Salary', 'Salary', 'Salary', 'Savings', 'Investments'],
    'target': ['Housing', 'Food', 'Transport', 'Savings', 'Investments', 'Stocks'],
    'value': [2000, 800, 500, 1000, 700, 700]
})
sankey.title("Monthly Budget Flow")
sankey.node_colors({
    'Salary': '#27ae60',
    'Housing': '#e74c3c',
    'Food': '#f39c12',
    'Transport': '#3498db',
    'Savings': '#9b59b6',
    'Investments': '#1abc9c',
    'Stocks': '#34495e'
})
sankey.generate().save_html("budget_flow.html")

Energy Flow

python
sankey = SankeyCreator()
sankey.add_flow("Coal", "Electricity", 100)
sankey.add_flow("Gas", "Electricity", 80)
sankey.add_flow("Solar", "Electricity", 30)
sankey.add_flow("Electricity", "Industry", 120)
sankey.add_flow("Electricity", "Residential", 60)
sankey.add_flow("Electricity", "Commercial", 30)

sankey.title("Energy Flow")
sankey.node_colors(colormap='Set2')
sankey.link_colors(by='source', opacity=0.4)
sankey.generate().save_html("energy.html")

User Journey

python
sankey = SankeyCreator()
sankey.from_dict({
    'source': ['Landing', 'Landing', 'Product', 'Product', 'Cart', 'Cart'],
    'target': ['Product', 'Exit', 'Cart', 'Exit', 'Checkout', 'Exit'],
    'value': [1000, 200, 600, 200, 400, 100]
})
sankey.title("User Journey")
sankey.node_colors({
    'Landing': '#3498db',
    'Product': '#2ecc71',
    'Cart': '#f1c40f',
    'Checkout': '#27ae60',
    'Exit': '#e74c3c'
})
sankey.generate().save_html("journey.html")

Multi-Level Flow

python
# Three-level hierarchy
data = {
    'source': [
        # Level 1 -> Level 2
        'Revenue', 'Revenue', 'Revenue',
        # Level 2 -> Level 3
        'Products', 'Products', 'Services', 'Services'
    ],
    'target': [
        'Products', 'Services', 'Other',
        'Electronics', 'Software', 'Consulting', 'Support'
    ],
    'value': [500, 300, 50, 300, 200, 200, 100]
}

sankey = SankeyCreator()
sankey.from_dict(data)
sankey.title("Revenue Breakdown")
sankey.generate().save_html("revenue.html")

Dependencies

plotly>=5.15.0
pandas>=2.0.0
kaleido>=0.2.0

Limitations

  • Static image export (PNG/SVG) requires
    kaleido
    package
  • Very complex diagrams with many nodes may be hard to read
  • Node positions are auto-calculated (limited manual control)
  • Link colors can only be uniform or by source/target