Transform CSV/Excel data into narrative reports with auto-generated insights, visualizations, and PDF export. Auto-detects patterns and creates plain-English summaries.
npx skill4agent add dkyazzentwatwa/chatgpt-skills data-storyteller

from scripts.data_storyteller import DataStoryteller
# Initialize with your data file
storyteller = DataStoryteller("your_data.csv")
# Or from a pandas DataFrame
import pandas as pd
df = pd.read_csv("your_data.csv")
storyteller = DataStoryteller(df)

# Generate comprehensive report
report = storyteller.generate_report()
# Access components
print(report['summary'])         # Executive summary
print(report['insights'])        # Key findings
print(report['statistics'])      # Statistical analysis
print(report['visualizations'])  # Generated chart info

# Export to PDF
storyteller.export_pdf("analysis_report.pdf")
# Export to HTML (interactive charts)
storyteller.export_html("analysis_report.html")
# Export charts only
storyteller.export_charts("charts/", format="png")

from scripts.data_storyteller import DataStoryteller
# One-liner full analysis
DataStoryteller("sales_data.csv").generate_report().export_pdf("report.pdf")

storyteller = DataStoryteller("data.csv")
# Focus on specific columns
storyteller.analyze_columns(['revenue', 'customers', 'date'])
# Set analysis parameters
report = storyteller.generate_report(
    include_correlations=True,
    include_outliers=True,
    include_trends=True,
    time_column='date',
    chart_style='business'
)

| Data Type | Charts Generated |
|---|---|
| Numeric | Histogram, box plot, trend line |
| Categorical | Bar chart, pie chart, frequency table |
| Time Series | Line chart, decomposition, forecast |
| Correlations | Heatmap, scatter matrix |
| Comparisons | Grouped bar, stacked area |
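
The table above implies a mapping from column type to chart type. The following is a minimal sketch of how such a mapping could be driven by pandas dtypes. `CHARTS_BY_KIND`, `classify_column`, and `suggest_charts` are hypothetical names used for illustration; they are not part of the skill's documented API, and the skill's actual detection logic may differ.

```python
import pandas as pd

# Hypothetical lookup copied from the table above (illustration only).
CHARTS_BY_KIND = {
    "numeric": ["histogram", "box plot", "trend line"],
    "categorical": ["bar chart", "pie chart", "frequency table"],
    "time_series": ["line chart", "decomposition", "forecast"],
}


def classify_column(series: pd.Series) -> str:
    """Classify a column as 'time_series', 'numeric', or 'categorical'."""
    if pd.api.types.is_datetime64_any_dtype(series):
        return "time_series"
    # Booleans count as numeric in pandas; treat them as categories instead.
    if pd.api.types.is_numeric_dtype(series) and not pd.api.types.is_bool_dtype(series):
        return "numeric"
    return "categorical"


def suggest_charts(df: pd.DataFrame) -> dict:
    """Return {column name: list of suggested chart types}."""
    return {col: CHARTS_BY_KIND[classify_column(df[col])] for col in df.columns}


example = pd.DataFrame({
    "date": pd.to_datetime(["2024-01-01", "2024-02-01", "2024-03-01"]),
    "revenue": [1200.0, 1350.5, 1410.0],
    "region": ["north", "south", "north"],
})
suggestions = suggest_charts(example)
for column, charts in suggestions.items():
    print(column, "->", ", ".join(charts))
```

The correlation and comparison rows of the table involve pairs or groups of columns, so they are left out of this per-column sketch.
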
# Available styles
styles = ['business', 'scientific', 'minimal', 'dark', 'colorful']
storyteller.generate_report(chart_style='business')

storyteller = DataStoryteller(df)
# Configure analysis
storyteller.config.update({
    'max_categories': 20,          # Max categories to show
    'outlier_method': 'iqr',       # 'iqr', 'zscore', 'isolation'
    'correlation_threshold': 0.5,
    'significance_level': 0.05,
    'date_format': 'auto',         # Or specify like '%Y-%m-%d'
    'language': 'en',              # Narrative language
})

| Format | Extension | Notes |
|---|---|---|
| CSV | .csv | Auto-detect delimiter |
| Excel | .xlsx, .xls | Multi-sheet support |
| JSON | .json | Records or columnar |
| Parquet | .parquet | For large datasets |
| TSV | .tsv | Tab-separated |
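
For readers who want to pre-load a file and pass a DataFrame to `DataStoryteller` themselves, the formats above can be read with plain pandas. This is a sketch only: `load_table` is a hypothetical helper, not the skill's own loader, and reading Excel or Parquet files needs an engine such as openpyxl or pyarrow installed.

```python
from pathlib import Path

import pandas as pd


def load_table(path: str) -> pd.DataFrame:
    """Load a tabular file into a DataFrame based on its extension (hypothetical helper)."""
    suffix = Path(path).suffix.lower()
    if suffix == ".csv":
        # sep=None with the python engine asks pandas to sniff the delimiter.
        return pd.read_csv(path, sep=None, engine="python")
    if suffix == ".tsv":
        return pd.read_csv(path, sep="\t")
    if suffix in {".xlsx", ".xls"}:
        # sheet_name=0 reads the first sheet; sheet_name=None returns a dict of all sheets.
        return pd.read_excel(path, sheet_name=0)
    if suffix == ".json":
        return pd.read_json(path)
    if suffix == ".parquet":
        return pd.read_parquet(path)
    raise ValueError(f"Unsupported file type: {suffix!r}")
```

A call such as `load_table("your_data.csv")` returns a DataFrame that can then be handed to `DataStoryteller(df)` as shown earlier.
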
"This dataset contains 10,847 records across 15 columns, covering sales transactions from January 2023 to December 2024. Revenue shows a strong upward trend (+23% YoY) with clear seasonal peaks in Q4. The top 3 product categories account for 67% of total revenue. Notable finding: Customer acquisition cost has increased 15% while retention rate dropped 8%, suggesting potential profitability concerns worth investigating."
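
The opening sentence of a summary like the one above can be assembled from basic DataFrame facts. The sketch below shows one way to do that, assuming a date-like column is supplied by name; `describe_dataset` is a hypothetical function and does not reflect how the skill actually builds its narrative.

```python
import pandas as pd


def describe_dataset(df: pd.DataFrame, time_column=None) -> str:
    """Build a one-sentence, plain-English overview of a DataFrame (illustrative only)."""
    n_rows, n_cols = df.shape
    text = f"This dataset contains {n_rows:,} records across {n_cols} columns"
    if time_column is not None:
        # Parse the column as dates and report the covered range by month and year.
        dates = pd.to_datetime(df[time_column])
        start = dates.min().strftime("%B %Y")
        end = dates.max().strftime("%B %Y")
        text += f", covering {start} to {end}"
    return text + "."


sample = pd.DataFrame({
    "date": ["2023-01-15", "2024-12-03"],
    "revenue": [100, 200],
})
print(describe_dataset(sample, time_column="date"))
```

Month names come from `strftime`, so they follow the active locale.
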
"Strong correlation detected between marketing_spend and new_customers (r=0.78, p<0.001). However, this relationship weakens significantly after $50K monthly spend, suggesting diminishing returns beyond this threshold."
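
A correlation insight of this kind can be reproduced with `scipy.stats.pearsonr`. The sketch below reuses the `correlation_threshold` (0.5) and `significance_level` (0.05) defaults from the configuration shown earlier; the function name `correlation_insight` and the 0.7 cutoff for "Strong" are assumptions made for illustration, not values documented by the skill. It does not attempt the diminishing-returns analysis mentioned in the example.

```python
import numpy as np
import pandas as pd
from scipy.stats import pearsonr


def correlation_insight(df, x, y, threshold=0.5, alpha=0.05):
    """Return a one-line insight if x and y are notably correlated, else None."""
    pair = df[[x, y]].dropna()
    r, p = pearsonr(pair[x], pair[y])
    # Report only correlations that are both large enough and statistically significant.
    if abs(r) < threshold or p >= alpha:
        return None
    strength = "Strong" if abs(r) >= 0.7 else "Moderate"  # assumed cutoff
    direction = "positive" if r > 0 else "negative"
    return (f"{strength} {direction} correlation detected between "
            f"{x} and {y} (r={r:.2f}, p={p:.3g}).")


rng = np.random.default_rng(0)
spend = np.arange(20, dtype=float)
demo = pd.DataFrame({
    "marketing_spend": spend,
    "new_customers": 3 * spend + rng.normal(0, 5, size=20),  # synthetic data
    "noise": [1.0, -1.0] * 10,
})
print(correlation_insight(demo, "marketing_spend", "new_customers"))
print(correlation_insight(demo, "marketing_spend", "noise"))
```

The data here is synthetic and exists only to exercise the function.
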
pandas>=2.0.0
numpy>=1.24.0
matplotlib>=3.7.0
seaborn>=0.12.0
scipy>=1.10.0
reportlab>=4.0.0
openpyxl>=3.1.0