Automatic Storytelling from Data - Turn raw data into compelling business narratives

📊 DataStory - Automatic Storytelling from Data

Turn raw data into compelling business narratives automatically.

DataStory analyzes your datasets and generates full written reports with insights, trends, and recommendations. It needs no LLMs: the analysis is pure Python.

🚀 The Problem

Dashboards don't explain insights - They show graphs, not stories
People want narratives - Business stakeholders need context, not just charts
Manual analysis takes time - Writing reports is tedious and repetitive
Insights get lost - Important patterns are buried in spreadsheets

💡 The Solution

DataStory automatically:

✅ Analyzes your data for trends, patterns, and anomalies
✅ Generates natural language business narratives
✅ Identifies risks and opportunities
✅ Provides actionable recommendations
✅ Exports to text, markdown, HTML, or PDF

All with a single line of code!

📦 Installation

pip install datastory-ai

The distribution is published on PyPI as datastory-ai; the import name is datastory.

For full features (charts, Excel, PDF):

pip install "datastory-ai[full]"

🎯 Quick Start

One-Line Magic

from datastory import narrate

report = narrate("sales.csv")
print(report)

Output:

📊 EXECUTIVE SUMMARY
==================================================
Analyzed 1,247 records across 8 dimensions.

🟡 3 high-priority insights identified.

Key Highlights:
1. Sales increased by 12.2% from $450,000 to $505,000.
2. Customer churn rose in April by 8.5%, requiring attention.
3. West Africa region dominates sales, accounting for 45.2% of revenue.

📈 KEY FINDINGS
==================================================

**Performance Trends:**
• Sales Shows Strong Growth: Sales increased by 12.2% from $450,000 to $505,000.
• Revenue per Customer Rising: Average order value grew by 15.7%.

**Notable Anomalies:**
• Unusual Values Detected in Order Quantity: Found 23 outliers (1.8% of data).

**Relationships Discovered:**
• Strong Positive Link: Marketing Spend and Revenue move together (correlation: 0.85).

🔍 DETAILED ANALYSIS
==================================================

**High-Priority Insights:**

🟡 Customer Churn Rising
   Customer churn increased by 8.5% in April. This represents a significant concern.

🟡 Low Stock Risk: Product X
   Minimum inventory is 12 units, significantly below average of 150. Consider restocking.

💡 RECOMMENDATIONS
==================================================
1. Investigate the decline in customer retention and implement recovery strategies
2. Capitalize on the growth in revenue per customer to maximize returns
3. Replenish product_x inventory to avoid stockouts
4. Review outliers in order quantity to identify root causes
5. Leverage identified relationships between metrics for predictive insights

==================================================
Report generated on December 03, 2025 at 1:20 PM
Powered by DataStory - Automatic Storytelling from Data

🔥 Key Features

1. Pure Python Intelligence

No LLMs or AI APIs required
Works offline
Fast and deterministic
Zero-cost analysis

2. Comprehensive Analysis

Statistical summaries
Trend detection
Anomaly identification
Correlation discovery
Time series patterns
Risk assessment
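
As an illustration of what anomaly identification can look like, the sketch below flags outliers with the common 1.5 x IQR rule (Tukey's rule), using only the standard library. This is an assumption for illustration: the source does not say which method DataStory applies, and `find_outliers` is a hypothetical helper, not part of the package.

```python
from statistics import quantiles


def find_outliers(values, k=1.5):
    """Return the values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # quantiles(..., n=4) returns the three quartile cut points Q1, Q2, Q3.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical order quantities with one obviously unusual value.
order_quantities = [10, 12, 11, 13, 12, 11, 10, 95]
print(find_outliers(order_quantities))  # [95]
```

A larger `k` makes the rule more tolerant; 1.5 is only the conventional default.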

3. Natural Language Output

Business-friendly narratives
Context-aware descriptions
Action-oriented recommendations
Multiple detail levels

4. Flexible Export

from datastory import DataStory

story = DataStory()
story.load("data.csv")

# Export to different formats
story.export("report.txt", format="text")
story.export("report.md", format="markdown")
story.export("report.html", format="html", include_charts=True)
story.export("report.pdf", format="pdf")

5. Multiple Data Sources

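from datastory import DataStory

story = DataStory()

# Local files: CSV, Excel, JSON, Parquet
story.load("sales.csv")
story.load("data.xlsx")
story.load("records.json")
story.load("dataset.parquet")

# URLs
story.load("https://example.com/data.csv")

# Pandas DataFrames (sqlite3 and sales.db are only an example connection)
import sqlite3
import pandas as pd

conn = sqlite3.connect("sales.db")
df = pd.read_sql("SELECT * FROM sales", conn)
story.load(df)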
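
Since `load` accepts both local paths and URLs in several formats, a loader has to decide how to read each source. One plausible way to do that is to dispatch on the file extension, sketched below. `detect_format` and the `FORMATS` table are hypothetical names for illustration; the source does not describe how DataStory actually picks a reader.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extension-to-format table covering the formats the README lists.
FORMATS = {
    ".csv": "csv",
    ".xlsx": "excel",
    ".json": "json",
    ".parquet": "parquet",
}


def detect_format(source):
    """Guess the data format of a local path or URL from its extension."""
    # For URLs, look only at the path so query strings do not hide the suffix.
    path = urlparse(source).path if "://" in source else source
    suffix = PurePosixPath(path).suffix.lower()
    try:
        return FORMATS[suffix]
    except KeyError:
        raise ValueError(f"Unsupported data source: {source!r}") from None


print(detect_format("sales.csv"))
print(detect_format("https://example.com/data.csv?token=abc"))
```

Failing early with a `ValueError` gives a clearer message than letting a parser choke on an unexpected file type.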

📖 Advanced Usage

Customization

from datastory import DataStory

# Configure narrative style
config = {
    "style": "business",  # business, casual, technical
    "detail_level": "detailed",  # brief, medium, detailed
    "include_recommendations": True
}

story = DataStory(config=config)
story.load("sales.csv")
narrative = story.generate_narrative()
print(narrative)
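
Whether DataStory validates these options itself is not stated here, so it can be worth checking a config dictionary before passing it in. The sketch below merges overrides onto defaults and rejects unknown keys or values. The allowed values come from the comments in the example above; the default values and the `build_config` helper are assumptions made for this sketch.

```python
# Allowed values, taken from the comments in the config example.
ALLOWED = {
    "style": {"business", "casual", "technical"},
    "detail_level": {"brief", "medium", "detailed"},
}

# Assumed defaults for this sketch; not documented defaults of the library.
DEFAULTS = {
    "style": "business",
    "detail_level": "medium",
    "include_recommendations": True,
}


def build_config(overrides=None):
    """Merge overrides onto DEFAULTS and reject unknown keys or values."""
    config = {**DEFAULTS, **(overrides or {})}
    unknown = set(config) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"Unknown config keys: {sorted(unknown)}")
    for key, allowed in ALLOWED.items():
        if config[key] not in allowed:
            raise ValueError(
                f"{key} must be one of {sorted(allowed)}, got {config[key]!r}"
            )
    return config


print(build_config({"detail_level": "detailed"}))
```

The returned dictionary can then be passed as `DataStory(config=...)`, and a misspelled key such as `detial_level` is caught before any analysis runs.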

Programmatic Access

from datastory import DataStory

# Access insights directly
story = DataStory()
story.load("data.csv")

# Each insight exposes its type, title, priority, and description
insights = story.extract_insights()
for insight in insights:
    print(f"{insight.type}: {insight.title}")
    print(f"Priority: {insight.priority}")
    print(f"Description: {insight.description}\n")
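
Once insights are available as objects, ordinary Python is enough to rank or filter them. The sketch below sorts by urgency using a stand-in `Insight` dataclass that carries the four attributes used in the loop above. The priority labels `"high"`, `"medium"`, and `"low"` are an assumption; the source only shows that high-priority insights exist.

```python
from dataclasses import dataclass


@dataclass
class Insight:
    """Stand-in with the same attributes as the insights printed above."""
    type: str
    title: str
    priority: str
    description: str


# Assumed priority labels, most urgent first.
PRIORITY_ORDER = {"high": 0, "medium": 1, "low": 2}


def by_priority(insights):
    """Sort insights from most to least urgent; unknown labels go last."""
    return sorted(
        insights,
        key=lambda i: PRIORITY_ORDER.get(i.priority, len(PRIORITY_ORDER)),
    )


sample = [
    Insight("anomaly", "Unusual Values Detected", "medium", "Found outliers."),
    Insight("trend", "Customer Churn Rising", "high", "Churn increased."),
    Insight("correlation", "Strong Positive Link", "low", "Metrics move together."),
]

for insight in by_priority(sample):
    print(f"[{insight.priority}] {insight.title}")
```

Because `sorted` is stable, insights that share a priority keep their original relative order.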

Analysis Results

from datastory import DataStory

# Get raw analysis results
story = DataStory()
story.load("data.csv")

# The results are keyed by analysis type
results = story.analyze()
print(results["trends"])
print(results["anomalies"])
print(results["correlations"])
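
The sample report quotes a correlation of 0.85 between Marketing Spend and Revenue. Assuming that figure is a Pearson coefficient (the source does not name the method, nor the exact shape of `results["correlations"]`), the calculation behind such a number can be sketched as follows. `pearson` is a hypothetical helper and the data is made up.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Made-up figures: revenue here is an exact linear function of spend.
marketing_spend = [10, 20, 30, 40, 50]
revenue = [25, 45, 65, 85, 105]
print(round(pearson(marketing_spend, revenue), 2))  # 1.0
```

Values near 1 or -1 indicate a strong linear relationship; they say nothing about which metric drives the other.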

🎓 Use Cases

1. Business Intelligence

Generate executive summaries from sales, marketing, or financial data.

2. Data Science Reports

Automatically document exploratory data analysis (EDA) findings.

3. Automated Monitoring

Create daily/weekly reports on KPIs and metrics.
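
For recurring reports, one simple pattern is to write each run to a dated file and let cron or another scheduler call the script. The sketch below shows the file-naming part. `write_dated_report` is a hypothetical helper; it takes the report function as a parameter so that `narrate` can be passed in real use, while a stub keeps the example runnable without the package installed.

```python
import tempfile
from datetime import date
from pathlib import Path


def write_dated_report(make_report, data_path, out_dir, today=None):
    """Generate a report and save it as <out_dir>/<stem>-YYYY-MM-DD.txt."""
    today = today or date.today()
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    target = out_dir / f"{Path(data_path).stem}-{today.isoformat()}.txt"
    # UTF-8 keeps the emoji section headers in the report intact.
    target.write_text(str(make_report(data_path)), encoding="utf-8")
    return target


# Stub standing in for datastory.narrate; swap in the real function when available.
def fake_narrate(path):
    return f"Report for {path}"


with tempfile.TemporaryDirectory() as tmp:
    saved = write_dated_report(fake_narrate, "sales.csv", tmp, today=date(2025, 12, 3))
    print(saved.name)  # sales-2025-12-03.txt
```

ISO dates in the file name sort chronologically, which makes older reports easy to find or prune.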

4. Client Reporting

Transform raw analytics into client-ready narratives.

5. Academic Research

Quickly summarize dataset characteristics and patterns.

🆚 Why DataStory?

| Feature | DataStory | Traditional BI | LLM-based |
| --- | --- | --- | --- |
| Setup Time | Instant | Hours/Days | API setup |
| Cost | Free | $$$$ | $$$ per call |
| Offline Use | ✅ Yes | ❌ No | ❌ No |
| Customizable | ✅ Full control | ⚠️ Limited | ❌ Black box |
| Speed | ⚡ Instant | 🐌 Slow | ⏳ API delays |
| Privacy | 🔒 Local | ⚠️ Cloud | ❌ Sent to API |
| Deterministic | ✅ Yes | ✅ Yes | ❌ No |

📊 Example Datasets

The examples/ directory includes sample datasets:

sales.csv - Sales performance data
customer_churn.csv - Customer retention data
inventory.csv - Stock levels and products

🛠️ Technical Details

Architecture

Core Analyzer: Statistical analysis using pandas/numpy
Insight Extractor: Pattern recognition and business logic
Narrative Generator: Template-based natural language generation
Data Loaders: Multi-format support (CSV, Excel, JSON, Parquet)
Report Formatters: Export to text, markdown, HTML, PDF
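
To make "template-based natural language generation" concrete, the sketch below turns two numbers into a sentence by computing a percentage change and filling a fixed template. The templates and the names `percent_change` and `describe_trend` are illustrative assumptions, not DataStory's actual templates or API.

```python
def percent_change(start, end):
    """Percentage change from start to end."""
    if start == 0:
        raise ValueError("percentage change is undefined when start is 0")
    return (end - start) / abs(start) * 100


# One sentence template per trend direction.
TEMPLATES = {
    "up": "{metric} increased by {pct:.1f}% from {start:,.0f} to {end:,.0f}.",
    "down": "{metric} decreased by {pct:.1f}% from {start:,.0f} to {end:,.0f}.",
    "flat": "{metric} was unchanged at {end:,.0f}.",
}


def describe_trend(metric, start, end):
    """Pick a template by direction of change and fill in the numbers."""
    pct = percent_change(start, end)
    key = "up" if pct > 0 else "down" if pct < 0 else "flat"
    return TEMPLATES[key].format(metric=metric, pct=abs(pct), start=start, end=end)


print(describe_trend("Sales", 200_000, 230_000))
# Sales increased by 15.0% from 200,000 to 230,000.
```

Because the same inputs always select the same template, output of this kind is deterministic, which is the property the README contrasts with LLM-based tools.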

Dependencies

Core: pandas, numpy
Optional: matplotlib (charts), openpyxl (Excel), reportlab (PDF)
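
Because charts, Excel, and PDF support depend on optional packages, it can help to check what is installed before requesting those features. The sketch below uses the standard library's `importlib.util.find_spec`; the `available_features` helper is hypothetical and independent of DataStory itself.

```python
from importlib.util import find_spec

# Optional packages and the feature each one enables, as listed above.
OPTIONAL = {"matplotlib": "charts", "openpyxl": "Excel", "reportlab": "PDF"}


def available_features():
    """Map each optional feature to True if its package can be imported."""
    return {
        feature: find_spec(package) is not None
        for package, feature in OPTIONAL.items()
    }


for feature, ok in available_features().items():
    print(f"{feature}: {'available' if ok else 'missing'}")
```

`find_spec` only locates the package without importing it, so the check is cheap even for heavy libraries.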

Performance

Analyzes 100K rows in <2 seconds
Generates narrative in <1 second
Low memory footprint
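
Timings like these depend on hardware and on the shape of the data, so it is worth measuring on your own datasets. A minimal harness, assuming nothing about DataStory's internals, might look like this; `best_time` is a hypothetical helper and the workload shown is a stand-in.

```python
from time import perf_counter


def best_time(func, *args, repeats=3):
    """Return the fastest wall-clock time in seconds over several runs."""
    timings = []
    for _ in range(repeats):
        start = perf_counter()
        func(*args)
        timings.append(perf_counter() - start)
    return min(timings)


# Stand-in workload; in real use, time something like
# best_time(narrate, "your_data.csv") instead.
elapsed = best_time(sum, range(100_000))
print(f"best of 3: {elapsed:.4f} s")
```

Taking the minimum over a few runs reduces noise from other processes on the machine.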

🤝 Contributing

Contributions are welcome! Please:

Fork the repository
Create a feature branch
Submit a pull request

📝 License

MIT License - see LICENSE file for details.

🙏 Acknowledgments

Built with ❤️ by Idriss Bado

Inspired by the need for better data communication in business.

📧 Contact

GitHub: @idrissbado
PyPI: datastory-ai

⭐ Star this repo if you find it useful!

🐛 Found a bug? Open an issue

💡 Have an idea? Start a discussion

Release history

This version

0.1.0

Dec 3, 2025

Download files

Source Distribution

datastory_ai-0.1.0.tar.gz (31.2 kB)

Uploaded Dec 3, 2025 Source

Built Distribution

datastory_ai-0.1.0-py3-none-any.whl (26.3 kB)

Uploaded Dec 3, 2025 Python 3

File details

Details for the file datastory_ai-0.1.0.tar.gz.

File metadata

Download URL: datastory_ai-0.1.0.tar.gz
Upload date: Dec 3, 2025
Size: 31.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.0

File hashes

Hashes for datastory_ai-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`77f35f3f3afefb1d7e4fd121f3534e075e9f052c5d1dc28831a9f769e069f120`
MD5	`954cd7765396684e009e331137953da0`
BLAKE2b-256	`7f17b2b044b3a1c046d9b62e2695335662e7941705a995687eef1db788a7a6ac`

File details

Details for the file datastory_ai-0.1.0-py3-none-any.whl.

File metadata

Download URL: datastory_ai-0.1.0-py3-none-any.whl
Upload date: Dec 3, 2025
Size: 26.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.0

File hashes

Hashes for datastory_ai-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`fed442561797891b198ce85c2719188d8afeaa8f3f9abf3def6982e012066cfa`
MD5	`830424b136e189fce3a4adf22d26b1af`
BLAKE2b-256	`2ad52e1282ce47791ddb0e3308d7963a982d8ecb6aa7a5d8b27dcd478e6ac619`

datastory-ai 0.1.0
