SUTRA: Structured-Unstructured-Text-Retrieval-Architecture | Natural Language to SQL with MySQL/PostgreSQL export
🚀 QuerySUTRA v0.1.3
Structured-Unstructured-Text-Retrieval-Architecture
Natural Language to SQL with Cloud Export | PDF, DOCX, TXT Support
A comprehensive Python library that converts natural language questions into SQL queries, with support for multiple file formats and cloud database export.
✨ Key Features
✅ Natural Language to SQL - Ask questions in plain English
✅ Multiple Formats - CSV, Excel, JSON, SQL, PDF, DOCX, TXT, DataFrame
✅ Cloud Export - MySQL, PostgreSQL (local & cloud)
✅ Direct SQL - Run SQL queries directly with no API cost
✅ Auto Visualization - Plotly/Matplotlib charts
✅ Interactive Mode - Prompts the user to choose whether to visualize
✅ Complete Backup - Export to SQLite, JSON, Excel
✅ Jupyter Ready - Perfect for notebooks
📦 Installation
# Basic installation
pip install QuerySUTRA
# With MySQL support (quotes keep shells such as zsh from expanding the brackets)
pip install "QuerySUTRA[mysql]"
# With PostgreSQL support
pip install "QuerySUTRA[postgres]"
# With all database support
pip install "QuerySUTRA[all]"
🎯 Quick Start
from sutra import SUTRA
# Initialize with OpenAI API key
sutra = SUTRA(api_key="your-openai-key")
# Upload any format
sutra.upload("data.csv") # CSV
sutra.upload("report.pdf") # PDF ✨
sutra.upload("doc.docx") # Word ✨
sutra.upload("data.xlsx") # Excel
sutra.upload(dataframe) # DataFrame
# Query with natural language
result = sutra.ask("What are the top 5 products?", viz=True)
print(result.data)
# Export to cloud
sutra.save_to_mysql("localhost", "root", "pass", "mydb")
sutra.save_to_postgres("host", "user", "pass", "db")
# Complete backup
sutra.backup()
📄 Supported File Formats
| Format | Extension | Example |
|---|---|---|
| CSV | .csv | sutra.upload("data.csv") |
| Excel | .xlsx, .xls | sutra.upload("data.xlsx") |
| JSON | .json | sutra.upload("data.json") |
| SQL | .sql | sutra.upload("schema.sql") |
| PDF | .pdf | sutra.upload("report.pdf") ✨ |
| Word | .docx | sutra.upload("document.docx") ✨ |
| Text | .txt | sutra.upload("data.txt") ✨ |
| DataFrame | pd.DataFrame | sutra.upload(df, name="sales") |
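If you want to check a path against the table above before calling upload, a small helper along these lines could work. This is only a sketch: the `SUPPORTED_EXTENSIONS` mapping and `detect_format` function are hypothetical names, not part of QuerySUTRA, and the library may do its own detection differently.

```python
from pathlib import Path

# Extensions copied from the table above; the mapping itself is illustrative
# and is not something QuerySUTRA exposes.
SUPPORTED_EXTENSIONS = {
    ".csv": "CSV",
    ".xlsx": "Excel",
    ".xls": "Excel",
    ".json": "JSON",
    ".sql": "SQL",
    ".pdf": "PDF",
    ".docx": "Word",
    ".txt": "Text",
}


def detect_format(path):
    """Return the format label for a path, or None if the extension is not listed."""
    return SUPPORTED_EXTENSIONS.get(Path(path).suffix.lower())
```

Calling `detect_format("report.PDF")` gives the label for PDF because the suffix is lowercased first; an unlisted extension gives `None`, which you can use to skip the file.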
🔥 New in v0.1.3
1. PDF Support
# Upload PDF files
sutra.upload("annual_report.pdf")
# Query the content
result = sutra.ask("What are the key findings in this report?")
print(result.data)
2. Word Document Support
# Upload DOCX files with tables
sutra.upload("sales_report.docx")
# Query the data
result = sutra.ask("Show me sales by region", viz=True)
3. Cloud Database Export
MySQL (Local or Cloud)
# Local MySQL
sutra.save_to_mysql("localhost", "root", "password", "mydb")
# AWS RDS MySQL
sutra.save_to_mysql(
host="mydb.xxxx.us-east-1.rds.amazonaws.com",
user="admin",
password="cloudpass",
database="production"
)
# Google Cloud SQL
sutra.save_to_mysql(
host="203.0.113.10",  # placeholder address; use your instance's public IP
user="admin",
password="pass",
database="mydb"
)
PostgreSQL (Local or Cloud)
# Local PostgreSQL
sutra.save_to_postgres("localhost", "postgres", "password", "mydb")
# Heroku PostgreSQL
sutra.save_to_postgres(
host="ec2-xxx.compute-1.amazonaws.com",
user="user",
password="pass",
database="dbname"
)
# AWS RDS PostgreSQL
sutra.save_to_postgres(
host="mydb.xxxx.us-west-2.rds.amazonaws.com",
user="admin",
password="pass",
database="prod"
)
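The examples above hard-code passwords for brevity. One way to avoid that is to read the connection settings from environment variables and pass them as keyword arguments. The sketch below assumes hypothetical variable names such as `SUTRA_MYSQL_HOST`; QuerySUTRA does not read these on its own, and `db_settings_from_env` is not part of the library.

```python
import os


def db_settings_from_env(prefix):
    """Collect host/user/password/database from environment variables.

    Variable names are built as PREFIX_HOST, PREFIX_USER, and so on. They are
    a convention made up for this sketch, not something QuerySUTRA defines.
    """
    keys = ("host", "user", "password", "database")
    settings = {}
    missing = []
    for key in keys:
        name = f"{prefix}_{key.upper()}"
        value = os.environ.get(name)
        if value is None:
            missing.append(name)
        else:
            settings[key] = value
    if missing:
        # Fail early with every missing name listed, not just the first one.
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return settings
```

The returned dictionary uses the same keyword names as the calls shown above, so it can be unpacked directly, for example `sutra.save_to_mysql(**db_settings_from_env("SUTRA_MYSQL"))`.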
4. Complete Export & Backup
# Export entire database
sutra.export_db("backup.db", format="sqlite")
sutra.export_db("dump.sql", format="sql")
sutra.export_db("data.json", format="json")
sutra.export_db("data.xlsx", format="excel")
# Export schema only
sutra.save_schema("schema.sql", format="sql")
sutra.save_schema("schema.json", format="json")
sutra.save_schema("schema.md", format="markdown")
# Complete backup (creates 3 files)
sutra.backup() # Creates .db, .sql, .json files with timestamp
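To check that a SQL dump can actually be restored, you can replay it into a fresh database. The sketch below uses only the standard library's `sqlite3` module to produce and replay a dump; it illustrates the general idea of a SQL-format export and is not QuerySUTRA's implementation. The table and row are made-up sample data.

```python
import sqlite3


def dump_sql(conn):
    """Return the whole database as a SQL script (schema plus INSERT statements)."""
    return "\n".join(conn.iterdump())


# Build a tiny in-memory database with made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_data (region TEXT, sales REAL)")
conn.execute("INSERT INTO sales_data VALUES ('North', 120.0)")
conn.commit()

script = dump_sql(conn)

# Replaying the script into a fresh database reproduces the table and its rows.
restored = sqlite3.connect(":memory:")
restored.executescript(script)
rows = restored.execute("SELECT region, sales FROM sales_data").fetchall()
```

The same replay step could be pointed at a dump file written by an export, assuming that file is plain SQL that SQLite accepts.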
📖 Complete Examples
Example 1: PDF Analysis
from sutra import SUTRA
sutra = SUTRA(api_key="your-openai-key")
# Upload PDF
sutra.upload("financial_report.pdf")
# View extracted data
sutra.peek(n=10)
# Query the content
result = sutra.ask("What are the total revenues?")
print(result.data)
# Visualize
result = sutra.ask("Show revenue by quarter", viz=True)
Example 2: Multi-Format Analysis
sutra = SUTRA(api_key="your-key")
# Upload multiple formats
sutra.upload("sales.csv")
sutra.upload("report.docx")
sutra.upload("data.xlsx")
# List all tables
print(sutra.tables())
# Query across data
result = sutra.ask("What are total sales?")
print(result.data)
Example 3: Cloud Deployment
# Analyze in Colab/Jupyter
sutra = SUTRA(api_key="your-key")
sutra.upload("local_analysis.csv")
# Query and analyze
result = sutra.ask("Show top performers", viz=True)
# Deploy to production MySQL
sutra.save_to_mysql(
host="production.mysql.com",
user="admin",
password="prod_password",
database="analytics_db"
)
# Backup everything
sutra.backup("/backups")
Example 4: Direct SQL (No API Cost)
# Execute SQL directly - FREE!
result = sutra.sql("""
SELECT region,
SUM(sales) as total_sales,
AVG(sales) as avg_sales
FROM sales_data
GROUP BY region
ORDER BY total_sales DESC
""")
print(result.data)
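To see what the aggregation above returns without a SUTRA instance, the same query can be run against an in-memory SQLite database from the standard library. The `sales_data` rows here are invented for illustration, and this sketch assumes nothing about how `sutra.sql` executes queries internally.

```python
import sqlite3

# In-memory database with a few made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_data (region TEXT, sales REAL)")
conn.executemany(
    "INSERT INTO sales_data VALUES (?, ?)",
    [("North", 100.0), ("North", 300.0), ("South", 150.0)],
)

# Same query as in the example: totals and averages per region, largest first.
rows = conn.execute(
    """
    SELECT region,
           SUM(sales) AS total_sales,
           AVG(sales) AS avg_sales
    FROM sales_data
    GROUP BY region
    ORDER BY total_sales DESC
    """
).fetchall()
```

Each tuple in `rows` holds the region, its total sales, and its average sales, ordered from the highest total down.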
Example 5: Interactive Mode
# Ask user for visualization preference
result = sutra.interactive("What are sales trends?")
# Prompts: "Do you want visualization? (yes/no):"
if result.success:
    print(result.data)
🛠️ API Reference
Initialization
sutra = SUTRA(api_key="your-openai-key", db="sutra.db")
Upload Data
sutra.upload(data, name="table_name")
# data = file path (str) or DataFrame
View Database
sutra.tables() # List all tables
sutra.schema() # Show database schema
sutra.peek(n=10) # Preview data
Query Data
# Direct SQL (no API cost)
result = sutra.sql("SELECT * FROM table", viz=False)
# Natural language (uses API)
result = sutra.ask("question", viz=False)
# Interactive (prompts user)
result = sutra.interactive("question")
Export & Backup
# Export results
sutra.export(dataframe, "output.csv", format="csv")
# Export database
sutra.export_db("backup.db", format="sqlite")
# Save to cloud
sutra.save_to_mysql(host, user, password, database)
sutra.save_to_postgres(host, user, password, database)
# Complete backup
sutra.backup("/backup/path")
QueryResult Object
result.success # bool - query succeeded
result.sql # str - generated SQL
result.data # DataFrame - query results
result.viz # figure - visualization (if viz=True)
result.error # str - error message (if failed)
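A common pattern with these fields is to branch on `success` before touching `data`. The sketch below uses a stand-in dataclass, `FakeResult`, that only mirrors the field names listed above so the example runs without the library or an API key; it is not the library's own class, and `describe` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class FakeResult:
    """Stand-in with the documented field names; not QuerySUTRA's class."""

    success: bool
    sql: Optional[str] = None
    data: Any = None
    viz: Any = None
    error: Optional[str] = None


def describe(result):
    """Summarize a result: the generated SQL on success, the error otherwise."""
    if result.success:
        return f"ok: {result.sql}"
    return f"failed: {result.error}"
```

Because `describe` only reads `success`, `sql`, and `error`, it should work the same way with a real result object that exposes those attributes.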
💡 Use Cases
Data Analysis
sutra.upload("sales_data.csv")
result = sutra.ask("What products have declining sales?", viz=True)
Document Processing
sutra.upload("contract.pdf")
result = sutra.ask("What are the key terms and dates?")
Multi-Source Integration
sutra.upload("sales.csv")
sutra.upload("inventory.xlsx")
sutra.upload("report.docx")
result = sutra.ask("Combine all data sources")
Cloud Migration
# Local analysis
sutra.upload("data.csv")
result = sutra.ask("Analyze trends")
# Deploy to cloud
sutra.save_to_postgres("cloud-db.com", "user", "pass", "prod")
🎨 Features Comparison
| Feature | Available | Cost |
|---|---|---|
| CSV/Excel/JSON Upload | ✅ | Free |
| PDF Upload | ✅ | Free |
| DOCX Upload | ✅ | Free |
| Direct SQL Queries | ✅ | Free |
| Natural Language Queries | ✅ | ~$0.001/query (varies with OpenAI pricing and query size) |
| Visualization | ✅ | Free |
| MySQL Export | ✅ | Free |
| PostgreSQL Export | ✅ | Free |
| Backup & Export | ✅ | Free |
💰 Cost Optimization
# FREE - Direct SQL (no API calls)
result = sutra.sql("SELECT * FROM data WHERE sales > 1000")
# PAID - Natural language (uses OpenAI API)
result = sutra.ask("Show products with sales over 1000")
# Tip: Use direct SQL when you know the query!
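One way to apply this tip to repeated questions is to keep the SQL generated the first time and reuse it afterwards. The sketch below is an assumption-laden illustration: `SqlCache` is a hypothetical helper, and the `generate_sql` callable is a hook you would supply yourself, for example `lambda q: sutra.ask(q).sql`.

```python
class SqlCache:
    """Remember the SQL generated for each question so it is generated only once."""

    def __init__(self, generate_sql):
        # generate_sql: callable(question) -> SQL string. This is a hypothetical
        # hook supplied by the caller, e.g. lambda q: sutra.ask(q).sql
        self._generate_sql = generate_sql
        self._cache = {}

    def get(self, question):
        # Normalize case and whitespace so trivially different phrasings share a key.
        key = " ".join(question.lower().split())
        if key not in self._cache:
            self._cache[key] = self._generate_sql(question)
        return self._cache[key]
```

The cached string can then be passed to `sutra.sql(...)`, which the section above describes as making no API calls. The cache lives in memory only, so it is reset when the process exits.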
🧪 Testing
# Install
pip install QuerySUTRA
# Test
python -c "from sutra import SUTRA; print('✅ Success!')"
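Beyond the import check, it can help to confirm which version is installed. The sketch below uses the standard library's `importlib.metadata` (Python 3.8+); `installed_version` is a hypothetical helper name, and the distribution name `QuerySUTRA` is taken from the install command above.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None
```

Running `print(installed_version("QuerySUTRA"))` prints the version string when the package is installed and `None` otherwise.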
📚 Documentation
- Full Guide: See SUTRA_Complete_Guide.ipynb
- Publishing: See PUBLISHING_GUIDE.md
- Examples: See complete_example.py
🤝 Contributing
Contributions welcome! The main code is in sutra/sutra.py - a single, well-documented file.
📄 License
MIT License - Free to use in your projects!
🏆 Why QuerySUTRA?
- SUTRA = Structured-Unstructured-Text-Retrieval-Architecture
- Single-file design for simplicity
- Production-ready with error handling
- Cloud-native with export capabilities
- Comprehensive format support (PDF, DOCX, CSV, Excel, JSON)
- Cost-effective with free SQL mode
🌟 Credits
Author: Aditya Batta
Version: 0.1.3
License: MIT
📞 Support
- Issues: GitHub Issues
- PyPI: https://pypi.org/project/QuerySUTRA/
Made with ❤️ for data analysts and developers
Start analyzing with natural language today! 🚀