
SUTRA: Structured-Unstructured-Text-Retrieval-Architecture | Natural Language to SQL with MySQL/PostgreSQL export

🚀 QuerySUTRA v0.1.3

Structured-Unstructured-Text-Retrieval-Architecture

Natural Language to SQL with Cloud Export | PDF, DOCX, TXT Support

A comprehensive Python library that converts natural language questions into SQL queries, with support for multiple file formats and cloud database export.


✨ Key Features

Natural Language to SQL - Ask questions in plain English
Multiple Formats - CSV, Excel, JSON, SQL, PDF, DOCX, TXT, DataFrame
Cloud Export - MySQL, PostgreSQL (local & cloud)
Direct SQL - No API cost option
Auto Visualization - Plotly/Matplotlib charts
Interactive Mode - Prompts the user to choose whether to visualize
Complete Backup - Export to SQLite, SQL, JSON, Excel
Jupyter Ready - Perfect for notebooks


📦 Installation

# Basic installation
pip install QuerySUTRA

# With MySQL support
pip install QuerySUTRA[mysql]

# With PostgreSQL support
pip install QuerySUTRA[postgres]

# With all database support
pip install QuerySUTRA[all]

🎯 Quick Start

from sutra import SUTRA

# Initialize with OpenAI API key
sutra = SUTRA(api_key="your-openai-key")

# Upload any format
sutra.upload("data.csv")      # CSV
sutra.upload("report.pdf")    # PDF ✨
sutra.upload("doc.docx")      # Word ✨
sutra.upload("data.xlsx")     # Excel
sutra.upload(dataframe)       # DataFrame

# Query with natural language
result = sutra.ask("What are the top 5 products?", viz=True)
print(result.data)

# Export to cloud
sutra.save_to_mysql("localhost", "root", "pass", "mydb")
sutra.save_to_postgres("host", "user", "pass", "db")

# Complete backup
sutra.backup()

📄 Supported File Formats

Format    | Extension    | Example
CSV       | .csv         | sutra.upload("data.csv")
Excel     | .xlsx, .xls  | sutra.upload("data.xlsx")
JSON      | .json        | sutra.upload("data.json")
SQL       | .sql         | sutra.upload("schema.sql")
PDF       | .pdf         | sutra.upload("report.pdf")
Word      | .docx        | sutra.upload("document.docx")
Text      | .txt         | sutra.upload("data.txt")
DataFrame | pd.DataFrame | sutra.upload(df, name="sales")
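
A pre-check before calling upload can catch unsupported files early. The sketch below is illustrative only: `SUPPORTED_EXTENSIONS` and `detect_format` are hypothetical names that are not part of QuerySUTRA, and the mapping simply mirrors the extensions listed above.

```python
from pathlib import Path

# Extensions copied from the format list above; the mapping itself is
# an assumption made for this sketch, not QuerySUTRA's internal table.
SUPPORTED_EXTENSIONS = {
    ".csv": "CSV",
    ".xlsx": "Excel",
    ".xls": "Excel",
    ".json": "JSON",
    ".sql": "SQL",
    ".pdf": "PDF",
    ".docx": "Word",
    ".txt": "Text",
}


def detect_format(path):
    """Return the format name for a file path, or None if the extension is not listed."""
    return SUPPORTED_EXTENSIONS.get(Path(path).suffix.lower())


print(detect_format("report.PDF"))  # PDF
print(detect_format("notes.md"))    # None
```

The lookup is case-insensitive on the extension, so "report.PDF" and "report.pdf" are treated alike.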

🔥 New in v0.1.3

1. PDF Support

# Upload PDF files
sutra.upload("annual_report.pdf")

# Query the content
result = sutra.ask("What are the key findings in this report?")
print(result.data)

2. Word Document Support

# Upload DOCX files with tables
sutra.upload("sales_report.docx")

# Query the data
result = sutra.ask("Show me sales by region", viz=True)

3. Cloud Database Export

MySQL (Local or Cloud)

# Local MySQL
sutra.save_to_mysql("localhost", "root", "password", "mydb")

# AWS RDS MySQL
sutra.save_to_mysql(
    host="mydb.xxxx.us-east-1.rds.amazonaws.com",
    user="admin",
    password="cloudpass",
    database="production"
)

# Google Cloud SQL
sutra.save_to_mysql(
    host="203.0.113.10",  # placeholder; use your instance's public IP
    user="admin",
    password="pass",
    database="mydb"
)

PostgreSQL (Local or Cloud)

# Local PostgreSQL
sutra.save_to_postgres("localhost", "postgres", "password", "mydb")

# Heroku PostgreSQL
sutra.save_to_postgres(
    host="ec2-xxx.compute-1.amazonaws.com",
    user="user",
    password="pass",
    database="dbname"
)

# AWS RDS PostgreSQL
sutra.save_to_postgres(
    host="mydb.xxxx.us-west-2.rds.amazonaws.com",
    user="admin",
    password="pass",
    database="prod"
)
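
The export calls above take host, user, password, and database as keyword arguments, so the values can be read from the environment instead of being written into a notebook. This is a minimal sketch: `db_settings_from_env` and the `<PREFIX>_HOST` style variable names are assumptions invented here, not something QuerySUTRA defines.

```python
import os


def db_settings_from_env(prefix):
    """Build host/user/password/database keyword arguments from environment variables.

    The variable names (<PREFIX>_HOST, <PREFIX>_USER, ...) are a convention
    invented for this sketch, not part of QuerySUTRA.
    """
    keys = ("host", "user", "password", "database")
    missing = [k for k in keys if f"{prefix}_{k.upper()}" not in os.environ]
    if missing:
        raise KeyError(f"missing environment variables for: {', '.join(missing)}")
    return {k: os.environ[f"{prefix}_{k.upper()}"] for k in keys}


# Usage, assuming the variables are set and `sutra` already exists:
# sutra.save_to_mysql(**db_settings_from_env("MYSQL"))
# sutra.save_to_postgres(**db_settings_from_env("PG"))
```

Failing fast on a missing variable avoids sending a half-configured connection request to a production host.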

4. Complete Export & Backup

# Export entire database
sutra.export_db("backup.db", format="sqlite")
sutra.export_db("dump.sql", format="sql")
sutra.export_db("data.json", format="json")
sutra.export_db("data.xlsx", format="excel")

# Export schema only
sutra.save_schema("schema.sql", format="sql")
sutra.save_schema("schema.json", format="json")
sutra.save_schema("schema.md", format="markdown")

# Complete backup (creates 3 files)
sutra.backup()  # Creates .db, .sql, .json files with timestamp
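
To make the idea of a timestamped backup concrete, here is a rough standard-library sketch that writes a .db copy and a .sql dump of a SQLite database. It is not QuerySUTRA's implementation: `timestamped_backup` and the file naming are hypothetical, and it covers only two of the three file types mentioned above.

```python
import sqlite3
import tempfile
from datetime import datetime
from pathlib import Path


def timestamped_backup(source, out_dir):
    """Write a .db copy and a .sql dump of `source`, returning both paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    db_path = out / f"backup_{stamp}.db"
    sql_path = out / f"backup_{stamp}.sql"

    # Binary copy using SQLite's online backup API.
    target = sqlite3.connect(db_path)
    try:
        source.backup(target)
    finally:
        target.close()

    # Plain-text dump of schema and rows.
    sql_path.write_text("\n".join(source.iterdump()), encoding="utf-8")
    return db_path, sql_path


# Small demonstration with an in-memory database and a temporary directory.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.execute("INSERT INTO sales VALUES ('north', 10.0)")
conn.commit()

with tempfile.TemporaryDirectory() as tmp:
    db_file, sql_file = timestamped_backup(conn, tmp)
    print(db_file.name, sql_file.name)
```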

📖 Complete Examples

Example 1: PDF Analysis

from sutra import SUTRA

sutra = SUTRA(api_key="your-openai-key")

# Upload PDF
sutra.upload("financial_report.pdf")

# View extracted data
sutra.peek(n=10)

# Query the content
result = sutra.ask("What are the total revenues?")
print(result.data)

# Visualize
result = sutra.ask("Show revenue by quarter", viz=True)

Example 2: Multi-Format Analysis

from sutra import SUTRA

sutra = SUTRA(api_key="your-key")

# Upload multiple formats
sutra.upload("sales.csv")
sutra.upload("report.docx")
sutra.upload("data.xlsx")

# List all tables
print(sutra.tables())

# Query across data
result = sutra.ask("What are total sales?")
print(result.data)

Example 3: Cloud Deployment

from sutra import SUTRA

# Analyze in Colab/Jupyter
sutra = SUTRA(api_key="your-key")
sutra.upload("local_analysis.csv")

# Query and analyze
result = sutra.ask("Show top performers", viz=True)

# Deploy to production MySQL
sutra.save_to_mysql(
    host="production.mysql.com",
    user="admin",
    password="prod_password",
    database="analytics_db"
)

# Backup everything
sutra.backup("/backups")

Example 4: Direct SQL (No API Cost)

# Assumes `sutra` was created and data uploaded as in the examples above
# Execute SQL directly - FREE!
result = sutra.sql("""
    SELECT region,
           SUM(sales) AS total_sales,
           AVG(sales) AS avg_sales
    FROM sales_data
    GROUP BY region
    ORDER BY total_sales DESC
""")

print(result.data)
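
For readers who want to see what this query returns without installing anything, the same SQL can be run with Python's built-in sqlite3 module. This sketch assumes a SQLite-compatible backing store (the `db="sutra.db"` default suggests one, but that is an assumption), and the table contents are made-up sample rows.

```python
import sqlite3

# In-memory database with a few made-up rows for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_data (region TEXT, sales REAL)")
conn.executemany(
    "INSERT INTO sales_data VALUES (?, ?)",
    [("north", 100.0), ("north", 300.0), ("south", 150.0)],
)

rows = conn.execute("""
    SELECT region,
           SUM(sales) AS total_sales,
           AVG(sales) AS avg_sales
    FROM sales_data
    GROUP BY region
    ORDER BY total_sales DESC
""").fetchall()

for region, total, avg in rows:
    print(region, total, avg)
# north 400.0 200.0
# south 150.0 150.0
```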

Example 5: Interactive Mode

# Ask user for visualization preference
result = sutra.interactive("What are sales trends?")
# Prompts: "Do you want visualization? (yes/no):"

if result.success:
    print(result.data)

🛠️ API Reference

Initialization

sutra = SUTRA(api_key="your-openai-key", db="sutra.db")

Upload Data

sutra.upload(data, name="table_name")
# data = file path (str) or DataFrame

View Database

sutra.tables()          # List all tables
sutra.schema()          # Show database schema
sutra.peek(n=10)        # Preview data

Query Data

# Direct SQL (no API cost)
result = sutra.sql("SELECT * FROM table", viz=False)

# Natural language (uses API)
result = sutra.ask("question", viz=False)

# Interactive (prompts user)
result = sutra.interactive("question")

Export & Backup

# Export results
sutra.export(dataframe, "output.csv", format="csv")

# Export database
sutra.export_db("backup.db", format="sqlite")

# Save to cloud
sutra.save_to_mysql(host, user, password, database)
sutra.save_to_postgres(host, user, password, database)

# Complete backup
sutra.backup("/backup/path")

QueryResult Object

result.success   # bool - query succeeded
result.sql       # str - generated SQL
result.data      # DataFrame - query results
result.viz       # figure - visualization (if viz=True)
result.error     # str - error message (if failed)
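
Since a result carries both a success flag and an error message, it can help to check the flag before touching the data. The sketch below uses a stand-in class with the same field names as the listing above; `FakeResult` and `rows_or_raise` are hypothetical helpers for illustration, not part of the library.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class FakeResult:
    """Stand-in with the same field names as the QueryResult listing above."""
    success: bool
    sql: Optional[str] = None
    data: Any = None
    viz: Any = None
    error: Optional[str] = None


def rows_or_raise(result):
    """Return result.data when the query succeeded, otherwise raise with the error text."""
    if not result.success:
        raise RuntimeError(f"query failed: {result.error}")
    return result.data


ok = FakeResult(success=True, sql="SELECT 1", data=[(1,)])
print(rows_or_raise(ok))  # [(1,)]
```

The same helper should work with any object that exposes `success`, `data`, and `error` attributes.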

💡 Use Cases

Data Analysis

sutra.upload("sales_data.csv")
result = sutra.ask("What products have declining sales?", viz=True)

Document Processing

sutra.upload("contract.pdf")
result = sutra.ask("What are the key terms and dates?")

Multi-Source Integration

sutra.upload("sales.csv")
sutra.upload("inventory.xlsx")
sutra.upload("report.docx")
result = sutra.ask("Combine all data sources")

Cloud Migration

# Local analysis
sutra.upload("data.csv")
result = sutra.ask("Analyze trends")

# Deploy to cloud
sutra.save_to_postgres("cloud-db.com", "user", "pass", "prod")

🎨 Features Comparison

Feature                  | Available | Cost
CSV/Excel/JSON Upload    | Yes       | Free
PDF Upload               | Yes       | Free
DOCX Upload              | Yes       | Free
Direct SQL Queries       | Yes       | Free
Natural Language Queries | Yes       | ~$0.001/query
Visualization            | Yes       | Free
MySQL Export             | Yes       | Free
PostgreSQL Export        | Yes       | Free
Backup & Export          | Yes       | Free

💰 Cost Optimization

# FREE - Direct SQL (no API calls)
result = sutra.sql("SELECT * FROM data WHERE sales > 1000")

# PAID - Natural language (uses OpenAI API)
result = sutra.ask("Show products with sales over 1000")

# Tip: Use direct SQL when you know the query!
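
One way to apply the tip above is a small router that sends text that already looks like SQL to the free path and everything else to the paid path. This is a rough heuristic sketch: `looks_like_sql` and `run` are hypothetical helpers, and a question that happens to begin with a word such as "Select" or "With" would be misrouted.

```python
SQL_KEYWORDS = ("select", "with", "insert", "update", "delete", "create", "drop")


def looks_like_sql(text):
    """Heuristic: treat the text as SQL if its first word is a common SQL keyword."""
    words = text.strip().split(None, 1)
    return bool(words) and words[0].lower() in SQL_KEYWORDS


def run(client, text):
    """Send SQL to client.sql() and everything else to client.ask()."""
    if looks_like_sql(text):
        return client.sql(text)
    return client.ask(text)


# Usage, assuming `sutra` already exists:
# result = run(sutra, "SELECT * FROM data WHERE sales > 1000")  # no API call
# result = run(sutra, "Show products with sales over 1000")     # uses the API
```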

🧪 Testing

# Install
pip install QuerySUTRA

# Test
python -c "from sutra import SUTRA; print('✅ Success!')"

📚 Documentation

  • Full Guide: See SUTRA_Complete_Guide.ipynb
  • Publishing: See PUBLISHING_GUIDE.md
  • Examples: See complete_example.py

🤝 Contributing

Contributions welcome! The main code is in sutra/sutra.py - a single, well-documented file.


📄 License

MIT License - Free to use in your projects!


🏆 Why QuerySUTRA?

  • SUTRA = Structured-Unstructured-Text-Retrieval-Architecture
  • Single-file design for simplicity
  • Production-ready with error handling
  • Cloud-native with export capabilities
  • Comprehensive format support (PDF, DOCX, CSV, Excel, JSON)
  • Cost-effective with free SQL mode

🌟 Credits

Author: Aditya Batta
Version: 0.1.3
License: MIT


Made with ❤️ for data analysts and developers

Start analyzing with natural language today! 🚀
