SUTRA: Structured-Unstructured-Text-Retrieval-Architecture - Creates multiple structured tables from ANY data

Development Status
- 4 - Beta
Intended Audience
- Developers
License
- OSI Approved :: MIT License
Programming Language
- Python :: 3
Topic
- Database

Project description

QuerySUTRA

SUTRA: Structured-Unstructured-Text-Retrieval-Architecture

Transform any data into structured, queryable databases with AI-powered entity extraction.

🎯 Key Features

✅ Multi-Table Creation - Automatically extracts entities and creates multiple related tables
✅ Smart Entity Extraction - Identifies people, contacts, events, organizations from unstructured data
✅ Natural Language Queries - Ask questions in plain English
✅ Multiple Data Formats - CSV, Excel, JSON, PDF, DOCX, TXT, SQL, DataFrames
✅ Direct SQL Access - Query without API costs
✅ Auto Visualization - Built-in charts and graphs
✅ Cloud Export - Save to MySQL, PostgreSQL, or local SQLite

📦 Installation

pip install QuerySUTRA

# With MySQL support (quotes stop shells such as zsh from expanding the brackets)
pip install "QuerySUTRA[mysql]"

# With PostgreSQL support
pip install "QuerySUTRA[postgres]"

# With all database support
pip install "QuerySUTRA[all]"

🚀 Quick Start

from sutra import SUTRA

# Initialize
sutra = SUTRA(api_key="your-openai-key")

# Upload any data - AI creates multiple structured tables!
sutra.upload("employee_story.pdf")

# View all created tables
sutra.tables()
# Output:
# 📋 TABLES IN DATABASE
# 1. employee_story_people (20 rows, 6 columns)
#    Columns: id, name, address, city, email, phone
# 2. employee_story_contacts (20 rows, 4 columns)
#    Columns: id, person_id, email, phone
# 3. employee_story_events (15 rows, 4 columns)
#    Columns: id, host_id, description, city

# View detailed schema
sutra.schema()

# Query with natural language
result = sutra.ask("Show all people from New York")
print(result.data)

# With visualization
result = sutra.ask("Show events by city", viz=True)

# Direct SQL (no API cost!)
result = sutra.sql("SELECT * FROM employee_story_people WHERE city='Dallas'")
print(result.data)

📊 How It Works

From Unstructured PDF to Structured Tables

Input: PDF with employee information

AI Automatically Creates:

📋 Created 3 structured tables:
  📊 employee_story_people: 20 rows, 6 columns
     - id, name, address, city, email, phone
  📊 employee_story_contacts: 20 rows, 4 columns
     - id, person_id, email, phone  
  📊 employee_story_events: 15 rows, 4 columns
     - id, host_id, description, city
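
To make the shape of that output concrete, here is a minimal sketch that builds the same three tables by hand with Python's built-in sqlite3 module. It is not QuerySUTRA's implementation: the column types, the foreign keys, the sample rows, and the `describe_tables` helper are assumptions for illustration. Only the table and column names come from the output above.

```python
import sqlite3

# Hypothetical schema mirroring the table and column names shown above.
# Column types and foreign keys are assumptions, not QuerySUTRA's actual DDL.
SCHEMA = """
CREATE TABLE employee_story_people (
    id INTEGER PRIMARY KEY,
    name TEXT, address TEXT, city TEXT, email TEXT, phone TEXT
);
CREATE TABLE employee_story_contacts (
    id INTEGER PRIMARY KEY,
    person_id INTEGER REFERENCES employee_story_people(id),
    email TEXT, phone TEXT
);
CREATE TABLE employee_story_events (
    id INTEGER PRIMARY KEY,
    host_id INTEGER REFERENCES employee_story_people(id),
    description TEXT, city TEXT
);
"""


def describe_tables(conn):
    """Return {table_name: (row_count, [column names])} for every table."""
    info = {}
    names = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
    for name in names:
        # PRAGMA table_info yields (cid, name, type, notnull, default, pk).
        columns = [row[1] for row in conn.execute(f"PRAGMA table_info({name})")]
        rows = conn.execute(f"SELECT COUNT(*) FROM {name}").fetchone()[0]
        info[name] = (rows, columns)
    return info


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute(
    "INSERT INTO employee_story_people VALUES (?, ?, ?, ?, ?, ?)",
    (1, "Alice", "1 Main St", "Dallas", "alice@example.com", "555-0100"),
)
conn.execute(
    "INSERT INTO employee_story_contacts VALUES (?, ?, ?, ?)",
    (1, 1, "alice@example.com", "555-0100"),
)

summary = describe_tables(conn)
for name, (rows, columns) in summary.items():
    print(f"{name} ({rows} rows, {len(columns)} columns): {', '.join(columns)}")
```

Splitting contact details and events into their own tables keyed by a person id is what makes the cross-table queries later in this README possible.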

💡 Usage Examples

1. Upload Different Formats

# CSV file
sutra.upload("sales_data.csv")

# Excel file
sutra.upload("quarterly_report.xlsx")

# PDF document (AI extracts entities!)
sutra.upload("company_directory.pdf")

# Word document
sutra.upload("meeting_notes.docx")

# Text file
sutra.upload("log_data.txt")

# DataFrame
import pandas as pd
df = pd.DataFrame({'name': ['Alice', 'Bob'], 'score': [95, 87]})
sutra.upload(df, name="test_scores")
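
For tabular inputs such as CSV, an upload amounts to "one file becomes one table". The sketch below shows that idea with only the standard library. It is an illustration under stated assumptions, not QuerySUTRA's loader: `load_csv_text` is a hypothetical helper, and every column is stored as TEXT for simplicity.

```python
import csv
import io
import sqlite3


def load_csv_text(conn, table, text):
    """Create `table` from CSV text and return the number of rows inserted.

    Every column is stored as TEXT; a real loader would also infer types.
    """
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    rows = list(reader)
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    return len(rows)


conn = sqlite3.connect(":memory:")
sample = "name,score\nAlice,95\nBob,87\n"  # same data as the DataFrame above
inserted = load_csv_text(conn, "test_scores", sample)

# Scores were stored as TEXT, so cast before sorting numerically.
top = conn.execute(
    "SELECT name FROM test_scores ORDER BY CAST(score AS INTEGER) DESC LIMIT 1"
).fetchone()[0]
print(inserted, top)
```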

2. View Your Data

# List all tables with details
sutra.tables()

# Show schema with data types
sutra.schema()

# Show schema for specific table
sutra.schema("employee_story_people")

# Preview data
sutra.peek("employee_story_people", n=10)

3. Query Your Data

# Natural language (uses OpenAI)
result = sutra.ask("What are the top 5 sales by region?")
print(result.data)

# With visualization
result = sutra.ask("Show sales trends by month", viz=True)

# Interactive mode (asks if you want viz)
result = sutra.interactive("Compare revenue across quarters")

# Direct SQL (free, no API!)
result = sutra.sql("SELECT city, COUNT(*) as count FROM employee_story_people GROUP BY city")
print(result.data)

4. Export Your Database

# Export to MySQL (local or cloud)
sutra.save_to_mysql(
    host="localhost",
    user="root",
    password="password",
    database="my_database"
)

# Export to PostgreSQL
sutra.save_to_postgres(
    host="mydb.amazonaws.com",
    user="admin",
    password="password",
    database="production_db"
)

# Export to SQLite file
sutra.export_db("backup.db", format="sqlite")

# Export to SQL dump
sutra.export_db("schema.sql", format="sql")

# Export to JSON
sutra.export_db("data.json", format="json")

# Export to Excel (all tables as sheets)
sutra.export_db("data.xlsx", format="excel")

# Complete backup
sutra.backup("./backups")
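
If you want to see what the SQLite, SQL dump, and JSON exports correspond to at the database level, the following sketch produces comparable outputs with Python's sqlite3 and json modules. The helper names (`dump_sql`, `dump_json`, `copy_database`) are hypothetical, and the exact files QuerySUTRA writes may differ.

```python
import json
import sqlite3


def dump_sql(conn):
    """Return the whole database as a SQL script (schema plus INSERTs)."""
    return "\n".join(conn.iterdump())


def dump_json(conn):
    """Return {table: [row dicts]} for every user table, as a JSON string."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type='table' AND name NOT LIKE 'sqlite_%'")]
    data = {}
    for table in tables:
        cursor = conn.execute(f'SELECT * FROM "{table}"')
        columns = [col[0] for col in cursor.description]
        data[table] = [dict(zip(columns, row)) for row in cursor]
    return json.dumps(data, indent=2)


def copy_database(source, target_path):
    """Copy a live database into another SQLite file (or ':memory:')."""
    target = sqlite3.connect(target_path)
    source.backup(target)
    return target


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_scores (name TEXT, score INTEGER)")
conn.executemany("INSERT INTO test_scores VALUES (?, ?)",
                 [("Alice", 95), ("Bob", 87)])
conn.commit()

sql_text = dump_sql(conn)
json_text = dump_json(conn)
copy = copy_database(conn, ":memory:")
print(sql_text)
```

Pass a file path such as "backup.db" to `copy_database` to write the copy to disk instead of memory.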

🔥 Advanced Features

Entity Extraction

QuerySUTRA automatically identifies and extracts:

👥 People - Names, addresses, contact info
📧 Contacts - Emails, phone numbers
📅 Events - Meetings, activities, locations
🏢 Organizations - Companies, departments
📍 Locations - Cities, addresses, coordinates

Multiple Table Relationships

# AI creates relational structure
sutra.upload("company_data.pdf")

# Result:
# people table with primary key id
# contacts table with person_id as a foreign key to people.id
# events table with host_id linking to people.id

Query Across Tables

# Natural language handles joins automatically
result = sutra.ask("Show all events hosted by people from Dallas")

# Or write SQL joins manually
result = sutra.sql("""
    SELECT e.description, p.name, p.city
    FROM employee_story_events e
    JOIN employee_story_people p ON e.host_id = p.id
    WHERE p.city = 'Dallas'
""")

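The join above can be tried without any uploaded data. This sketch recreates two of the tables in an in-memory SQLite database with made-up rows and runs the same query through Python's sqlite3 module. The rows are hypothetical, and the bound parameter (`?`) is a sqlite3 feature; whether `sutra.sql()` accepts parameters is not stated in this README.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables and rows; only the table/column names follow the README.
conn.executescript("""
CREATE TABLE employee_story_people (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
CREATE TABLE employee_story_events (
    id INTEGER PRIMARY KEY, host_id INTEGER, description TEXT, city TEXT);
INSERT INTO employee_story_people VALUES (1, 'Alice', 'Dallas'), (2, 'Bob', 'New York');
INSERT INTO employee_story_events VALUES
    (1, 1, 'Team offsite', 'Austin'),
    (2, 2, 'Product launch', 'New York'),
    (3, 1, 'Quarterly review', 'Dallas');
""")

# Same join as above, with the city passed as a bound parameter
# instead of being spliced into the SQL string.
rows = conn.execute(
    """
    SELECT e.description, p.name, p.city
    FROM employee_story_events e
    JOIN employee_story_people p ON e.host_id = p.id
    WHERE p.city = ?
    ORDER BY e.id
    """,
    ("Dallas",),
).fetchall()

for description, name, city in rows:
    print(f"{description} (hosted by {name}, {city})")
```

Note that the filter is on the host's city (`p.city`), not the event's city, so the Austin offsite hosted by a Dallas resident is included.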
📈 Visualization

# Auto-detect best chart type
result = sutra.ask("Show revenue by product", viz=True)

# Interactive charts with Plotly
# - Bar charts for categorical data
# - Line charts for time series  
# - Tables for detailed data
# - Pie charts for distributions
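
How the chart type is chosen is not documented here. As a rough illustration only, a selection rule along the lines of the list above could look like the sketch below. The function `choose_chart`, its thresholds, and its rules are all hypothetical and are not QuerySUTRA's logic.

```python
from datetime import date, datetime


def choose_chart(rows, x, y=None):
    """Pick a chart type for a list of dict rows using simple heuristics.

    Hypothetical rules, loosely following the list above:
    - no rows or no numeric y column -> "table"
    - x values are dates             -> "line"
    - few categories, y all >= 0     -> "pie"
    - otherwise                      -> "bar"
    """
    if not rows or y is None:
        return "table"
    xs = [row[x] for row in rows]
    ys = [row[y] for row in rows]
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in ys
    )
    if not numeric:
        return "table"
    if all(isinstance(v, (date, datetime)) for v in xs):
        return "line"
    if len(set(xs)) <= 5 and all(v >= 0 for v in ys) and sum(ys) > 0:
        return "pie"
    return "bar"


monthly = [
    {"month": date(2025, 1, 1), "sales": 10},
    {"month": date(2025, 2, 1), "sales": 12},
]
by_product = [{"product": f"P{i}", "revenue": 100 + i} for i in range(8)]

print(choose_chart(monthly, "month", "sales"))
print(choose_chart(by_product, "product", "revenue"))
```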

🌐 Cloud Database Integration

AWS RDS MySQL

sutra.save_to_mysql(
    host="mydb.xxxx.us-east-1.rds.amazonaws.com",
    user="admin",
    password="password",
    database="production",
    port=3306
)

Google Cloud SQL

sutra.save_to_postgres(
    host="203.0.113.10",  # placeholder; use your instance's public IP
    user="postgres",
    password="password",
    database="analytics"
)

Heroku Postgres

sutra.save_to_postgres(
    host="ec2-xx-xxx-xxx-xxx.compute-1.amazonaws.com",
    user="username",
    password="password",
    database="dbname",
    port=5432
)

⚡ Performance Tips

# Use direct SQL for complex queries (faster, no API cost)
result = sutra.sql("SELECT * FROM data WHERE status='active'")

# Cache is automatic for repeated questions
result1 = sutra.ask("Show total sales")  # Calls API
result2 = sutra.ask("Show total sales")  # From cache ⚡

# Export results for reuse
result.data.to_csv("results.csv")
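
The caching behaviour can be pictured as a lookup table keyed by the question text. The sketch below is one way such a cache might work, not QuerySUTRA's implementation: `QuestionCache`, its normalization rule, and `fake_backend` are hypothetical names introduced for illustration.

```python
class QuestionCache:
    """Tiny in-memory cache keyed by a normalized question string."""

    def __init__(self, answer_fn):
        self._answer_fn = answer_fn  # the expensive call, e.g. an API request
        self._store = {}
        self.misses = 0

    @staticmethod
    def _normalize(question):
        # Treat questions that differ only in case or spacing as the same.
        return " ".join(question.lower().split())

    def ask(self, question):
        key = self._normalize(question)
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._answer_fn(question)
        return self._store[key]


def fake_backend(question):
    # Stand-in for a paid API call.
    return f"answer to: {question}"


cache = QuestionCache(fake_backend)
first = cache.ask("Show total sales")
second = cache.ask("show  total sales")  # served from the cache
print(cache.misses)
```

A cache like this only helps when the underlying tables have not changed, so it would need to be cleared after each new upload.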

🔒 API Key Security

# Option 1: Pass directly (not recommended for production)
sutra = SUTRA(api_key="sk-...")

# Option 2: Environment variable (recommended)
# Set it in your shell, not in code: export OPENAI_API_KEY=sk-...
sutra = SUTRA()

# Option 3: .env file
# Create .env file with: OPENAI_API_KEY=sk-...
from dotenv import load_dotenv
load_dotenv()
sutra = SUTRA()
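
When relying on Option 2 or 3, it can help to fail early with a clear message if the variable is missing, and to avoid printing the full key in logs. The helpers below are a small sketch using only the standard library. `require_api_key` and `mask` are hypothetical names and are not part of QuerySUTRA.

```python
import os


def require_api_key(env_var="OPENAI_API_KEY"):
    """Return the key from the environment or raise a clear error."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set. Export it in your shell or add it to a .env file."
        )
    return key


def mask(key, visible=4):
    """Mask a key for logging, keeping only the last few characters."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]


if __name__ == "__main__":
    try:
        print("Using key", mask(require_api_key()))
    except RuntimeError as err:
        print(err)
```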

🎓 Complete Example

from sutra import SUTRA
import pandas as pd

# Initialize
sutra = SUTRA(api_key="your-openai-key")

# Upload PDF - creates multiple tables
sutra.upload("employee_directory.pdf")

# View what was created
tables_info = sutra.tables()
print(f"Created {len(tables_info)} tables")

# View detailed schema
sutra.schema()

# Query specific table
result = sutra.ask("How many people are in each city?", 
                   table="employee_directory_people")
print(result.data)

# Visualize
result = sutra.ask("Show distribution of people by city", viz=True)

# Export to MySQL
sutra.save_to_mysql("localhost", "root", "password", "company_db")

# Backup everything
sutra.backup("./backups")

# Close connection
sutra.close()

📚 Method Reference

Core Methods

Method	Description
`upload(data, name)`	Upload any data format, creates multiple tables
`tables()`	List all tables with row/column counts
`schema(table)`	Show detailed schema with data types
`peek(table, n)`	Preview first n rows
`ask(question, viz)`	Natural language query
`sql(query, viz)`	Direct SQL query
`interactive(question)`	Query with viz prompt

Export Methods

Method	Description
`export_db(path, format)`	Export database (sqlite/sql/json/excel)
`save_to_mysql(...)`	Save to MySQL database
`save_to_postgres(...)`	Save to PostgreSQL database
`backup(path)`	Complete backup with timestamp

🐛 Troubleshooting

Q: Only one table created instead of multiple?
A: Make sure your OpenAI API key is set. Without it, QuerySUTRA falls back to simple parsing.

Q: "No API key" error?
A: Set your OpenAI key: sutra = SUTRA(api_key="sk-...")

Q: PDF extraction failed?
A: Install PyPDF2: pip install PyPDF2

Q: MySQL export error?
A: Install extras: pip install "QuerySUTRA[mysql]"
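
Several of the answers above come down to a missing optional dependency. A quick way to check which modules are importable in the current environment is sketched below. `missing_modules` is a hypothetical helper, and "PyPDF2" is the only module name taken from this README.

```python
from importlib.util import find_spec


def missing_modules(module_names):
    """Return the top-level module names that cannot be imported here."""
    return [name for name in module_names if find_spec(name) is None]


# "PyPDF2" comes from the troubleshooting entry above; check any other
# top-level module your setup depends on in the same way.
print(missing_modules(["sqlite3", "PyPDF2"]))
```

An empty list means every module in the check is available. Any name that is printed still needs to be installed.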

📄 License

MIT License - see LICENSE file

🤝 Contributing

Contributions welcome! Open an issue or submit a PR.

📞 Support

Issues: GitHub Issues
Email: your@email.com

Made with ❤️ by Aditya Batta

Release history

Version	Release date
0.6.2	Feb 6, 2026
0.6.1	Feb 5, 2026
0.6.0	Feb 5, 2026
0.5.3	Nov 18, 2025
0.5.2	Nov 17, 2025
0.5.1	Nov 17, 2025
0.5.0	Nov 17, 2025
0.4.6	Nov 17, 2025
0.4.5	Nov 17, 2025
0.4.4	Nov 17, 2025
0.4.3	Nov 17, 2025
0.4.2	Nov 17, 2025
0.4.1	Nov 17, 2025
0.4.0	Nov 17, 2025
0.3.3	Nov 16, 2025
0.3.2	Nov 14, 2025
0.3.1	Nov 14, 2025
0.3.0	Nov 14, 2025
0.2.3 (this version)	Nov 14, 2025
0.2.1	Nov 14, 2025
0.2.0	Nov 14, 2025
0.1.4	Nov 13, 2025
0.1.3	Nov 13, 2025
0.1.2	Nov 13, 2025
0.1.0	Nov 13, 2025

Download files

Source Distribution

querysutra-0.2.3.tar.gz (46.1 kB)

Uploaded Nov 14, 2025 Source

Built Distribution

querysutra-0.2.3-py3-none-any.whl (47.2 kB)

Uploaded Nov 14, 2025 Python 3

File details

Details for the file querysutra-0.2.3.tar.gz.

File metadata

Download URL: querysutra-0.2.3.tar.gz
Upload date: Nov 14, 2025
Size: 46.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for querysutra-0.2.3.tar.gz
Algorithm	Hash digest
SHA256	`e5d0a8d4a907d950c6268e8758ccdbfee8e6a3580dcc7fb126669e1e7b65c608`
MD5	`b381708ba24d28dca0f044cd1adc69c2`
BLAKE2b-256	`01196ff918f86324f71d5c22cc2204e922023439ae1510a011fb8f9b8130edcc`


File details

Details for the file querysutra-0.2.3-py3-none-any.whl.

File metadata

Download URL: querysutra-0.2.3-py3-none-any.whl
Upload date: Nov 14, 2025
Size: 47.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for querysutra-0.2.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`9746ca5baedd48355c35a5680aa16c2f4779b7e18a15919aad4c99c068f3b621`
MD5	`f39c7ae288c06d86cf6525b0c823db7a`
BLAKE2b-256	`071f5ebaff7b11bef51a8c909e554878e24b01562efccd5d253e42cd01cba11b`
