AI-powered data analysis for structured and unstructured data. Query PDF, Word, CSV, Excel with natural language.

QuerySUTRA

SUTRA: Structured-Unstructured-Text-Retrieval-Architecture

AI-powered data analysis. Upload any data (PDF, Word, Text, CSV, Excel), query with natural language, export to MySQL.

Installation

pip install QuerySUTRA
pip install QuerySUTRA[mysql]       # MySQL support
pip install QuerySUTRA[embeddings]  # Smart caching
pip install QuerySUTRA[all]         # All features

Quick Start

from sutra import SUTRA

sutra = SUTRA(api_key="your-openai-key")
sutra.upload("data.pdf")  # or .docx, .txt, .csv, .xlsx, .json
result = sutra.ask("Show me all people")
print(result.data)

Supported Formats

Structured Data:

CSV (.csv)
Excel (.xlsx, .xls)
JSON (.json)
SQL (.sql)
Pandas DataFrame

Unstructured Documents (AI Extraction):

PDF (.pdf)
Word (.docx)
Text (.txt)

Core Features

1. Upload Any Data Format

# Structured data
sutra.upload("sales.csv")
sutra.upload("report.xlsx")
sutra.upload("api_data.json")
sutra.upload("dump.sql")

# Unstructured documents (AI extracts entities)
sutra.upload("resume.pdf")
sutra.upload("meeting_notes.docx")
sutra.upload("transcript.txt")

# DataFrame
import pandas as pd
df = pd.DataFrame({'name': ['Alice'], 'score': [95]})
sutra.upload(df, name="scores")

2. Complete Data Extraction

Processes entire documents in chunks, so content from long files is not dropped.

# PDF - Extracts ALL pages
sutra.upload("50_page_report.pdf")  # Gets all 50 pages, all employees

# Word - Extracts ALL content
sutra.upload("large_document.docx")  # Full document processed

# Text - Processes ALL lines
sutra.upload("log_file.txt")  # Entire file analyzed

# Each upload can create multiple related tables; list them:
sutra.tables()
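
The chunking itself is not documented here. As a rough illustration only, a generic overlapping chunker could look like the sketch below. The name `chunk_text`, the chunk size, and the overlap are hypothetical and are not taken from QuerySUTRA's implementation.

```python
def chunk_text(text, size=1000, overlap=100):
    """Split text into overlapping chunks that together cover every character."""
    if size <= overlap:
        raise ValueError("size must be larger than overlap")
    step = size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last chunk already reaches the end of the text
        start += step
    return chunks


document = "x" * 2500
chunks = chunk_text(document)
print(len(chunks))  # 3
```

The overlap is there so that an entity split across a chunk boundary still appears whole in at least one chunk.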

3. Automatic MySQL Export

Upload and export in one call. The target database is created automatically if it does not exist.

sutra.upload("data.pdf", auto_export_mysql={
    'host': 'localhost',
    'user': 'root',
    'password': 'your_password',
    'database': 'my_database'  # Auto-creates if not exists
})
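
How the database gets auto-created is not shown in this README. One plausible approach, sketched here as an assumption, is to issue an idempotent `CREATE DATABASE IF NOT EXISTS` statement before writing tables. The helper `create_database_sql` is hypothetical and only builds the SQL string; it does not connect to a server.

```python
import re


def create_database_sql(name):
    """Build an idempotent CREATE DATABASE statement for MySQL (illustrative helper)."""
    # Identifiers cannot be bound as query parameters, so validate the name strictly.
    if not re.fullmatch(r"[A-Za-z0-9_]+", name):
        raise ValueError(f"unsafe database name: {name!r}")
    return f"CREATE DATABASE IF NOT EXISTS `{name}`"


print(create_database_sql("my_database"))
# CREATE DATABASE IF NOT EXISTS `my_database`
```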

4. Natural Language Queries

result = sutra.ask("Show all people from California")
result = sutra.ask("Who has Python skills?", table="skills")
result = sutra.ask("Count employees by state", viz="pie")

5. Custom Visualizations

result = sutra.ask("Sales by region", viz="pie")       # Pie chart
result = sutra.ask("Trends over time", viz="line")     # Line chart
result = sutra.ask("Compare values", viz="bar")        # Bar chart
result = sutra.ask("Correlations", viz="scatter")      # Scatter
result = sutra.ask("Show table", viz="table")          # Table
result = sutra.ask("Heatmap", viz="heatmap")           # Heatmap
result = sutra.ask("Auto", viz=True)                   # Auto-detect

6. Load Existing Databases

# Load SQLite
sutra = SUTRA.load_from_db("data.db", api_key="key")

# Connect to MySQL
sutra = SUTRA.connect_mysql("localhost", "root", "pass", "database")

# Connect to PostgreSQL  
sutra = SUTRA.connect_postgres("localhost", "postgres", "pass", "database")

7. Fuzzy Matching

sutra = SUTRA(api_key="key", fuzzy_match=True)

# "New York City" matches "New York" automatically
result = sutra.ask("Who are from New York City?")
# Fuzzy: 'City' -> 'New York'

Uses difflib.get_close_matches with a cutoff of 0.6 (60% similarity).
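
To show what that standard-library call does on its own, here is a minimal sketch. The wrapper `fuzzy_lookup` and the list of known values are illustrative assumptions; how QuerySUTRA collects candidate values from the tables is not described here.

```python
from difflib import get_close_matches


def fuzzy_lookup(term, known_values, cutoff=0.6):
    """Return the closest known value to term, or None if nothing clears the cutoff."""
    matches = get_close_matches(term, known_values, n=1, cutoff=cutoff)
    return matches[0] if matches else None


states = ["New York", "California", "Texas"]
print(fuzzy_lookup("New York City", states))  # New York
```

Note that `get_close_matches` is case-sensitive, so normalizing case on both sides before matching is usually worthwhile.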

8. Embeddings for Smart Caching

Can cut API costs by up to 90% when many queries are near-duplicates of earlier ones.

sutra = SUTRA(api_key="key", use_embeddings=True)

result = sutra.ask("Show sales")            # API call
result = sutra.ask("Display sales data")    # Cached (92% similar)
result = sutra.ask("Give me sales info")    # Cached (88% similar)

How it works:

Model: all-MiniLM-L6-v2 (80MB, runs locally)
Converts queries to 384D vectors
85% similarity threshold
No external API calls
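
The points above describe a similarity-threshold cache. The sketch below shows the general idea under loud assumptions: `toy_embed` is a word-count stand-in, not all-MiniLM-L6-v2, and `SemanticCache`, `ask`, and `fake_api` are hypothetical names that are not part of QuerySUTRA.

```python
import math
from collections import Counter


def toy_embed(text):
    """Stand-in embedding: word counts. A real setup would use a sentence model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticCache:
    def __init__(self, embed=toy_embed, threshold=0.85):
        self.embed = embed
        self.threshold = threshold
        self.entries = []  # (vector, answer) pairs

    def lookup(self, query):
        """Return the cached answer of the most similar query, if similar enough."""
        vec = self.embed(query)
        scored = [(cosine(vec, v), answer) for v, answer in self.entries]
        if not scored:
            return None
        score, answer = max(scored, key=lambda pair: pair[0])
        return answer if score >= self.threshold else None

    def store(self, query, answer):
        self.entries.append((self.embed(query), answer))


def ask(cache, query, call_api):
    """Return (answer, was_cached); only call the API on a cache miss."""
    cached = cache.lookup(query)
    if cached is not None:
        return cached, True
    answer = call_api(query)
    cache.store(query, answer)
    return answer, False


api_calls = []


def fake_api(query):
    api_calls.append(query)
    return f"result for: {query}"


cache = SemanticCache()
print(ask(cache, "show sales by region", fake_api)[1])         # False
print(ask(cache, "show sales by region please", fake_api)[1])  # True
```

With a real sentence-embedding model, paraphrases with little word overlap could also land above the threshold, which a word-count stand-in cannot capture.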

Cost savings:

10 similar queries: 1 API call instead of 10, which is a 90% saving
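
The arithmetic behind that figure, written as a small helper (`savings_percent` is an illustrative name, not a library function):

```python
def savings_percent(total_queries, api_calls):
    """Percentage of queries that did not require an API call."""
    if total_queries <= 0:
        raise ValueError("total_queries must be positive")
    return (total_queries - api_calls) * 100 / total_queries


print(savings_percent(10, 1))  # 90.0
```

The real saving depends on how many incoming queries are close enough to an earlier one to hit the cache; a workload with mostly distinct questions will save far less.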

9. Irrelevant Query Detection

sutra = SUTRA(api_key="key", check_relevance=True)

result = sutra.ask("What's the weather?")
# Warning: Query may be irrelevant

10. Direct SQL

result = sutra.sql("SELECT * FROM people WHERE state='CA'")
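
For readers who want to see what such a query does without the library, here is a self-contained sketch using Python's built-in `sqlite3` module. The `people` table and its columns are made up for illustration; the tables QuerySUTRA creates from your data will differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, state TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("Alice", "CA"), ("Bob", "NY"), ("Carol", "CA")],
)

# Same filter as above, with the value bound as a parameter instead of inlined.
rows = conn.execute(
    "SELECT name, state FROM people WHERE state = ? ORDER BY name", ("CA",)
).fetchall()
print(rows)  # [('Alice', 'CA'), ('Carol', 'CA')]
```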

Complete Example

from sutra import SUTRA

# Initialize with all features
sutra = SUTRA(
    api_key="your-key",
    use_embeddings=True,
    fuzzy_match=True,
    check_relevance=True
)

# Upload any format
sutra.upload("employees.pdf")      # PDF
sutra.upload("skills.docx")        # Word
sutra.upload("projects.txt")       # Text
sutra.upload("sales.csv")          # CSV
sutra.upload("budget.xlsx")        # Excel

# View tables
sutra.tables()

# Query
result = sutra.ask("Show all people", viz="bar")

# Export to MySQL
sutra.save_to_mysql("localhost", "root", "pass", "my_db")

Import to MySQL Workflow

Colab:

sutra.upload("data.pdf")
sutra.export_db("data.db", "sqlite")
from google.colab import files
files.download("data.db")

Windows:

sutra = SUTRA.load_from_db("data.db", api_key="key")
sutra.save_to_mysql("localhost", "root", "pass", "my_db")

Export Options

sutra.export_db("backup.db", "sqlite")
sutra.export_db("schema.sql", "sql")
sutra.export_db("data.json", "json")
sutra.export_db("data.xlsx", "excel")
sutra.save_to_mysql("localhost", "root", "pass", "db")
sutra.save_to_postgres("localhost", "postgres", "pass", "db")
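
As a rough idea of what "sql" and "json" exports of a SQLite database can contain, the sketch below uses only the standard library. It is an assumption about the general shape of such files, not a description of the files `export_db` actually writes; the `scores` table is illustrative.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
conn.execute("INSERT INTO scores VALUES ('Alice', 95)")
conn.commit()

# "sql"-style export: a text dump of the schema plus INSERT statements.
sql_dump = "\n".join(conn.iterdump())

# "json"-style export: a list of row objects per table.
conn.row_factory = sqlite3.Row
rows = [dict(r) for r in conn.execute("SELECT * FROM scores")]
json_dump = json.dumps({"scores": rows}, indent=2)

print(sql_dump)
print(json_dump)
```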

API Reference

Methods

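upload(data, name, auto_export_mysql) - Upload a file or DataFrame in any supported format
ask(question, viz, table) - Natural language query
sql(query, viz) - Direct SQL
tables() - List tables
schema() - Show schema
peek(table, n) - Preview a table
save_to_mysql(...) - Export to MySQL (auto-creates the database)
save_to_postgres(...) - Export to PostgreSQL
export_db(path, format) - Export the database (sqlite, sql, json, excel)
load_from_db(path, api_key) - Load a SQLite database
connect_mysql(...) - Connect to MySQL
connect_postgres(...) - Connect to PostgreSQL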

Requirements

Python 3.8+, OpenAI API key

License

MIT

Made by Aditya Batta

Release history

0.6.2

Feb 6, 2026

0.6.1

Feb 5, 2026

This version

0.6.0

Feb 5, 2026

0.5.3

Nov 18, 2025

0.5.2

Nov 17, 2025

0.5.1

Nov 17, 2025

0.5.0

Nov 17, 2025

0.4.6

Nov 17, 2025

0.4.5

Nov 17, 2025

0.4.4

Nov 17, 2025

0.4.3

Nov 17, 2025

0.4.2

Nov 17, 2025

0.4.1

Nov 17, 2025

0.4.0

Nov 17, 2025

0.3.3

Nov 16, 2025

0.3.2

Nov 14, 2025

0.3.1

Nov 14, 2025

0.3.0

Nov 14, 2025

0.2.3

Nov 14, 2025

0.2.1

Nov 14, 2025

0.2.0

Nov 14, 2025

0.1.4

Nov 13, 2025

0.1.3

Nov 13, 2025

0.1.2

Nov 13, 2025

0.1.0

Nov 13, 2025

Download files

Source Distribution

querysutra-0.6.0.tar.gz (44.4 kB)

Uploaded Feb 5, 2026 Source

Built Distribution

querysutra-0.6.0-py3-none-any.whl (46.5 kB)

Uploaded Feb 5, 2026 Python 3

File details

Details for the file querysutra-0.6.0.tar.gz.

File metadata

Download URL: querysutra-0.6.0.tar.gz
Upload date: Feb 5, 2026
Size: 44.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for querysutra-0.6.0.tar.gz
Algorithm	Hash digest
SHA256	`68229133e9f1c4067f2f5db8f637c07613673074d99053c17d3c96fa1c198b52`
MD5	`f03a066828d3befd9319d75e481948ea`
BLAKE2b-256	`f374248b71e75be1226823b85f21ebedd082e4c9ed65488c983c4b42a9615625`

File details

Details for the file querysutra-0.6.0-py3-none-any.whl.

File metadata

Download URL: querysutra-0.6.0-py3-none-any.whl
Upload date: Feb 5, 2026
Size: 46.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for querysutra-0.6.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a51d8177f46fad2b22f9adb06fc1195cc2a09c2720645a186ad1cec373bf6676`
MD5	`6265bdd5a7f69d39de03255e2cd70467`
BLAKE2b-256	`f0c13a8ece2f1a2b0b63d8de6ca2d3f49e8cc08bd83e65d6151f2ca50828d488`
