Natural Language Model Database - Query databases using natural language

NLMDB: Natural Language & MCP-powered Database

Query your databases in natural language using a Model Context Protocol (MCP) approach. NLMDB provides a simple API for querying databases with either OpenAI or Hugging Face models.

✨ Features

  • 💬 Query databases using natural language
  • 🔄 Support for both OpenAI and Hugging Face models
  • 🔒 Enhanced privacy options with local Hugging Face models
  • 📊 Automatic schema extraction
  • 📈 Generate visualizations from your data
  • 📝 Multiple output formats: explanatory text, raw data, or visualizations
  • 🧩 Simple, intuitive API

🚀 Installation

pip install nlmdb

🏁 Quick Start

Natural Language Explanations Mode

from nlmdb import dbagent

# Call the agent with your API key, database path, and question
response = dbagent(
    api_key="your-openai-api-key",
    db_path="path/to/your/database.db",
    query="What tables are in the database and what columns do they have?"
)

print(response["output"])
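
The examples here pass the API key as a string literal. To keep the key out of source control, it can be read from an environment variable instead. The sketch below is a minimal illustration; the helper name load_api_key and the variable name OPENAI_API_KEY are assumptions for this example and are not part of NLMDB.

```python
import os


def load_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the key stored in `env_var`, or raise if it is unset or empty.

    Hypothetical helper for illustration; not part of the nlmdb package.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key
```

The result can then be passed wherever the examples use a literal, for example dbagent(api_key=load_api_key(), ...).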

SQL Agent Mode

Get direct results without explanations - perfect for data analysis workflows:

from nlmdb import sql_agent
import pandas as pd

# Get results as a pandas DataFrame
df = sql_agent(
    api_key="your-openai-api-key",
    db_path="path/to/your/database.db",
    query="List all customers who made purchases over $1000",
    return_type="dataframe"  # Options: "dataframe", "dict", or "json"
)

# Now you can directly work with the data
print(df.head())

Visualization Agent Mode (New!)

Generate interactive visualizations directly from your database with natural language:

from nlmdb import viz_agent

# Create a visualization
fig = viz_agent(
    api_key="your-openai-api-key",
    db_path="path/to/your/database.db",
    query="Show me a bar chart of sales by product category"
)

# Display the interactive plot
fig.show()

# Save the plot to HTML
fig.write_html("sales_by_category.html")

Using Hugging Face Models

from nlmdb import dbagent_private

# Call the agent with your Hugging Face token and model repository name
response = dbagent_private(
    hf_config=("your-huggingface-token", "model-repo-name"),
    db_path="path/to/your/database.db",
    query="What tables are in the database and what columns do they have?"
)

print(response["output"])

🔒 Privacy and Data Security

NLMDB offers enhanced privacy options through its support for Hugging Face models:

Enhanced Privacy with Hugging Face Models

When using dbagent_private, sql_agent_private, or viz_agent_private with use_local=True, all processing happens locally on your machine, ensuring your database schema and query data never leave your environment:

response = dbagent_private(
    hf_config=("your-huggingface-token", "model-repo-name"),
    db_path="path/to/your/database.db",
    query="What tables are in the database?",
    use_local=True  # Ensures all processing happens locally
)

Data Security Considerations

  • OpenAI Integration: When using dbagent with OpenAI models, your database schema and queries are sent to OpenAI's API. Only schema information is shared, not the actual data, but table and column names can themselves be sensitive, so consider the privacy implications.

  • Hugging Face Cloud API: Using dbagent_private without use_local=True sends queries to Hugging Face's Inference API.

  • Local Processing: For maximum privacy, use dbagent_private with use_local=True to keep all processing on your machine.

  • No Data Storage: NLMDB does not store or log your database contents, queries, or responses.
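
To judge what "schema information" means for your own database before sending it to a hosted model, it can help to list the table and column names yourself. The sketch below uses only Python's built-in sqlite3 module. It is not NLMDB's extraction code, and the function name describe_schema is an assumption made for this example.

```python
import sqlite3


def describe_schema(db_path: str) -> dict:
    """Map each user table to a list of (column name, declared type) pairs.

    Illustrative only; NLMDB's own schema extraction may differ.
    """
    conn = sqlite3.connect(db_path)
    try:
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        )]
        schema = {}
        for table in tables:
            # Quote the identifier so unusual table names are handled safely.
            quoted = '"' + table.replace('"', '""') + '"'
            # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
            columns = conn.execute(f"PRAGMA table_info({quoted})").fetchall()
            schema[table] = [(col[1], col[2]) for col in columns]
        return schema
    finally:
        conn.close()
```

Calling describe_schema("path/to/your/database.db") returns a plain dictionary that you can review for sensitive names.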

🔄 Model Comparison

| Feature | OpenAI Models (dbagent / sql_agent / viz_agent) | Hugging Face Models (dbagent_private / sql_agent_private / viz_agent_private) |
|---|---|---|
| SQL Generation Quality | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| Privacy | ⭐⭐ | ⭐⭐⭐⭐⭐ (with use_local=True) |
| Cost | 💰💰💰 | 💰 (self-hosted) / 💰💰 (HF API) |
| Offline Usage | ❌ | ✅ (with use_local=True) |
| Setup Complexity | Simple | Moderate |
| Resource Requirements | Minimal (cloud-based) | High (for local models) |
| Speed | Fast | Varies (depends on hardware) |
| Customizability | Limited | Extensive |

🧩 Advanced Usage

Visualization Agent with Different Output Formats

from nlmdb import viz_agent

# Get HTML output for embedding in web applications
html_output = viz_agent(
    api_key="your-openai-api-key",
    db_path="path/to/your/database.db",
    query="Create a pie chart showing the distribution of sales by region",
    return_fig=False,
    fig_format="html"
)

# Save to an HTML file (explicit UTF-8 avoids platform-dependent encoding errors)
with open("sales_distribution.html", "w", encoding="utf-8") as f:
    f.write(html_output)

# Get JSON specification for further customization
json_spec = viz_agent(
    api_key="your-openai-api-key",
    db_path="path/to/your/database.db",
    query="Show me a scatter plot of product price vs. sales volume",
    return_fig=False,
    fig_format="json"
)
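
One way to customize the JSON output is to edit it with the standard json module. The sketch below assumes the returned string is a standard Plotly figure specification with top-level "data" and "layout" keys. That is an assumption about viz_agent's output rather than something documented here, so a stand-in specification is used and the helper name retitle_figure is hypothetical.

```python
import json


def retitle_figure(json_spec: str, title: str) -> str:
    """Return a copy of a Plotly-style JSON spec with its layout title replaced.

    Assumes top-level "data" and "layout" keys; hypothetical helper.
    """
    spec = json.loads(json_spec)
    layout = spec.setdefault("layout", {})
    layout["title"] = {"text": title}
    return json.dumps(spec)


# Stand-in for the string returned with fig_format="json" (assumed shape).
example_spec = json.dumps({
    "data": [{"type": "scatter", "mode": "markers", "x": [1, 2], "y": [3, 4]}],
    "layout": {},
})

updated_spec = retitle_figure(example_spec, "Price vs. sales volume")
```

If Plotly is installed, a string of this shape can be turned back into a figure with plotly.io.from_json.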

SQL Agent with Different Return Types

from nlmdb import sql_agent, sql_agent_private

# Get results as a dictionary
result_dict = sql_agent(
    api_key="your-openai-api-key",
    db_path="path/to/your/database.db",
    query="Find the total sales by product category",
    return_type="dict"
)

# Get results as JSON
json_result = sql_agent(
    api_key="your-openai-api-key",
    db_path="path/to/your/database.db",
    query="Show me monthly sales trends",
    return_type="json"
)

# SQL Agent with Hugging Face for privacy
df = sql_agent_private(
    hf_config=("your-huggingface-token", "mistralai/Mixtral-8x7B-Instruct-v0.1"),
    db_path="path/to/your/database.db",
    query="List customers in California",
    return_type="dataframe",
    use_local=True  # For local processing
)

Customizing Model Parameters

from nlmdb import dbagent_private

# Generation settings passed through to the Hugging Face model
model_kwargs = {
    "temperature": 0.2,
    "max_new_tokens": 1024,
    "repetition_penalty": 1.1
}

response = dbagent_private(
    hf_config=("your-huggingface-token", "mistralai/Mixtral-8x7B-Instruct-v0.1"),
    db_path="path/to/your/database.db",
    query="Summarize the sales data for the last quarter",
    model_kwargs=model_kwargs
)

🔍 Choosing the Right Mode & Model

Modes

| Mode | Functions | Best For | Output |
|---|---|---|---|
| Explanatory | dbagent, dbagent_private | Understanding data context | Natural language explanations with insights |
| SQL Agent | sql_agent, sql_agent_private | Data analysis, integration | Raw data as DataFrame, dict, or JSON |
| Visualization | viz_agent, viz_agent_private | Data visualization, reporting | Interactive Plotly visualizations |

Recommended Hugging Face Models

| Model | Performance | Resource Usage | Best For |
|---|---|---|---|
| mistralai/Mixtral-8x7B-Instruct-v0.1 | ⭐⭐⭐⭐⭐ | 🖥️🖥️🖥️🖥️ | Best overall SQL generation |
| meta-llama/Llama-2-7b-chat-hf | ⭐⭐⭐⭐ | 🖥️🖥️🖥️ | Balance of performance and resources |
| Qwen/Qwen2-7B-Instruct | ⭐⭐⭐ | 🖥️🖥️ | Efficient for simpler queries |

📊 Supported Databases

Currently, NLMDB supports:

  • SQLite ✅
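
If you do not have a SQLite file at hand, a small one can be created with Python's built-in sqlite3 module and used as db_path. The sketch below is only an illustration: the table names, columns, and rows are invented for this example and are not a schema that NLMDB requires.

```python
import sqlite3


def create_sample_db(db_path: str) -> None:
    """Create a tiny customers/purchases database for experimentation.

    The schema and rows are invented for illustration.
    """
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS customers ("
                "id INTEGER PRIMARY KEY, name TEXT NOT NULL, state TEXT)"
            )
            conn.execute(
                "CREATE TABLE IF NOT EXISTS purchases ("
                "id INTEGER PRIMARY KEY, "
                "customer_id INTEGER REFERENCES customers(id), "
                "amount REAL NOT NULL)"
            )
            # INSERT OR REPLACE keeps the function safe to run more than once.
            conn.executemany(
                "INSERT OR REPLACE INTO customers (id, name, state) VALUES (?, ?, ?)",
                [(1, "Ada", "California"), (2, "Grace", "New York")],
            )
            conn.executemany(
                "INSERT OR REPLACE INTO purchases (id, customer_id, amount) VALUES (?, ?, ?)",
                [(1, 1, 1500.0), (2, 2, 250.0)],
            )
    finally:
        conn.close()
```

After create_sample_db("sample.db"), the file can be passed as db_path="sample.db" in any of the examples.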

Future releases will add support for:

  • PostgreSQL 🔜
  • MySQL 🔜
  • Microsoft SQL Server 🔜

⚙️ Requirements

  • Python 3.8+
  • openai>=1.0.0
  • langchain>=0.1.0
  • langchain-core>=0.1.0
  • langchain-community>=0.0.0
  • langchain-huggingface>=0.0.1 (for Hugging Face integration)
  • pandas>=1.0.0 (for DataFrame return type in SQL agent mode)
  • plotly>=5.0.0 (for visualization agent mode)

📜 License

MIT

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

🙏 Acknowledgements

This library is built on top of:

  • LangChain
  • OpenAI
  • Hugging Face
  • SQLite
  • pandas
  • Plotly
