Natural Language Model Database - Query databases using natural language

NLMDB: Natural Language & MCP-powered Database

NLMDB is a Python library for querying databases in natural language, using an approach based on the Model Context Protocol (MCP). It provides a simple API backed by either OpenAI or Hugging Face models.

Features

  • Query databases using natural language
  • Support for both OpenAI and Hugging Face models
  • Enhanced privacy options with local Hugging Face models
  • Automatic schema extraction
  • Clean, professional responses
  • Simple, intuitive API

Installation

pip install nlmdb

Quick Start

Using OpenAI

from nlmdb import dbagent

# Run a natural-language query using your OpenAI API key and database path
response = dbagent(
    api_key="your-openai-api-key",
    db_path="path/to/your/database.db",
    query="What tables are in the database and what columns do they have?"
)

print(response["output"])
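
To try the call above without a real database, a small SQLite file can be created with Python's standard sqlite3 module. This is only a sketch: the helper name create_sample_db, the file name sample.db, and the customers and orders tables are hypothetical and not part of NLMDB.

```python
import sqlite3


def create_sample_db(path="sample.db"):
    """Create a tiny SQLite database with two hypothetical tables."""
    conn = sqlite3.connect(path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS customers ("
                "id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
            )
            conn.execute(
                "CREATE TABLE IF NOT EXISTS orders ("
                "id INTEGER PRIMARY KEY, "
                "customer_id INTEGER REFERENCES customers(id), "
                "total REAL)"
            )
            # Clear old rows so repeated runs give the same contents
            conn.execute("DELETE FROM orders")
            conn.execute("DELETE FROM customers")
            conn.executemany(
                "INSERT INTO customers (id, name) VALUES (?, ?)",
                [(1, "Ada"), (2, "Grace")],
            )
            conn.executemany(
                "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
                [(1, 19.99), (1, 5.00), (2, 42.50)],
            )
    finally:
        conn.close()
    return path
```

The returned path can then be passed as db_path in the Quick Start examples.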

Using Hugging Face

from nlmdb import dbagent_private

# Run a natural-language query using your Hugging Face token and model name
response = dbagent_private(
    hf_config=("your-huggingface-token", "model-repo-name"),
    db_path="path/to/your/database.db",
    query="What tables are in the database and what columns do they have?"
)

print(response["output"])
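
Rather than hardcoding the token in source files, the hf_config tuple can be built from an environment variable. The sketch below assumes the token is stored in HF_TOKEN; the helper name hf_config_from_env is hypothetical and not part of NLMDB.

```python
import os


def hf_config_from_env(model_repo, env_var="HF_TOKEN"):
    """Build the (token, model_repo) tuple from an environment variable."""
    token = os.environ.get(env_var)
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return (token, model_repo)
```

The result can be passed directly, for example hf_config=hf_config_from_env("model-repo-name").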

Privacy and Data Security

NLMDB offers enhanced privacy options through its support for Hugging Face models.

Enhanced Privacy with Hugging Face Models

When you call dbagent_private with use_local=True, the model is downloaded and then run on your machine, so your database schema and queries never leave your environment:

response = dbagent_private(
    hf_config=("your-huggingface-token", "model-repo-name"),
    db_path="path/to/your/database.db",
    query="What tables are in the database?",
    use_local=True  # Ensures all processing happens locally
)

Data Security Considerations

  • OpenAI Integration: When using dbagent with OpenAI models, your database schema and queries are sent to OpenAI's API. The actual data is not shared, but schema and query text can themselves be sensitive, so consider the privacy implications.

  • Hugging Face Cloud API: Using dbagent_private without use_local=True sends queries to Hugging Face's Inference API.

  • Local Processing: For maximum privacy, use dbagent_private with use_local=True to keep all processing on your machine.

  • No Data Storage: NLMDB does not store or log your database contents, queries, or responses.
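
Before choosing a hosted model, it can help to look at the schema of your own database and judge how sensitive the table and column names are. The sketch below uses only the standard sqlite3 module; the helper name describe_schema is hypothetical, and this is not necessarily the format or content that NLMDB itself extracts or sends.

```python
import sqlite3


def describe_schema(db_path):
    """Return {table: [(column_name, declared_type), ...]} for a SQLite file."""
    conn = sqlite3.connect(db_path)
    try:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
                "ORDER BY name"
            )
        ]
        schema = {}
        for table in tables:
            # Quote the identifier, since PRAGMA arguments cannot be bound
            quoted = '"' + table.replace('"', '""') + '"'
            columns = conn.execute(f"PRAGMA table_info({quoted})").fetchall()
            # Each row is (cid, name, type, notnull, dflt_value, pk)
            schema[table] = [(col[1], col[2]) for col in columns]
        return schema
    finally:
        conn.close()
```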

Advanced Usage

Running with Verbose Output

You can enable verbose output to see the SQL queries being generated and executed:

response = dbagent(
    api_key="your-openai-api-key",
    db_path="path/to/your/database.db",
    query="How many customers do we have?",
    verbose=True
)

Using Local Hugging Face Models

For better performance or privacy, or to work offline, you can run Hugging Face models locally:

response = dbagent_private(
    hf_config=("your-huggingface-token", "model-repo-name"),
    db_path="path/to/your/database.db",
    query="What tables are in the database?",
    use_local=True  # This will download and run the model locally
)

Customizing Model Parameters

You can customize the behavior of the language model by passing additional parameters:

model_kwargs = {
    "temperature": 0.2,
    "max_new_tokens": 1024,
    "repetition_penalty": 1.1
}

response = dbagent_private(
    hf_config=("your-huggingface-token", "mistralai/Mixtral-8x7B-Instruct-v0.1"),
    db_path="path/to/your/database.db",
    query="Summarize the sales data for the last quarter",
    model_kwargs=model_kwargs
)
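
If most queries share the same settings, one way to avoid repeating the dictionary is to keep a set of defaults and override individual values per call. This is a minimal sketch; DEFAULT_MODEL_KWARGS and with_overrides are hypothetical names, and the default values are simply the ones from the example above.

```python
DEFAULT_MODEL_KWARGS = {
    "temperature": 0.2,
    "max_new_tokens": 1024,
    "repetition_penalty": 1.1,
}


def with_overrides(**overrides):
    """Return a copy of the defaults with the given values replaced."""
    merged = dict(DEFAULT_MODEL_KWARGS)  # copy, so the defaults stay unchanged
    merged.update(overrides)
    return merged
```

For example, model_kwargs=with_overrides(temperature=0.0) keeps the other two defaults and changes only the temperature.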

Choosing the Right Model

OpenAI Models (dbagent)

  • Pros: Higher accuracy, better SQL generation
  • Cons: Requires internet connection, sends schema information to OpenAI

Hugging Face Models (dbagent_private)

  • Pros:
    • Enhanced privacy when run locally
    • Works offline when using local models
    • Open-source options available
  • Cons:
    • May require significant local resources for larger models
    • Generally less accurate SQL generation than OpenAI models

Recommended Hugging Face Models

For optimal results with dbagent_private, we recommend:

  • mistralai/Mixtral-8x7B-Instruct-v0.1 (Best overall performance)
  • meta-llama/Llama-2-7b-chat-hf (Good balance of performance and resource usage)
  • Qwen/Qwen2-7B-Instruct (Efficient for simpler queries)

Supported Databases

Currently, NLMDB supports:

  • SQLite
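
Since only SQLite is supported for now, it may be worth checking that a file really is a SQLite database before passing it as db_path. SQLite database files begin with the 16-byte header string "SQLite format 3" followed by a null byte. The helper name looks_like_sqlite below is hypothetical and not part of NLMDB.

```python
SQLITE_MAGIC = b"SQLite format 3\x00"  # 16-byte header of a SQLite 3 file


def looks_like_sqlite(path):
    """Return True if the file at `path` starts with the SQLite 3 header."""
    try:
        with open(path, "rb") as f:
            return f.read(len(SQLITE_MAGIC)) == SQLITE_MAGIC
    except OSError:
        # Missing or unreadable file
        return False
```

Note that a zero-byte file, which SQLite would treat as a new empty database, fails this check.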

Future releases will add support for:

  • PostgreSQL
  • MySQL
  • Microsoft SQL Server

Requirements

  • Python 3.8+
  • openai>=1.0.0
  • langchain>=0.1.0
  • langchain-core>=0.1.0
  • langchain-community>=0.0.0
  • langchain-huggingface>=0.0.1 (for Hugging Face integration)
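
To see which of these requirements are already present in an environment, the standard importlib.metadata module can report installed versions. This is a sketch using only the standard library; the names REQUIRED, installed_versions, and python_ok are hypothetical, and the check does not compare versions against the minimums listed above.

```python
import sys
from importlib import metadata

REQUIRED = [
    "openai",
    "langchain",
    "langchain-core",
    "langchain-community",
    "langchain-huggingface",
]


def installed_versions(packages=REQUIRED):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def python_ok(minimum=(3, 8)):
    """Return True if the running interpreter meets the minimum version."""
    return sys.version_info >= minimum
```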

License

MIT

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Acknowledgements

This library is built on top of:

  • LangChain
  • OpenAI API
  • Hugging Face Inference API
  • SQLite
