Skip to main content

Natural Language Database Queries with AI-Powered Test Data Generation

Project description

QueryNL CLI - Natural Language Database Queries from the Terminal

A powerful command-line interface for executing database queries using natural language. QueryNL CLI provides terminal-native access to database querying, connection management, and query history tracking.

๐ŸŽฏ MVP Features (Version 0.1.0)

This is the Minimum Viable Product implementation with core functionality:

โœ… Implemented Features

  • Connection Management (User Story 2 - P1)

    • Add/list/test/use/remove database connections
    • Secure credential storage via OS keychain
    • Support for PostgreSQL, MySQL, SQLite, MongoDB
    • SSH tunnel configuration
    • Environment variable support for CI/CD
  • Natural Language Queries (User Story 1 - P1)

    • Execute database queries using natural language
    • ๐Ÿ†• Real LLM Integration (OpenAI GPT-4 & Anthropic Claude)
    • SQL generation with syntax highlighting preview
    • Confirmation prompts for destructive operations
    • Multiple output formats (table, JSON, CSV, markdown)
    • Query history tracking
    • Scriptable commands with proper exit codes
    • Automatic fallback to pattern matching if LLM unavailable

๐Ÿ†• LLM Integration (NEW!)

QueryNL now supports real AI-powered query generation:

  • Providers: OpenAI GPT-4, Anthropic Claude 3.5 Sonnet
  • Secure: API keys stored in system keychain
  • Smart: Understands database schema and context
  • Safe: Confidence scoring and destructive operation detection
  • Fallback: Works without API keys using pattern matching

Quick Setup:

querynl config llm --provider openai --test
# Enter your OpenAI API key when prompted

See LLM_SETUP.md for detailed configuration guide.

๐ŸŽ‰ Test Data Generation (NEW!)

Feature Branch: 005-add-test-data

Automatically populate your database schemas with realistic test data using natural language!

  • Natural Language: "add sample data", "add 100 users", "add e-commerce product data"
  • Smart Dependencies: Automatically handles foreign key relationships
  • Constraint Compliance: Respects UNIQUE, NOT NULL, and CHECK constraints
  • Multi-Database: Works with MySQL, PostgreSQL, and SQLite
  • Realistic Data: Uses Faker library for names, emails, addresses, dates, etc.
  • Domain Aware: Generate domain-specific data (e-commerce, blog, medical, etc.)
  • Progress Tracking: Real-time progress bars with speed and ETA
  • Error Recovery: Graceful handling of constraint violations

Quick Start:

# Create a schema
querynl> \schema design
Schema Designer> blog with users, posts, and comments
Schema Designer> finalize

# Generate test data
querynl> add sample data

# Or specify quantities
querynl> add 100 users and 500 posts

# Or with domain context
querynl> add e-commerce product data

See specs/005-add-test-data/quickstart.md for full documentation.

๐Ÿšง Coming in Future Releases

  • Migration generation (Phase 8)
  • Advanced output formatting (Phase 6)
  • Configuration management commands (Phase 10)

๐Ÿ“ฆ Installation

Prerequisites

  • Python 3.11 or higher
  • pip (Python package manager)

Install from Source

# Clone the repository
cd /Users/marcus/Developer/QueryNLAgent/QueryNL

# Create virtual environment
python3 -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Install in development mode
pip install -e .

Verify Installation

querynl --version
# Output: QueryNL CLI, version 0.1.0

๐Ÿš€ Quick Start

1. Add Your First Connection

# Interactive mode (recommended)
querynl connect add my-db

# You'll be prompted for:
# - Database type (postgresql, mysql, sqlite, mongodb)
# - Host (default: localhost)
# - Port (default: 5432 for PostgreSQL)
# - Database name
# - Username
# - Password (hidden input)

2. Test the Connection

querynl connect test my-db

# Output:
# Testing connection 'my-db'...
# โœ“ Connection successful
#   Database: PostgreSQL 15.2
#   Latency: 23ms

3. Execute Your First Query

querynl query exec "count all users"

# Output:
# Generated SQL:
# SELECT COUNT(*) AS count FROM users;
#
# Executing query...
# โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
# โ”‚ count โ”‚
# โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
# โ”‚  1523 โ”‚
# โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
#
# 1 row returned (45ms)

๐Ÿ“š Usage Examples

Connection Management

# List all connections
querynl connect list

# Add a SQLite connection
querynl connect add local-db --type sqlite --database ./app.db

# Add with SSH tunnel
querynl connect add remote-db --ssh-tunnel

# Set default connection
querynl connect use prod-db

# Remove a connection
querynl connect remove dev-db --confirm

Executing Queries

# Basic natural language query
querynl query exec "show all active users"

# JSON output (for piping to jq)
querynl query exec --format json "count orders by status"

# Save results to file
querynl query exec --output results.csv --format csv "all orders from last week"

# Non-interactive mode (skip confirmations)
querynl query exec -y "delete from temp_table"

# Explain mode (show SQL without executing)
querynl query exec --explain "update user status"

# Use specific connection
querynl query exec --connection staging-db "list tables"

# Limit results
querynl query exec --limit 100 "all users"

Query History

# View recent queries
querynl query history

# Show last 50 queries
querynl query history --limit 50

# Filter by connection
querynl query history --connection prod-db

# JSON output
querynl query history --format json

CI/CD Integration

# Use environment variables for credentials
export QUERYNL_CONNECTION_STRING="postgresql://user:pass@host:5432/db"

# Non-interactive query execution
querynl query exec -y --format json "SELECT 1" > /dev/null
echo $?  # Exit code: 0 = success, non-zero = error

# Pipe results to other tools
querynl query exec --format json "all active users" | jq '.rows | length'

๐Ÿ”ง Configuration

Config File Location

  • Linux: ~/.config/querynl/config.yaml
  • macOS: ~/Library/Application Support/querynl/config.yaml
  • Windows: %APPDATA%\querynl\config.yaml

Config File Structure

version: "1.0"
default_connection: prod-db
default_output_format: table
llm_provider: openai
enable_telemetry: false
repl_history_size: 1000
confirm_destructive: true
color_output: auto

connections:
  prod-db:
    database_type: postgresql
    host: prod.example.com
    port: 5432
    database_name: production
    username: app_user
    ssl_enabled: true
    created_at: '2025-10-14T10:30:00'

Note: Passwords are stored separately in OS keychain, never in the config file.

Environment Variables

Variable Description
QUERYNL_CONNECTION_STRING Database connection string (bypasses keychain)
QUERYNL_KEYRING_PASSWORD Master password for encrypted file keyring (headless servers)
QUERYNL_CONFIG Custom config file path
QUERYNL_NO_COLOR Disable color output (set to 1)

๐Ÿ” Security

Credential Storage

QueryNL CLI uses platform-native credential storage:

  • macOS: Keychain Access
  • Windows: Windows Credential Manager
  • Linux: Secret Service API (GNOME Keyring, KWallet)

For headless servers without keychain access:

# Option 1: Use encrypted file keyring
export QUERYNL_KEYRING_PASSWORD="your-master-password"
querynl connect add my-db

# Option 2: Use connection string directly
export QUERYNL_CONNECTION_STRING="postgresql://user:pass@host/db"
querynl query exec "your query"

Security Features

  • โœ… Credentials encrypted at rest via OS keychain
  • โœ… Passwords never displayed in logs or error messages
  • โœ… SSL/TLS enabled by default for network connections
  • โœ… Confirmation required for destructive operations (DELETE, DROP, etc.)
  • โœ… Input sanitization to prevent SQL injection

๐Ÿ—‚๏ธ Project Structure

QueryNL/
โ”œโ”€โ”€ src/cli/
โ”‚   โ”œโ”€โ”€ __init__.py           # Version and package info
โ”‚   โ”œโ”€โ”€ main.py               # CLI entry point
โ”‚   โ”œโ”€โ”€ config.py             # Configuration management
โ”‚   โ”œโ”€โ”€ credentials.py        # Keyring integration
โ”‚   โ”œโ”€โ”€ database.py           # Database driver wrapper
โ”‚   โ”œโ”€โ”€ errors.py             # Custom exceptions
โ”‚   โ”œโ”€โ”€ history.py            # Query history tracking
โ”‚   โ”œโ”€โ”€ llm.py                # LLM service integration
โ”‚   โ”œโ”€โ”€ logging.py            # Logging configuration
โ”‚   โ”œโ”€โ”€ models.py             # Data models (ConnectionProfile, etc.)
โ”‚   โ”œโ”€โ”€ commands/
โ”‚   โ”‚   โ”œโ”€โ”€ connect.py        # Connection management commands
โ”‚   โ”‚   โ””โ”€โ”€ query.py          # Query execution commands
โ”‚   โ””โ”€โ”€ formatting/
โ”‚       โ””โ”€โ”€ table.py          # Result formatting
โ”œโ”€โ”€ tests/cli/                # Tests (coming soon)
โ”œโ”€โ”€ requirements.txt          # Python dependencies
โ”œโ”€โ”€ setup.py                  # Package setup
โ””โ”€โ”€ README.md                 # This file

๐Ÿงช Testing

# Run tests (when implemented)
pytest

# Run specific test file
pytest tests/cli/test_connect.py

# Run with coverage
pytest --cov=src/cli --cov-report=html

๐Ÿ› Troubleshooting

"Connection refused"

# Test connectivity
querynl connect test my-db

# Check if database is running
# Verify host/port are correct
# Check firewall rules

"Permission denied to access keychain" (macOS)

  1. Open Keychain Access app
  2. Find "querynl" entry
  3. Right-click โ†’ Get Info โ†’ Access Control
  4. Allow querynl to access this item

"D-Bus session not available" (Linux headless)

# Use encrypted file keyring fallback
export QUERYNL_KEYRING_PASSWORD="your-master-password"
querynl connect add my-db

"No default connection set"

# Add a connection
querynl connect add my-db

# Or use --connection flag
querynl query exec --connection my-db "your query"

๐Ÿ“– Documentation

  • Feature Specification: specs/002-command-line-interface/spec.md
  • Implementation Plan: specs/002-command-line-interface/plan.md
  • Task Breakdown: specs/002-command-line-interface/tasks.md
  • Quick start Guide: specs/002-command-line-interface/quickstart.md
  • API Contracts: specs/002-command-line-interface/contracts/

๐Ÿ—บ๏ธ Roadmap

Phase 1-4: MVP โœ… (Current Release)

  • Project setup and infrastructure
  • Connection management
  • Natural language query execution
  • Query history tracking

Phase 5: REPL Mode ๐Ÿšง (Next Release)

  • Interactive shell with history
  • Tab completion
  • Conversation context
  • REPL-specific commands (\help, \tables, \schema)

Phase 6-8: Advanced Features ๐Ÿ”ฎ (Future)

  • Schema design from natural language
  • Migration generation
  • Output format enhancements

Phase 9-10: Production Ready ๐Ÿš€ (Future)

  • Comprehensive testing
  • CI/CD integration examples
  • Binary distribution (PyInstaller)
  • Package manager releases (Homebrew, apt)

๐Ÿค Contributing

This project follows the QueryNL Constitution:

  1. Security-First Design: All code must protect credentials and prevent SQL injection
  2. User Experience: Error messages must be actionable with clear next steps
  3. Transparency: Generated SQL must be shown before execution
  4. Multi-Database Parity: All supported databases must work equally well
  5. Fail-Safe Defaults: Destructive operations require confirmation

๐Ÿ“ License

Copyright ยฉ 2025 QueryNL Team

๐Ÿ™ Acknowledgments

  • Built with Click - Command-line framework
  • Rich - Beautiful terminal formatting
  • Keyring - Secure credential storage
  • Pydantic - Data validation

Version: 0.1.0 (MVP) Status: Production-ready for core features Last Updated: 2025-10-14

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

querynl_cli-0.2.0.tar.gz (340.9 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

querynl_cli-0.2.0-py3-none-any.whl (135.1 kB view details)

Uploaded Python 3

File details

Details for the file querynl_cli-0.2.0.tar.gz.

File metadata

  • Download URL: querynl_cli-0.2.0.tar.gz
  • Upload date:
  • Size: 340.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.18

File hashes

Hashes for querynl_cli-0.2.0.tar.gz
Algorithm Hash digest
SHA256 ab988b90ae369af95f2183c5c9cdff2b875f371fcb8512a7d57be9b9272db71d
MD5 0d6fce08d9b335d068ac96b438d7b9a4
BLAKE2b-256 722346fc560dc153190615d7cda83ca41d0a07adc0d0af1fcaff8b0e412eaf87

See more details on using hashes here.

File details

Details for the file querynl_cli-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: querynl_cli-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 135.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.18

File hashes

Hashes for querynl_cli-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 fb579d5f43f6b746c5695fd786d5cab037ba30b3583a298db3db03720aa91cc5
MD5 381e85f48e0843a21e91e73193f37b49
BLAKE2b-256 278864cd39fe4a7def4bb00b2e9105b44a4b7962ee917a333cdb3e10f5d5a42f

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page