A text-to-SQL pipeline using DeepSeek and ChromaDB

Lingua SQL

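A text-to-SQL pipeline tool built on DeepSeek and ChromaDB.

Introduction

Lingua SQL generates SQL queries automatically from natural-language questions. It supports custom training, database schema import, and training on example question/SQL pairs, and is suited to data analysis, question answering, and similar scenarios.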

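Key features

  • Supports LLM APIs such as DeepSeek and OpenAI
  • Uses ChromaDB as the vector database
  • Imports MySQL database schemas automatically
  • Supports training on custom DDL, example question/SQL pairs, and documentation
  • Offers both persistent and in-memory storage
  • Provides a rich set of training and inference interfaces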
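
The feature list suggests a retrieval-augmented flow: training items (DDL, question/SQL pairs, documentation) go into a vector store, and the items most similar to a new question are placed in the prompt sent to the LLM. The sketch below illustrates that idea only and is not Lingua SQL's implementation. `TrainingStore`, `build_prompt`, the `products` table, and the bag-of-words similarity are all hypothetical stand-ins for ChromaDB embeddings and the model call.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def _vectorize(text: str) -> Counter:
    """Toy embedding: lowercase token counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class TrainingStore:
    """Hypothetical in-memory store of training items (DDL, Q/SQL pairs, docs)."""

    def __init__(self) -> None:
        self._items: List[Tuple[str, Counter]] = []

    def add(self, text: str) -> None:
        self._items.append((text, _vectorize(text)))

    def most_similar(self, question: str, k: int = 2) -> List[str]:
        """Return the k stored items ranked closest to the question."""
        q = _vectorize(question)
        ranked = sorted(self._items, key=lambda item: _cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


def build_prompt(store: TrainingStore, question: str, k: int = 2) -> str:
    """Assemble a prompt from the retrieved context and the user's question."""
    context = "\n\n".join(store.most_similar(question, k))
    return f"Context:\n{context}\n\nQuestion: {question}\nSQL:"


store = TrainingStore()
store.add("CREATE TABLE customers (id INT PRIMARY KEY, name VARCHAR(100), "
          "email VARCHAR(100), created_at TIMESTAMP);")
store.add("CREATE TABLE products (id INT PRIMARY KEY, title VARCHAR(100), "
          "price DECIMAL(10, 2));")  # hypothetical second table, for contrast

prompt = build_prompt(store, "Which customers have a name and email on file?", k=1)
print(prompt)
```

With `k=1`, only the `customers` DDL reaches the prompt, because it is the only stored item that shares tokens with the question. A real embedding model would match on meaning rather than exact tokens.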

Installation

Python 3.8 or later is recommended.

pip install -r requirements.txt

Alternatively, manage dependencies with pyproject.toml.
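
To fail early on an older interpreter, a script can check the version before importing the package. This is a minimal sketch. `MIN_PYTHON` and `python_is_supported` are hypothetical names and are not part of Lingua SQL.

```python
import sys
from typing import Sequence, Tuple

MIN_PYTHON: Tuple[int, int] = (3, 8)  # recommended minimum from the installation notes


def python_is_supported(version_info: Sequence[int] = sys.version_info,
                        minimum: Tuple[int, int] = MIN_PYTHON) -> bool:
    """Return True if the (major, minor) version meets the recommended minimum."""
    return tuple(version_info[:2]) >= minimum


if not python_is_supported():
    raise SystemExit("Python 3.8 or later is recommended for lingua_sql")
```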

Quick start

1. Basic usage

import os
from dotenv import load_dotenv
from lingua_sql import LinguaSQL

# Load environment variables (e.g. DEEPSEEK_API_KEY) from a .env file
load_dotenv()

# Initialize lingua_sql
nl = LinguaSQL(config={
    "api_key": os.getenv("DEEPSEEK_API_KEY"),
    "model": "deepseek-chat",
    "client": "in-memory"  # or "persistent"
})

# Add DDL
nl.train(ddl="""
CREATE TABLE customers (
    id INT PRIMARY KEY,
    name VARCHAR(100),
    email VARCHAR(100),
    created_at TIMESTAMP
);
""")

# Add an example question/SQL pair
nl.train(
    question="List the 5 most recently registered customers",
    sql="SELECT name, email, created_at FROM customers ORDER BY created_at DESC LIMIT 5;"
)

# Generate SQL
question = "List the top 3 customers by order amount"
sql = nl.ask(question)
print(f"Question: {question}")
print(f"Generated SQL: {sql}")
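
The final question in the example asks about orders, although only the `customers` DDL was trained, so the generated SQL may reference tables that do not exist. One way to catch this before touching a real database is to compile the statement against an empty in-memory SQLite schema. This is a minimal sketch. `sql_error` is a hypothetical helper, not part of Lingua SQL, and because SQLite's dialect differs from MySQL's it can reject valid MySQL-specific syntax.

```python
import sqlite3
from typing import Optional

DDL = """
CREATE TABLE customers (
    id INT PRIMARY KEY,
    name VARCHAR(100),
    email VARCHAR(100),
    created_at TIMESTAMP
);
"""


def sql_error(ddl: str, sql: str) -> Optional[str]:
    """Return None if `sql` compiles against `ddl` in SQLite, else the error message."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(ddl)         # build the empty schema
        conn.execute(f"EXPLAIN {sql}")  # compile the statement without running it
        return None
    except sqlite3.Error as exc:
        return str(exc)
    finally:
        conn.close()


good = "SELECT name, email, created_at FROM customers ORDER BY created_at DESC LIMIT 5;"
bad = "SELECT customer_id FROM orders"  # 'orders' was never defined in the DDL

print(sql_error(DDL, good))
print(sql_error(DDL, bad))
```

The first call returns `None`. The second returns SQLite's error message for the missing `orders` table, which signals that more DDL should be passed to `nl.train` before asking the question.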

2. Automatic database schema import

from lingua_sql.database.mysql_connector import MySQLConnector

# Initialize the database connection
conn = MySQLConnector(
    host="localhost",
    user="root",
    password="your_password",
    database="your_db"
)
conn.connect()

# Fetch every table's schema and train on it
# (`nl` is the LinguaSQL instance created in the basic usage example)
for table in conn.get_all_tables():
    ddl = ...  # see examples/database_usage.py
    nl.train(ddl=ddl)
conn.disconnect()
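
The `ddl = ...` step is left to examples/database_usage.py. As an illustration of what such a step might look like, MySQL's `SHOW CREATE TABLE` statement returns a table's DDL and can be run through any DB-API cursor, for example one from PyMySQL or mysql-connector-python. This is a sketch under assumptions. `fetch_table_ddl` is a hypothetical helper, it is not necessarily what the example file does, and whether `MySQLConnector` exposes a cursor is not stated here.

```python
from typing import Any, Iterable, List


def fetch_table_ddl(cursor: Any, table: str) -> str:
    """Return the CREATE TABLE statement MySQL reports for `table`.

    Assumes a DB-API cursor that returns rows as tuples (not dicts).
    """
    # Backticks quote the identifier; embedded backticks are doubled.
    quoted = "`" + table.replace("`", "``") + "`"
    cursor.execute(f"SHOW CREATE TABLE {quoted}")
    row = cursor.fetchone()
    if row is None:
        raise LookupError(f"no DDL returned for table {table!r}")
    return row[1]  # result columns are (Table, Create Table)


def collect_ddl(cursor: Any, tables: Iterable[str]) -> List[str]:
    """Fetch the DDL for each table name, preserving input order."""
    return [fetch_table_ddl(cursor, name) for name in tables]
```

Each returned string could then be passed to `nl.train(ddl=...)` as in the loop above.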

For more usage, see the examples/ directory.

Contact

Author: 殷旭
Email: 2337302325@qq.com

License

MIT License
