A text-to-SQL pipeline using DeepSeek and ChromaDB
Lingua SQL
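A Text-to-SQL pipeline tool built on DeepSeek and ChromaDB.
Overview
Lingua SQL generates SQL queries from natural-language questions. It supports custom training, database schema import, and training on example question/SQL pairs, and it is suited to data analysis, question answering, and similar scenarios.
Features
- Supports LLM APIs such as DeepSeek and OpenAI
- Supports ChromaDB as the vector database
- Supports automatic import of MySQL database schemas
- Supports training on custom DDL, example question/SQL pairs, and documentation
- Supports both persistent and in-memory storage
- Provides interfaces for both training and inference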
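To make the training and inference features above more concrete, here is a minimal sketch of the retrieve-then-prompt pattern that text-to-SQL pipelines built on a vector store commonly follow. It is not Lingua SQL's implementation: the `ToyStore` class, the bag-of-words similarity (a toy stand-in for ChromaDB embeddings), and the prompt layout are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def _vectorize(text):
    """Bag-of-words term counts; a toy stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))


def _cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyStore:
    """Hypothetical in-memory store for training items (DDL, Q&A pairs, docs)."""

    def __init__(self):
        self.items = []  # list of (kind, text, vector)

    def add(self, kind, text):
        self.items.append((kind, text, _vectorize(text)))

    def top_k(self, question, k=2):
        """Return the k stored texts most similar to the question."""
        q = _vectorize(question)
        ranked = sorted(self.items, key=lambda it: _cosine(q, it[2]), reverse=True)
        return [(kind, text) for kind, text, _ in ranked[:k]]


def build_prompt(store, question, k=2):
    """Assemble the retrieved context and the question into one LLM prompt."""
    context = "\n".join(f"[{kind}] {text}" for kind, text in store.top_k(question, k))
    return f"Context:\n{context}\n\nQuestion: {question}\nSQL:"


store = ToyStore()
store.add("ddl", "CREATE TABLE customers (id INT PRIMARY KEY, name VARCHAR(100), email VARCHAR(100))")
store.add("ddl", "CREATE TABLE products (sku INT PRIMARY KEY, title VARCHAR(100), price DECIMAL)")
store.add("doc", "Revenue is reported in USD per fiscal quarter")

prompt = build_prompt(store, "List the name and email of all customers", k=1)
print(prompt)
```

Only the most relevant training item reaches the prompt, which is why adding accurate DDL and example pairs tends to matter more than adding many of them.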
Installation
Python 3.8 or later is recommended.
pip install -r requirements.txt
Alternatively, manage dependencies with pyproject.toml.
Quick start
1. Basic usage
import os
from dotenv import load_dotenv
from lingua_sql import LinguaSQL

# Load environment variables (for example from a .env file)
load_dotenv()

# Initialize lingua_sql
nl = LinguaSQL(config={
    "api_key": os.getenv("DEEPSEEK_API_KEY"),
    "model": "deepseek-chat",
    "client": "in-memory"  # or "persistent"
})

# Add DDL
nl.train(ddl="""
CREATE TABLE customers (
    id INT PRIMARY KEY,
    name VARCHAR(100),
    email VARCHAR(100),
    created_at TIMESTAMP
);
""")

# Add an example question/SQL pair
nl.train(
    question="List the 5 most recently registered customers",
    sql="SELECT name, email, created_at FROM customers ORDER BY created_at DESC LIMIT 5;"
)

# Generate SQL
question = "List the top 3 customers by order amount"
sql = nl.ask(question)
print(f"Question: {question}")
print(f"Generated SQL: {sql}")
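`os.getenv` returns `None` when a variable is unset, so a missing key would be passed into the config silently. A small guard makes that failure explicit. This is only a sketch: `build_config` and the `LINGUA_SQL_CLIENT` variable are hypothetical names, and only the `api_key`, `model`, and `client` keys come from the example above.

```python
import os

ALLOWED_CLIENTS = ("in-memory", "persistent")  # the two modes named in the example above


def build_config(env=None):
    """Build a config dict in the shape shown above, failing early on bad input.

    `env` defaults to os.environ; passing a plain dict makes this easy to test.
    """
    env = os.environ if env is None else env

    api_key = env.get("DEEPSEEK_API_KEY")
    if not api_key:
        raise RuntimeError("DEEPSEEK_API_KEY is not set; add it to your environment or .env file")

    # LINGUA_SQL_CLIENT is a hypothetical variable name chosen for this sketch.
    client = env.get("LINGUA_SQL_CLIENT", "in-memory")
    if client not in ALLOWED_CLIENTS:
        raise ValueError(f"client must be one of {ALLOWED_CLIENTS}, got {client!r}")

    return {"api_key": api_key, "model": "deepseek-chat", "client": client}


config = build_config({"DEEPSEEK_API_KEY": "sk-example", "LINGUA_SQL_CLIENT": "persistent"})
print(config["client"])
```

The resulting dict has the same shape as the one passed to `LinguaSQL(config=...)` in the basic usage example.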
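2. Automatic database schema import
from lingua_sql.database.mysql_connector import MySQLConnector

# Initialize the database connection
conn = MySQLConnector(
    host="localhost",
    user="root",
    password="your_password",
    database="your_db"
)
conn.connect()

try:
    # Fetch every table's schema and import it
    # (`nl` is the LinguaSQL instance created in the basic usage example)
    for table in conn.get_all_tables():
        ddl = ...  # see examples/database_usage.py
        nl.train(ddl=ddl)
finally:
    # Close the connection even if training fails
    conn.disconnect()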
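The loop above feeds one DDL statement per table into `nl.train`. How `ddl` is obtained through `MySQLConnector` is covered in examples/database_usage.py and is not reproduced here. As a self-contained illustration of the same pattern, the sketch below uses Python's built-in `sqlite3` instead of MySQL, relying on the fact that SQLite keeps each table's `CREATE TABLE` text in `sqlite_master`. The `collect_ddl` helper is a hypothetical name and not part of Lingua SQL.

```python
import sqlite3


def collect_ddl(conn):
    """Return {table_name: CREATE TABLE statement} for every user table."""
    rows = conn.execute(
        "SELECT name, sql FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    ).fetchall()
    return {name: sql for name, sql in rows}


conn = sqlite3.connect(":memory:")
try:
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)")
    ddl_by_table = collect_ddl(conn)
finally:
    conn.close()  # always release the connection, even if a query fails

for table, ddl in ddl_by_table.items():
    print(f"{table}: {ddl}")
    # With Lingua SQL, each statement would then be passed on as nl.train(ddl=ddl).
```

Collecting the statements into a dict first keeps the database connection short-lived and lets you review the DDL before it is used for training.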
For more usage examples, see the examples/ directory.
Contact
Author: 殷旭
Email: 2337302325@qq.com
License
MIT License
Download files
Source Distribution
lingua_sql-0.2.1.tar.gz (45.4 MB)
Built Distribution
File details
Details for the file lingua_sql-0.2.1.tar.gz.
File metadata
- Download URL: lingua_sql-0.2.1.tar.gz
- Upload date:
- Size: 45.4 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6c3bd0f20a7cdcf7fb355e4c4898ac7a2c9b73b408ed9fe116f43d8d517a0e62 |
| MD5 | 35e5f396f3dcea495fa1a12a67fb04eb |
| BLAKE2b-256 | 305d57cd9674e1e184c37177bea02f0d254608b936517e5b41b4a44c25917d10 |
File details
Details for the file lingua_sql-0.2.1-py3-none-any.whl.
File metadata
- Download URL: lingua_sql-0.2.1-py3-none-any.whl
- Upload date:
- Size: 2.4 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2d5ad80719da727ed8b714a6165cb93690519e8706ea220cd08aee0ddf8b70f2 |
| MD5 | 4788fabf594f99ed8022b9b758c46ab5 |
| BLAKE2b-256 | b18b6bb414fcc301a8c1d6b01d5da8d663141fafa97578cf6c18c0367be5424a |