Generate realistic, context-aware, AI-generated data and insert it into databases
RowGen: AI-Powered Fake Data Generator for SQL Databases
RowGen is a command-line tool that generates synthetic data and inserts it into your database. It uses AI to create realistic fake data based on your database schema.
Features
- AI-Powered Fake Data: Uses HuggingFace’s NLP models to generate realistic text, numbers, and structured data.
- SQL-Compatible: Executes INSERT statements directly or exports them to .sql files for easy database import.
- Customizable Schemas: Define table structures and let RowGen fill in the rest.
- Poetry-Managed: Clean dependency management and virtual environments.
Use Case
Suppose you have a database such as the following:
book_author
| Column | Type | Constraints |
|---|---|---|
| author_id | SERIAL | PRIMARY KEY |
| name | VARCHAR(100) | NOT NULL |
| email | VARCHAR(100) | UNIQUE |
bookstore
| Column | Type | Constraints |
|---|---|---|
| store_id | SERIAL | PRIMARY KEY |
| name | VARCHAR(100) | NOT NULL |
| location | VARCHAR(255) | NOT NULL |
book
| Column | Type | Constraints |
|---|---|---|
| book_id | SERIAL | PRIMARY KEY |
| title | VARCHAR(200) | NOT NULL |
| publication_date | DATE | |
| price | NUMERIC(10, 2) | CHECK (price >= 0) |
| author_id | INTEGER | |
Sample Generated Data
RowGen generates realistic sample data for this database, respecting foreign key relations and constraints:
book_author
| author_id | name | email |
|---|---|---|
| 1 | Margaret Atwood | margaret.atwood@example.com |
| 2 | Haruki Murakami | haruki.murakami@example.com |
| 3 | J.K. Rowling | jk.rowling@example.com |
| 4 | George Orwell | george.orwell@example.com |
| 5 | Agatha Christie | agatha.christie@example.com |
bookstore
| store_id | name | location |
|---|---|---|
| 1 | Book Haven | 123 Main St, New York, NY |
| 2 | Literary Corner | 456 Elm St, San Francisco, CA |
| 3 | Page Turner | 789 Oak St, Chicago, IL |
| 4 | Novel Nook | 101 Pine St, Seattle, WA |
| 5 | The Bookworm | 202 Maple St, Boston, MA |
book
| book_id | title | publication_date | price | author_id | store_id |
|---|---|---|---|---|---|
| 1 | The Handmaid's Tale | 1985-08-01 | 12.99 | 1 | 1 |
| 2 | Norwegian Wood | 1987-09-04 | 14.5 | 2 | 2 |
| 3 | Harry Potter and the Philosopher's Stone | 1997-06-26 | 10.99 | 3 | 3 |
| 4 | 1984 | 1949-06-08 | 9.99 | 4 | 4 |
| 5 | Murder on the Orient Express | 1934-01-01 | 11.25 | 5 | 5 |
| 6 | The Testaments | 2019-09-10 | 15.99 | 1 | 1 |
| 7 | Kafka on the Shore | 2002-09-12 | 13.75 | 2 | 2 |
| 8 | Harry Potter and the Chamber of Secrets | 1998-07-02 | 11.99 | 3 | 3 |
| 9 | Animal Farm | 1945-08-17 | 8.5 | 4 | 4 |
| 10 | And Then There Were None | 1939-11-06 | 10.99 | 5 | 5 |
Notes:
- Foreign keys (author_id, store_id) are linked correctly.
- Constraints such as NOT NULL and CHECK on price are respected.
- Email addresses are linked with mailto: for easy access.
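One way to sanity-check generated rows against constraints like these is to load them into a throwaway SQLite database. The sketch below is not part of RowGen. It assumes a SQLite adaptation of the example schema (SERIAL becomes INTEGER PRIMARY KEY), and the REFERENCES clause on book.author_id is an assumption, since the schema table above does not show it. The helper name `try_insert` is hypothetical.

```python
import sqlite3

# SQLite adaptation of the example schema. The REFERENCES clause is assumed.
SCHEMA = """
CREATE TABLE book_author (
    author_id INTEGER PRIMARY KEY,
    name      VARCHAR(100) NOT NULL,
    email     VARCHAR(100) UNIQUE
);
CREATE TABLE book (
    book_id          INTEGER PRIMARY KEY,
    title            VARCHAR(200) NOT NULL,
    publication_date DATE,
    price            NUMERIC(10, 2) CHECK (price >= 0),
    author_id        INTEGER REFERENCES book_author(author_id)
);
"""

INSERT_AUTHOR = "INSERT INTO book_author VALUES (?, ?, ?)"
INSERT_BOOK = "INSERT INTO book VALUES (?, ?, ?, ?, ?)"


def try_insert(conn, sql, params):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript(SCHEMA)

results = {
    "valid_author": try_insert(
        conn, INSERT_AUTHOR, (1, "Margaret Atwood", "margaret.atwood@example.com")
    ),
    "valid_book": try_insert(
        conn, INSERT_BOOK, (1, "The Handmaid's Tale", "1985-08-01", 12.99, 1)
    ),
    # Each of the rows below violates exactly one constraint.
    "negative_price": try_insert(
        conn, INSERT_BOOK, (2, "Norwegian Wood", "1987-09-04", -1.00, 1)
    ),
    "unknown_author": try_insert(
        conn, INSERT_BOOK, (3, "1984", "1949-06-08", 9.99, 99)
    ),
    "duplicate_email": try_insert(
        conn, INSERT_AUTHOR, (2, "M. Atwood", "margaret.atwood@example.com")
    ),
    "missing_title": try_insert(
        conn, INSERT_BOOK, (4, None, "1945-08-17", 8.50, 1)
    ),
}
print(results)
```

Rows that satisfy the constraints are accepted, while each deliberately broken row is rejected by the database itself, so the same check can be pointed at any batch of generated rows.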
Installation
pip install rowgen
Basic Usage
rowgen --user <username> --database <dbname> [options]
Using individual options
rowgen --db-type postgresql --host localhost --port 5432 --user myuser --database mydb
Using database URL
rowgen --db-url postgresql://user:password@localhost:5432/mydb
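The URL form packs the same information as the individual options into one string. As an illustration only (this is not RowGen's own parsing code), Python's standard library can split such a URL into its parts. The function name `url_to_options` and the dictionary keys are hypothetical.

```python
from urllib.parse import urlsplit


def url_to_options(db_url):
    """Split a database URL into fields that mirror the individual CLI options."""
    parts = urlsplit(db_url)
    return {
        "db_type": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,  # int, or None when the URL has no port
        "user": parts.username,
        "password": parts.password,
        "database": parts.path.lstrip("/"),
    }


opts = url_to_options("postgresql://user:password@localhost:5432/mydb")
print(opts)
```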
Operation Modes
Generate SQL File (Default)
Creates an inserts.sql file with the generated statements:
rowgen --user myuser --database mydb --rows 50
Execute Directly Against Database
rowgen --user myuser --database mydb --execute
Custom Output File
rowgen --user myuser --database mydb --output custom_inserts.sql
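Once a .sql file exists, it can be loaded with whatever client your database provides. For SQLite, a minimal sketch with Python's built-in sqlite3 module could look like the following. The helper `load_sql_file` is hypothetical, and the INSERT statements written to the file here are stand-ins: the exact format of RowGen's output is not shown in this README.

```python
import sqlite3
import tempfile
from pathlib import Path


def load_sql_file(conn, sql_path):
    """Execute every statement in a .sql file against an open connection."""
    script = Path(sql_path).read_text(encoding="utf-8")
    conn.executescript(script)
    conn.commit()


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for a file produced with --output custom_inserts.sql
    sql_file = Path(tmp) / "custom_inserts.sql"
    sql_file.write_text(
        "INSERT INTO bookstore (store_id, name, location) "
        "VALUES (1, 'Book Haven', '123 Main St, New York, NY');\n"
        "INSERT INTO bookstore (store_id, name, location) "
        "VALUES (2, 'Literary Corner', '456 Elm St, San Francisco, CA');\n",
        encoding="utf-8",
    )

    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE bookstore ("
        "store_id INTEGER PRIMARY KEY, "
        "name VARCHAR(100) NOT NULL, "
        "location VARCHAR(255) NOT NULL)"
    )
    load_sql_file(conn, sql_file)
    row_count = conn.execute("SELECT COUNT(*) FROM bookstore").fetchone()[0]

print(row_count)
```

Reviewing the file before loading it is the main advantage of this mode over --execute, since nothing touches the database until you run the script yourself.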
API Configuration
Provide API Key via Command Line
rowgen --user myuser --database mydb --apikey YOUR_HUGGINGFACE_API_KEY
Save API Key in Config
If no API key is provided, RowGen will prompt you to enter one and save it in ~/.config/rowgen/conf for future use.
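The lookup order described above can be sketched as follows. This is an assumption-heavy illustration rather than RowGen's implementation: it assumes the config file holds just the key as plain text and that a key passed on the command line takes precedence over a saved one. The function name `resolve_api_key` is hypothetical.

```python
import tempfile
from pathlib import Path

# Path named in this README; the file's internal format is an assumption.
CONFIG_PATH = Path.home() / ".config" / "rowgen" / "conf"


def resolve_api_key(cli_key, config_path=CONFIG_PATH, prompt=input):
    """Return the API key from the CLI, the saved config, or an interactive prompt."""
    if cli_key:
        return cli_key
    if config_path.is_file():
        saved = config_path.read_text(encoding="utf-8").strip()
        if saved:
            return saved
    # Nothing supplied or saved: ask once and remember the answer.
    key = prompt("HuggingFace API key: ").strip()
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(key + "\n", encoding="utf-8")
    return key


# Demonstration against a temporary directory instead of the real config path.
with tempfile.TemporaryDirectory() as tmp:
    conf = Path(tmp) / "rowgen" / "conf"
    first = resolve_api_key(None, conf, prompt=lambda _: "hf_example_key")
    second = resolve_api_key(None, conf, prompt=lambda _: "should-not-be-asked")
    override = resolve_api_key("hf_cli_key", conf)

print(first, second, override)
```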
Examples
Generate 100 rows for a PostgreSQL database and save to file:
rowgen --db-type postgresql --host db.example.com --user admin --database production --rows 100 --output prod_data.sql
Generate and immediately insert 25 rows into a MySQL database:
rowgen --db-type mysql --host localhost --user root --database test --rows 25 --execute
Use SQLite with direct execution:
rowgen --db-type sqlite --database /path/to/database.db --execute
Troubleshooting
- Connection Issues: Verify your database credentials and that the server is accessible.
- API Key Problems: Check that your HuggingFace API key is valid and has sufficient permissions.
- Permission Errors: Ensure you have write access to the output directory when saving to a file.
For more information, run:
rowgen --help