Zero-API browser-based AI dataset generation. No API keys needed — just a browser and a prompt.

aigen-cli

Zero-API, browser-based AI dataset generation.

Turn a web AI chat interface (Gemini, ChatGPT, Claude, or Perplexity) into a batch generation engine. No API keys. No per-call costs. Just your browser and a config file.

$20/mo subscription → no per-call charges (platform rate limits still apply)
vs
$0.001–$0.024 per API call → up to ~$5 for 200 items
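
The comparison above is simple arithmetic, and a short sketch makes it easy to redo for other volumes. It assumes one API call per generated item, which the figures above imply but do not state; the function names are illustrative and not part of aigen-cli.

```python
SUBSCRIPTION_USD_PER_MONTH = 20.0
API_USD_PER_CALL_LOW = 0.001
API_USD_PER_CALL_HIGH = 0.024


def api_cost(items: int, usd_per_call: float, calls_per_item: int = 1) -> float:
    """Total API cost, assuming a fixed number of calls per generated item."""
    return items * calls_per_item * usd_per_call


def break_even_items(subscription_usd: float, usd_per_call: float) -> float:
    """Items per month at which API spend equals the flat subscription."""
    return subscription_usd / usd_per_call


for items in (200, 1000):
    low = api_cost(items, API_USD_PER_CALL_LOW)
    high = api_cost(items, API_USD_PER_CALL_HIGH)
    print(f"{items} items: ${low:.2f} to ${high:.2f} via API")

# Monthly volume above which the subscription is cheaper, at the high price.
print(round(break_even_items(SUBSCRIPTION_USD_PER_MONTH, API_USD_PER_CALL_HIGH)))
```

At the high end of the price range the subscription pays for itself at somewhat over 800 items a month; at the low end the break-even point is far higher.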

Install

Via NPM (no Python setup needed)

npx aigen-cli auth gemini
npx aigen-cli run aigen.yaml

# Or install globally:
npm install -g aigen-cli
aigen auth gemini

The NPM wrapper auto-installs the Python package on first run.

Via pip

pip install aigen-cli

# With semantic dedup + quality scoring
# (the quotes stop shells such as zsh from treating the brackets as a glob):
pip install "aigen-cli[ml]"

# With Playwright backend:
pip install "aigen-cli[playwright]"

# Everything:
pip install "aigen-cli[all]"

Quick Start

1. Authenticate once

aigen auth gemini

A Chrome window opens. Sign in, then close the window; your session is saved in encrypted form.

2. Create a config

aigen init

An interactive wizard generates aigen.yaml.

3. Generate

aigen run aigen.yaml

Headless browsers start, and the generated items are written to output/dataset.json.
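
Once a run finishes, a quick sanity check on the output can be scripted. The sketch below assumes output/dataset.json holds a JSON array of objects whose keys match the schema fields in the config; that layout is an assumption, not something this README specifies, and summarize_dataset is a hypothetical helper rather than part of aigen-cli.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_dataset(path: Path, group_by: str = "difficulty") -> Counter:
    """Count records per value of `group_by` in a JSON-array dataset file."""
    records = json.loads(path.read_text(encoding="utf-8"))
    if not isinstance(records, list):
        raise ValueError(f"expected a JSON array in {path}, got {type(records).__name__}")
    # Records lacking the field are grouped under "<missing>" rather than dropped.
    return Counter(
        str(record.get(group_by, "<missing>"))
        for record in records
        if isinstance(record, dict)
    )


if __name__ == "__main__":
    dataset = Path("output/dataset.json")
    if dataset.exists():
        for value, count in summarize_dataset(dataset).most_common():
            print(f"{value}: {count}")
    else:
        print(f"{dataset} not found; run `aigen run aigen.yaml` first")
```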


Commands

Command                    Description
aigen auth <platform>      One-time login. Saves encrypted browser session. Platforms: gemini, chatgpt, claude, perplexity
aigen run <config.yaml>    Primary command. Auth-first headless generation from config
aigen generate             Legacy generation via an already-running Chrome debug session
aigen init                 Interactive wizard to create a generation config
aigen doctor               System health check (Chrome, Python, keyring, optional deps)
aigen status               Show generation stats for an output directory
aigen packs                List built-in domain config packs (medical, legal, ecommerce, code)
aigen push-hf              Push a completed dataset to HuggingFace Hub
aigen sessions             List, export, or import browser sessions
aigen schedule             Add/remove/run cron-scheduled generation jobs
aigen mcp                  Run as an MCP server for AI agent integration

Run aigen --help or aigen <command> --help for full options.
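
If you want to drive the CLI from a Python script, one option is to call it through subprocess. This is a minimal sketch that only uses the documented `aigen run <config.yaml>` form; build_run_command and run_generation are hypothetical helpers, and the meaning of the CLI's exit codes is not documented here, so the code simply returns whatever the process reports.

```python
import shutil
import subprocess
from typing import List, Optional


def build_run_command(config_path: str, executable: str = "aigen") -> List[str]:
    """Argument vector for `aigen run <config.yaml>`; a list avoids shell quoting."""
    return [executable, "run", config_path]


def run_generation(config_path: str) -> Optional[int]:
    """Run the CLI if it is on PATH and return its exit code, otherwise None."""
    if shutil.which("aigen") is None:
        return None
    completed = subprocess.run(build_run_command(config_path), check=False)
    return completed.returncode


print(build_run_command("aigen.yaml"))
```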


Config File

project: "My Dataset"
description: "Short description"
target: 100              # Total items to generate
batch_size: 5            # Items per browser request
agents: 2                # Parallel browser tabs
platforms:
  - gemini
  - chatgpt

schema:
  fields: [question_text, expected_answer, difficulty, marks]
  required: [question_text, expected_answer, difficulty]

topic_pool:
  - chapter: "Chapter 1"
    topic: "Algebra"
    question_type: short_answer
    difficulty: easy
    marks: 3

output:
  format: json           # json | csv | jsonl
  path: output
  filename: dataset.json

# Optional: push to HuggingFace after generation
huggingface:
  repo_id: "your-username/dataset-name"
  private: false

# Optional: quality + compliance
quality:
  enabled: true          # requires pip install aigen-cli[ml]
  mode: heuristic
  threshold: 6.0

compliance:
  pii_detection: true
  auto_redact: true
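
To get a feel for how target, batch_size, and agents interact, the sketch below estimates the number of browser requests a run needs. It assumes every request returns exactly batch_size usable items and that requests are spread evenly across tabs; real runs will likely need more requests once invalid or duplicate items are discarded, and nothing here describes how aigen-cli actually schedules work.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class BatchPlan:
    requests: int            # total browser requests needed
    requests_per_agent: int  # most requests any single tab has to handle
    last_batch_size: int     # items asked for in the final request


def plan_batches(target: int, batch_size: int, agents: int) -> BatchPlan:
    """Best-case request count for a run, under the assumptions stated above."""
    if min(target, batch_size, agents) < 1:
        raise ValueError("target, batch_size and agents must all be >= 1")
    requests = math.ceil(target / batch_size)
    last_batch_size = target - (requests - 1) * batch_size
    return BatchPlan(requests, math.ceil(requests / agents), last_batch_size)


# Values from the sample config: target 100, batch_size 5, agents 2.
print(plan_batches(target=100, batch_size=5, agents=2))
# BatchPlan(requests=20, requests_per_agent=10, last_batch_size=5)
```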

Domain Packs

Ready-made configs for common use cases:

aigen packs                          # list available packs
aigen generate --pack medical_mcq    # run a built-in pack

Built-in packs: medical_mcq, legal_qa, ecommerce, code_review


Architecture

┌─────────────────────────────────────────────────────────┐
│                      aigen-cli                           │
│                                                          │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐ │
│  │ Gemini   │  │ ChatGPT  │  │  Claude  │  │Perplexity│ │
│  │  Tab     │  │   Tab    │  │   Tab    │  │   Tab    │ │
│  └────┬─────┘  └────┬─────┘  └────┬─────┘  └────┬─────┘ │
│       └──────────────┴─────────────┴──────────────┘       │
│                          │                                │
│              ┌───────────▼───────────┐                    │
│              │    BrowserPool        │                    │
│              │  (auth-first agents)  │                    │
│              └───────────┬───────────┘                    │
│                          │                                │
│              ┌───────────▼───────────┐                    │
│              │  GenerationEngine     │                    │
│              └───────────┬───────────┘                    │
│                          │                                │
│    ┌─────────────────────┼─────────────────────┐         │
│    │                     │                     │         │
│ ┌──▼──┐  ┌───────────┐ ┌▼────────┐  ┌────────▼──┐      │
│ │Parse│  │  Validate  │ │ Dedup   │  │  Output   │      │
│ │JSON │  │  (schema)  │ │ Engine  │  │ Writers   │      │
│ └─────┘  └───────────┘ └─────────┘  └───────────┘      │
└─────────────────────────────────────────────────────────┘
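
The lower row of the diagram (parse, validate, dedup, output) can be sketched in a few functions. This is an illustration of the idea, not the project's implementation: the function names are hypothetical, the required fields are taken from the sample config, and exact-match hashing stands in for whatever the real dedup engine does (the ml extra mentions semantic dedup, which this does not attempt).

```python
import hashlib
import json
from typing import Iterable, List

REQUIRED = ("question_text", "expected_answer", "difficulty")  # from the sample config


def parse_items(raw: str) -> List[dict]:
    """Extract a JSON array from one raw response, tolerating surrounding text."""
    start, end = raw.find("["), raw.rfind("]")
    if start == -1 or end <= start:
        return []
    try:
        data = json.loads(raw[start:end + 1])
    except json.JSONDecodeError:
        return []
    return [item for item in data if isinstance(item, dict)]


def is_valid(item: dict, required: Iterable[str] = REQUIRED) -> bool:
    """Valid when every required field is present and not blank."""
    for name in required:
        value = item.get(name)
        if value is None or (isinstance(value, str) and not value.strip()):
            return False
    return True


def fingerprint(item: dict, key: str = "question_text") -> str:
    """Hash of the lowercased, whitespace-collapsed key field."""
    normalized = " ".join(str(item.get(key, "")).lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def run_pipeline(raw_responses: Iterable[str]) -> List[dict]:
    """Parse each response, drop invalid records, then drop exact duplicates."""
    seen = set()
    kept: List[dict] = []
    for raw in raw_responses:
        for item in parse_items(raw):
            if not is_valid(item):
                continue
            digest = fingerprint(item)
            if digest in seen:
                continue
            seen.add(digest)
            kept.append(item)
    return kept
```

Keeping each stage as a separate function mirrors the boxes in the diagram and makes it straightforward to swap the hashing step for a similarity-based one.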

Why This Exists

API costs scale linearly with usage. Generating 1,000 items via API can cost $20–50. The same models, used through their web interfaces, are included in a flat $20/month subscription.

aigen-cli automates the web interface so you get:

  • No per-call billing: throughput is limited by each platform's anti-bot rate limits, not by your budget
  • Zero API keys: just log in once like a normal user
  • Full transparency: every raw response is saved for audit
  • No vendor lock-in: swap platforms instantly via config

License

MIT
