DataForge

LLM data collection and synthetic fine-tuning dataset pipeline.

Installation

Via pip (any platform)

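pip install llm-web-crawler

The package is published on PyPI under the distribution name llm-web-crawler (files are named llm_web_crawler-0.1.0).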

From source

git clone https://github.com/ianktoo/data-forge.git
cd data-forge
pip install -e .

Standalone executables

Download pre-built executables for your platform from Releases:

  • Linux: dataforge-linux-x64
  • Windows: dataforge-windows-x64.exe
  • macOS: dataforge-macos-x64
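
As an illustration of choosing the right download, the sketch below maps the operating system reported by Python to the asset names listed above. The function name asset_for is hypothetical and not part of DataForge; only the three x64 builds from the list are covered.

```python
import platform
from typing import Optional

# Asset names copied from the list above; only x64 builds are listed there.
ASSETS = {
    "Linux": "dataforge-linux-x64",
    "Windows": "dataforge-windows-x64.exe",
    "Darwin": "dataforge-macos-x64",  # platform.system() reports macOS as "Darwin"
}


def asset_for(system: Optional[str] = None) -> str:
    """Return the release asset name for the given OS (default: the current one).

    Hypothetical helper for illustration; not part of the DataForge package.
    """
    system = system or platform.system()
    try:
        return ASSETS[system]
    except KeyError:
        raise ValueError(f"no pre-built executable listed for {system!r}") from None
```

Calling asset_for() with no argument uses the current machine; on any platform outside the list it raises ValueError rather than guessing a file name.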

Usage

dataforge

Development

Install development dependencies:

pip install -e ".[dev]"

Run tests:

pytest

Run linting:

ruff check src/ tests/

Publishing Releases

  1. Update version in pyproject.toml
  2. Commit changes
  3. Create a tag: git tag v0.1.0
  4. Push tag: git push origin v0.1.0

Pushing the tag triggers:

  • Automated builds for Windows, macOS, and Linux
  • Publishing to PyPI
  • Creation of a GitHub Release with executables

Download files

Source Distribution

llm_web_crawler-0.1.0.tar.gz (49.0 kB)

Built Distribution

llm_web_crawler-0.1.0-py3-none-any.whl (54.8 kB, Python 3)
File details

Details for the file llm_web_crawler-0.1.0.tar.gz.

File metadata

  • Download URL: llm_web_crawler-0.1.0.tar.gz
  • Size: 49.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for llm_web_crawler-0.1.0.tar.gz

Algorithm    Hash digest
SHA256       ddf0766df06bab8ece3f70e2a0254e4655c10559df4cebd1d00de6d487a0ad75
MD5          6ed62dc2d27dd475f08fc1388468bb1d
BLAKE2b-256  8b72d33d4006185116fce4cf0ce2de7c9d5f5e207632efa5302d712279afd49d
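
A downloaded file can be checked against the published digest before installing it. The sketch below uses only Python's standard hashlib module; the helper names sha256_of and verify are hypothetical, and the expected value is the SHA256 digest listed above for the source distribution.

```python
import hashlib

# SHA256 digest listed above for llm_web_crawler-0.1.0.tar.gz.
EXPECTED_SHA256 = "ddf0766df06bab8ece3f70e2a0254e4655c10559df4cebd1d00de6d487a0ad75"


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: str, expected_hex: str = EXPECTED_SHA256) -> bool:
    """Return True when the file's SHA256 digest equals the expected value."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For the wheel, pass its own SHA256 digest as expected_hex instead of relying on the default.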


Provenance

The following attestation bundles were made for llm_web_crawler-0.1.0.tar.gz:

Publisher: publish-pypi.yml on ianktoo/data-forge

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file llm_web_crawler-0.1.0-py3-none-any.whl.

File hashes

Hashes for llm_web_crawler-0.1.0-py3-none-any.whl

Algorithm    Hash digest
SHA256       d7e6f26f4d911465f57e2df3fca18d67936e1773109a40e163614d2f3737fbe3
MD5          797382c2d6b8a02a44f3b5032a0b9a60
BLAKE2b-256  bfe845152255b7eaab43d36d697ba61374ba101c1147f19cd00c911a9accb6bb


Provenance

The following attestation bundles were made for llm_web_crawler-0.1.0-py3-none-any.whl:

Publisher: publish-pypi.yml on ianktoo/data-forge

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
