
Generating Realistic Tabular Data using Large Language Models


Generation of Realistic Tabular data with pretrained Transformer-based language models

Our GReaT framework uses pretrained Transformer-based large language models to synthesize realistic tabular data. New samples can be generated in a few lines of code through a simple API.

GReaT Installation

The GReaT framework can be installed with pip:

pip install be-great
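To confirm that the installation succeeded, one option is to query the installed distribution metadata from Python. The sketch below uses only the standard library; the helper name `installed_version` is hypothetical and not part of GReaT.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name: str = "be-great"):
    """Return the installed version of a distribution, or None if it is absent.

    `installed_version` is an illustrative helper, not a GReaT function.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("be-great is not installed in this environment")
    else:
        print(f"be-great version: {found}")
```

Note that the distribution name on PyPI is `be-great`, while the importable module is `be_great`.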

GReaT Quickstart

In the example below, we show how the GReaT approach is used to generate synthetic tabular data for the California Housing dataset.

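from be_great import GReaT
from sklearn.datasets import fetch_california_housing

# Load the California Housing dataset as a pandas DataFrame
data = fetch_california_housing(as_frame=True).frame

# Fine-tune a pretrained distilgpt2 model on the table for 50 epochs
model = GReaT(llm='distilgpt2', epochs=50)
model.fit(data)

# Draw 100 synthetic rows from the fine-tuned model
synthetic_data = model.sample(n_samples=100)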
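A quick sanity check on generated samples is to compare per-column summary statistics against the real data. The following is a minimal sketch, assuming `model.sample` returns a pandas DataFrame with the same columns as the training data. The helper `compare_summary` is hypothetical (not part of GReaT), and the two small frames are stand-ins so the example runs without training a model.

```python
import pandas as pd


def compare_summary(real: pd.DataFrame, synthetic: pd.DataFrame) -> pd.DataFrame:
    """Side-by-side mean and standard deviation for shared numeric columns.

    Illustrative helper only; it is not provided by the be_great package.
    """
    shared = [c for c in real.columns if c in synthetic.columns]
    real_num = real[shared].select_dtypes(include="number")
    # Coerce synthetic values to numbers in case generation produced strings.
    syn_num = synthetic[list(real_num.columns)].apply(pd.to_numeric, errors="coerce")
    return pd.DataFrame({
        "real_mean": real_num.mean(),
        "synthetic_mean": syn_num.mean(),
        "real_std": real_num.std(),
        "synthetic_std": syn_num.std(),
    })


# Stand-in frames; in practice pass `data` and `synthetic_data` from the quickstart.
real = pd.DataFrame({"MedInc": [1.0, 2.0, 3.0], "HouseAge": [10.0, 20.0, 30.0]})
synthetic = pd.DataFrame({"MedInc": [2.0, 2.0, 2.0], "HouseAge": [10.0, 30.0, 50.0]})

report = compare_summary(real, synthetic)
print(report)
```

Matching means and standard deviations do not prove the synthetic data is realistic, since they ignore correlations between columns, but large gaps are a cheap early warning.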

GReaT Citation

If you use GReaT, please link or cite our work:

@article{
}

GReaT Acknowledgements

We sincerely thank the HuggingFace 🤗 framework.

Download files


Source Distribution

be_great-0.0.2.tar.gz (11.9 kB)

Uploaded Source

Built Distribution


be_great-0.0.2-py3-none-any.whl (12.6 kB)

Uploaded Python 3

File details

Details for the file be_great-0.0.2.tar.gz.

File metadata

  • Download URL: be_great-0.0.2.tar.gz
  • Upload date:
  • Size: 11.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.13

File hashes

Hashes for be_great-0.0.2.tar.gz

Algorithm     Hash digest
SHA256        e3c8e29711ce18ed86ff057bc59898c29abeb3d362e0436781366a400a01685f
MD5           7e5367a6e821742466cff814b1d99571
BLAKE2b-256   c8b60e5e4db181268840a9da3ac502c585ad2ad5fed931c5638288c26af10f8b
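The digests listed above can be used to check that a downloaded file is intact. Below is a minimal sketch using Python's standard `hashlib`. The helper names `sha256_of` and `matches_expected` are hypothetical, and the file path in the usage note assumes the archive sits in the current directory.

```python
import hashlib

# SHA256 digest listed above for be_great-0.0.2.tar.gz
EXPECTED_SHA256 = "e3c8e29711ce18ed86ff057bc59898c29abeb3d362e0436781366a400a01685f"


def sha256_of(path, chunk_size: int = 1 << 16) -> str:
    """Compute the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected: str = EXPECTED_SHA256) -> bool:
    """Return True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of(path) == expected.lower()
```

For example, `matches_expected("be_great-0.0.2.tar.gz")` should return `True` for an unmodified copy of the source distribution. To check the wheel, pass its own SHA256 digest as `expected`.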


File details

Details for the file be_great-0.0.2-py3-none-any.whl.

File metadata

  • Download URL: be_great-0.0.2-py3-none-any.whl
  • Upload date:
  • Size: 12.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.13

File hashes

Hashes for be_great-0.0.2-py3-none-any.whl

Algorithm     Hash digest
SHA256        a1c4a10321919cb63cc57885dfba19980c2a1400140eda4682580674602ae52e
MD5           9beb464e6a5826b32d082eaf4171a712
BLAKE2b-256   d9d8808193b6fa808ed0ade19f7dfe21a7560b46d0f69ce81d30c73142b137c7

