DBgen

---

Documentation: https://dbgen.modelyst.com

GitHub: https://github.com/modelyst/dbgen


Please note that this project is undergoing major rewrites, and installations are subject to breaking changes.


DBgen (Database Generator) is an open-source Python library for connecting raw data, scientific theories, and relational databases. The package was designed with the developer experience at its core. DBgen was initially developed by Modelyst.

What is DBgen?

DBgen was designed to support scientific data analysis with the following characteristics:

  1. Transparent

    • Because scientific efforts ought to be shareable and mutually understandable.
  2. Flexible

    • Because scientific theories are under continuous flux.
  3. Maintainable

    • Because the underlying scientific models one works with are complicated enough on their own, we can't afford to introduce any more complexity via our framework.

DBgen is an opinionated ETL tool. While many other ETL tools exist, they rarely provide what a scientific workflow needs. DBgen helps populate a single PostgreSQL database using a transparent, flexible, and maintainable data pipeline.
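To make the term ETL (extract, transform, load) concrete, the following is a minimal sketch of the pattern in plain Python. It does not use DBgen's API: the function names, the sample records, and the `measurement` table are all hypothetical, and the standard-library `sqlite3` module stands in for PostgreSQL so the example runs without a database server.

```python
import sqlite3

# Hypothetical raw measurements; in practice these would come from files or instruments.
RAW_ROWS = [
    {"sample": "A1", "temp_c": "21.5"},
    {"sample": "A2", "temp_c": "not recorded"},
    {"sample": "A3", "temp_c": "19.0"},
]


def extract():
    """Extract: yield raw records one at a time."""
    yield from RAW_ROWS


def transform(record):
    """Transform: convert Celsius text to Kelvin, or return None if unparseable."""
    try:
        celsius = float(record["temp_c"])
    except ValueError:
        return None
    return record["sample"], celsius + 273.15


def load(conn, rows):
    """Load: insert the transformed rows into a single table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS measurement (sample TEXT PRIMARY KEY, temp_k REAL)"
    )
    conn.executemany("INSERT INTO measurement VALUES (?, ?)", rows)
    conn.commit()


def run_pipeline(conn):
    """Run extract, transform, and load; return the number of rows stored."""
    rows = [row for row in map(transform, extract()) if row is not None]
    load(conn, rows)
    return len(rows)


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(run_pipeline(connection))
```

Even in this toy version, the table definition and the insert statement are written by hand and must be kept in sync with the transform step. That bookkeeping is the kind of work a pipeline framework aims to take over.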

Alternative tools

Orchestrators: Many tools exist to orchestrate Python workflows. However, these tools are often either too general to help the average scientist wrangle their data, or so specific to a given workflow type that they lack the flexibility needed to address the specifics of a scientist's data problems. Many other tools also come packaged with powerful

General Orchestration Tools

  1. Airflow
  2. Prefect
  3. Luigi

Computational Science Workflow Tools

  1. Fireworks
  2. AiiDA
  3. Atomate

What isn't DBgen?

  1. An ORM tool (see Hibernate for Java or SQLAlchemy for Python)

    • DBgen builds on the popular SQLAlchemy ORM to operate at an even higher level of abstraction, allowing users to build pipelines and schemas without actively thinking about the database tables or the insert and select statements required to connect the workflow together.
  2. A database manager (see MySQLWorkbench, DBeaver, TablePlus, etc.)

  3. An opinionated tool with a particular schema for scientific data / theories.

Getting DBgen

Via GitHub

To install DBgen from source, clone the repository from GitHub and use the Poetry package manager. First clone the repo to a local directory, then run poetry install in that directory to install the required dependencies. You will need at least Python 3.7 to install the package.

# Get DBgen
git clone https://github.com/modelyst/dbgen
cd ./dbgen
# Get Poetry
curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py | python3 -
# Install DBgen and its dependencies with Poetry
poetry install
poetry shell
# Test dbgen
dbgen serialize dbgen.example.main:make_model
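
Before running the commands above, it can help to confirm that the prerequisites are in place. The following is a small sketch using only the Python standard library; the helper name `missing_prerequisites` is hypothetical and is not part of DBgen.

```python
import shutil
import sys

MIN_PYTHON = (3, 7)  # minimum Python version stated in the installation notes


def missing_prerequisites(tools=("git", "poetry"), version=None):
    """Return a list of human-readable problems; an empty list means ready."""
    # Allow the version to be overridden so the check can be exercised in tests.
    version = tuple(version or sys.version_info[:2])
    problems = []
    if version < MIN_PYTHON:
        problems.append("Python %d.%d or newer is required" % MIN_PYTHON)
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(tool) is None:
            problems.append("%s was not found on PATH" % tool)
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(problem)
```

If the script prints nothing, git and poetry are on the PATH and the interpreter meets the stated minimum version.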

Via Pip

pip install modelyst-dbgen
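
After installing, one way to confirm that the distribution is present is to query the installed package metadata. This is a sketch under stated assumptions: it relies on `importlib.metadata`, which is in the standard library only from Python 3.8 onward (on Python 3.7 the `importlib_metadata` backport would be needed), and the helper name `installed_version` is hypothetical.

```python
from importlib import metadata  # standard library from Python 3.8


def installed_version(distribution="modelyst-dbgen"):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("modelyst-dbgen is not installed in this environment")
    else:
        print("modelyst-dbgen", version)
```

The lookup uses the distribution name given to pip (`modelyst-dbgen`), which can differ from the name used in import statements.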

API documentation

Documentation of modules and classes can be found in the API docs.

Reporting bugs

Please report any bugs and issues at DBgen's GitHub Issues page.

License

DBgen is released under the Apache 2.0 License.
