CTU Relational: SQL Database Datasets for Machine Learning

CTU Relational

License: MIT

The CTU Prague Relational Learning Repository was originally published in 2015 with the goal of supporting machine learning research on multi-relational data. Today, the repository is hosted at https://relational.fel.cvut.cz and contains more than 80 different datasets stored in SQL databases.

The RelBench project is currently pursuing a similar goal: establishing Relational Deep Learning as a new subfield of deep learning. This library supports the RelBench team's effort by providing the CTU Relational datasets in a standardized representation. As such, the library is an extension of the RelBench package.

Installation

You can install the CTU Relational package with pip:

pip install ctu-relational
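
To confirm that the installation succeeded, you can query the installed version from Python. This is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of the package.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string of a distribution, or None if it is missing."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Prints the version string, or None if the package is not installed.
print(installed_version("ctu-relational"))
```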

Contents

:warning: The package is currently in development and contains only a subset of the available datasets. The rest will be added in the near future, together with associated tasks.

You can load datasets in the same way as in RelBench, e.g.:

from relbench.datasets import get_dataset
import ctu_relational

dataset = get_dataset('ctu-seznam') # automatically cached through the relbench package
db = dataset.get_db()

or directly from CTU Relational:

from ctu_relational import datasets as ctu_datasets

dataset = ctu_datasets.Seznam() # custom cache directory should be specified
db = dataset.get_db()
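
Once a database is loaded, it can be useful to check which tables, keys, and time columns it contains. The sketch below assumes that `db` follows the RelBench `Database` layout, i.e. that `db.table_dict` maps table names to tables exposing `df`, `pkey_col`, `time_col`, and `fkey_col_to_pkey_table`. The helper `summarize_db` is hypothetical and not part of either package.

```python
def summarize_db(db):
    """Return one summary dict per table of a RelBench-style database.

    Assumes `db.table_dict` maps table names to objects exposing `df`,
    `pkey_col`, `time_col` and `fkey_col_to_pkey_table`.
    """
    summary = []
    for name, table in db.table_dict.items():
        summary.append(
            {
                "table": name,
                "rows": len(table.df),
                "primary_key": table.pkey_col,
                "time_column": table.time_col,
                # Maps each foreign-key column to the table it references.
                "foreign_keys": dict(table.fkey_col_to_pkey_table),
            }
        )
    return summary


# Usage with a database loaded as shown above:
#   for row in summarize_db(db):
#       print(row)
```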

Unlike the RelBench package, CTU Relational works directly with relational databases through the SQLAlchemy package. The DBDataset class loads an SQL database into the RelBench format. You can load data from your own SQL server with the following snippet:

from ctu_relational.datasets import DBDataset

# Replace the "<...>" placeholders with your own connection details.
custom_dataset = DBDataset(
    dialect="mariadb",  # other dialects should be supported but weren't tested
    driver="mysqlconnector",
    user="<user>",
    password="<password>",
    host="<host_url>",
    port=3306,
    database="<database_name>",
)

db = custom_dataset.get_db(upto_test_timestamp=False)
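
The connection parameters mirror the components of a SQLAlchemy database URL, which has the form `dialect+driver://user:password@host:port/database`. The sketch below shows how such a URL is conventionally composed, using only the standard library. The helper `build_db_url` and the sample values are hypothetical, and whether DBDataset assembles its connection exactly this way is an assumption.

```python
from urllib.parse import quote_plus


def build_db_url(dialect, driver, user, password, host, port, database):
    """Compose a SQLAlchemy-style URL: dialect+driver://user:password@host:port/database."""
    # Percent-encode credentials so characters such as "@" or "/" do not break the URL.
    credentials = f"{quote_plus(user)}:{quote_plus(password)}"
    return f"{dialect}+{driver}://{credentials}@{host}:{port}/{database}"


# Hypothetical sample values, for illustration only.
url = build_db_url(
    "mariadb", "mysqlconnector", "reader", "p@ss/word", "db.example.org", 3306, "sales"
)
print(url)  # mariadb+mysqlconnector://reader:p%40ss%2Fword@db.example.org:3306/sales
```

Encoding the credentials matters in practice: an unescaped `@` in a password would otherwise be read as the separator between the credentials and the host.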

Note, however, that directly loaded databases usually need some additional adjustments. See ctu_datasets.py for examples.
