CTU Relational: SQL Database Datasets for Machine Learning
CTU Relational
The CTU Prague Relational Learning Repository was originally published in 2015 with the goal of supporting machine learning research on multi-relational data. Today, the repository is hosted at https://relational.fel.cvut.cz and contains more than 80 datasets stored in SQL databases.
The RelBench project currently pursues a similar goal: establishing Relational Deep Learning as a new subfield of deep learning. This library supports the RelBench team's effort by providing the CTU Relational datasets in a standardized representation. As such, the library is an extension of the RelBench package.
Installation
You can install the CTU Relational package through pip:
pip install ctu-relational
Contents
:warning: The package is currently in development and contains only a subset of all available datasets. The rest will be added in the near future together with associated tasks.
You can load datasets in the same way as in RelBench, e.g.:
from relbench.datasets import get_dataset
import ctu_relational
dataset = get_dataset('ctu-seznam') # automatically cached through the relbench package
db = dataset.get_db()
or directly from CTU Relational:
from ctu_relational import datasets as ctu_datasets
dataset = ctu_datasets.Seznam() # custom cache directory should be specified
db = dataset.get_db()
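Once a database is loaded, it can help to look at its schema before modelling. The following is a minimal sketch, assuming the returned object follows the RelBench `Database` layout: a `table_dict` mapping table names to tables that expose `df`, `pkey_col`, `time_col`, and `fkey_col_to_pkey_table`. The helper `summarize_db` and the stand-in tables `users` and `orders` are hypothetical and exist only so the example runs without downloading a dataset.

```python
from types import SimpleNamespace


def summarize_db(db):
    """Collect row counts and key information for every table in a
    RelBench-style database object (assumed attribute names)."""
    summary = {}
    for name, table in db.table_dict.items():
        summary[name] = {
            "rows": len(table.df),
            "pkey_col": table.pkey_col,
            "time_col": table.time_col,
            # foreign-key column -> name of the referenced table
            "fkeys": dict(table.fkey_col_to_pkey_table),
        }
    return summary


# Hypothetical stand-in for a loaded database; plain lists play the
# role of the DataFrames so that no external package is needed here.
demo_db = SimpleNamespace(
    table_dict={
        "users": SimpleNamespace(
            df=[1, 2, 3],
            pkey_col="user_id",
            time_col=None,
            fkey_col_to_pkey_table={},
        ),
        "orders": SimpleNamespace(
            df=[10, 11],
            pkey_col="order_id",
            time_col="created_at",
            fkey_col_to_pkey_table={"user_id": "users"},
        ),
    }
)

for table_name, info in summarize_db(demo_db).items():
    print(table_name, info)
```

With a real dataset, the same helper would be called as `summarize_db(dataset.get_db())`, provided the attribute names above match the installed RelBench version.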
Unlike the RelBench package, CTU Relational works directly with relational databases through the SQLAlchemy package. The DBDataset class provides a way of loading an SQL database into the RelBench format. You can load data from your SQL server with the following snippet:
from ctu_relational.datasets import DBDataset
# Replace the "<...>" placeholder strings with your own connection details.
custom_dataset = DBDataset(
    dialect="mariadb",  # other dialects should be supported but weren't tested
    driver="mysqlconnector",
    user="<user>",
    password="<password>",
    host="<host_url>",
    port=3306,
    database="<database_name>",
)
db = custom_dataset.get_db(upto_test_timestamp=False)
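The parameters above mirror the parts of a SQLAlchemy database URL, which has the general form `dialect+driver://user:password@host:port/database`. The sketch below shows how such a URL is assembled using only the standard library. Whether `DBDataset` builds its connection this way internally is an assumption; the helper `build_sqlalchemy_url` and the credentials are hypothetical. Special characters in the user name and password are percent-encoded so they do not break URL parsing.

```python
from urllib.parse import quote_plus


def build_sqlalchemy_url(dialect, driver, user, password, host, port, database):
    """Assemble a SQLAlchemy-style URL: dialect+driver://user:password@host:port/database."""
    return (
        f"{dialect}+{driver}://"
        f"{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# Hypothetical connection details, for illustration only.
url = build_sqlalchemy_url(
    dialect="mariadb",
    driver="mysqlconnector",
    user="guest",
    password="p@ss/word",
    host="db.example.org",
    port=3306,
    database="demo",
)
print(url)
```

A string in this form is what `sqlalchemy.create_engine` accepts, which can be useful for checking connectivity to a server before loading it as a dataset.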
Note that directly loaded databases usually need some additional adjustments. Take a look at ctu_datasets.py for examples.
File details
Details for the file ctu_relational-0.2.1.tar.gz.
File metadata
- Download URL: ctu_relational-0.2.1.tar.gz
- Upload date:
- Size: 12.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.1.1 CPython/3.12.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | 0a0ac0867cd8ae8507eab4416c108e1a2dbd68a890fcafee8a1c7e1f8d35d64f
MD5 | b25a6e0d02806b62ada6a9795bc17624
BLAKE2b-256 | 30f57ba6923770c7bba2689514e897129f2a481e3de0c4a32ad969bc6bf4265f
File details
Details for the file ctu_relational-0.2.1-py3-none-any.whl.
File metadata
- Download URL: ctu_relational-0.2.1-py3-none-any.whl
- Upload date:
- Size: 14.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.1.1 CPython/3.12.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | 995ebf9468caa6fac5e467d345c8b9685657b7bc696e1956e5b0ff552414ad08
MD5 | 6a57af4350988562a7b42704acd2c0dd
BLAKE2b-256 | 41f4822434ed47476e2f3dbb515a959313f6be00a15e9f9f495f8b53a7868d68