RWTH Aachen Computer Science i5/dbis assets for Lecture Datenbanken und Informationssysteme

DBIS Relational Algebra

This library provides a Python implementation of the relational algebra.

Features

  • Create expressions of the relational algebra in Python.
  • Load data from SQLite tables.
  • Evaluate expressions on the data.
  • Convert these expressions to text in LaTeX math mode.
  • Convert a relation / the result of an expression to a Markdown table.

Installation

Install via pip:

pip install dbis-relational-algebra

Usage

Overview of supported operators

  • Cross Product / Cartesian Product (*)
  • Difference (-)
  • Division (/)
  • Intersection (&)
  • Left Semijoin
  • Natural Join
  • Projection
  • Rename
  • Right Semijoin
  • Selection
  • Theta Join
  • Union (|)

The set operators Union, Intersection, and Difference require the relations to be union-compatible.
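Union compatibility is commonly taken to mean that both relations have the same number of attributes with matching domains. How strictly this library checks it (for example, whether attribute names must also match) is not stated here. The following plain-Python sketch is independent of the library and assumes that a relation is an attribute list plus a set of row tuples, and that an equal attribute count is the criterion; `union_compatible` and `union` are hypothetical helpers:

```python
def union_compatible(attrs_r, attrs_s):
    """Assumed criterion: same number of attributes (names may differ)."""
    return len(attrs_r) == len(attrs_s)


def union(attrs_r, rows_r, attrs_s, rows_s):
    """Set union of two relations; keeps the attribute names of the left one."""
    if not union_compatible(attrs_r, attrs_s):
        raise ValueError("relations are not union-compatible")
    return attrs_r, rows_r | rows_s


# Duplicate rows collapse because relations are sets of tuples.
attrs, rows = union(["a", "b"], {(1, 2), (3, 4)}, ["c", "d"], {(3, 4), (5, 6)})
```

Intersection and Difference would follow the same pattern with `&` and `-` on the row sets.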

Formulas

For the Theta Join and the Selection, a formula is used to specify the join or selection condition. These formulas can be created using the following operators:

  • And
  • Or
  • Not
  • Equals
  • GreaterEquals
  • GreaterThan
  • LessEquals
  • LessThan

Each comparator takes two values. At least one of them must be a Python str that references a column of the relation.

Loading data & Evaluating an expression

To load data, an SQLite connection can be used (recommended). This connection must be passed to the relational algebra expression for the evaluation.
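A minimal sketch of preparing such a connection with Python's standard `sqlite3` module. The table `R` and its columns are illustrative, and it is an assumption (not confirmed here) that the library looks up a table by the relation's name, as in `Relation("R")`:

```python
import sqlite3

# In-memory database for illustration; a file path works the same way.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE R (a INTEGER, b INTEGER, c INTEGER)")
connection.executemany(
    "INSERT INTO R VALUES (?, ?, ?)",
    [(1, 2, 3), (4, 5, 6), (7, 8, 9)],
)
connection.commit()

# Sanity check: read the rows back in a fixed order.
rows = connection.execute("SELECT a, b, c FROM R ORDER BY a").fetchall()
```

The resulting `connection` object is what the evaluation examples in this section pass as `sql_con`.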

It is also possible to load a relation with data by hand (not recommended):

relation = Relation(name="R")
relation.add_attributes(["a", "b", "c"])
relation.add_rows([
	[1, 2, 3],
	[4, 5, 6],
	[7, 8, 9],
])

An expression can be created by using the operators and formulas listed above. The expression can then be evaluated on the data:

# Cross product R × S
expression = Relation("R") * Relation("S")
result = expression.evaluate(sql_con=connection)
# Theta join of R and S with condition NOT (R.a = S.b)
expression = ThetaJoin("R", "S", Not(Equals("R.a", "S.b")))
result = expression.evaluate(sql_con=connection)
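To make the semantics of a theta join concrete, the following plain-Python sketch computes one without the library: it forms the cross product and keeps the combined rows that satisfy the condition. The helper `theta_join`, the sample data, and the representation (attribute list plus set of tuples) are assumptions for illustration and say nothing about how the library implements the operator:

```python
from itertools import product

r_attrs, r_rows = ["R.a", "R.c"], {(1, "x"), (2, "y")}
s_attrs, s_rows = ["S.b", "S.d"], {(1, "p"), (3, "q")}


def theta_join(attrs_r, rows_r, attrs_s, rows_s, condition):
    """Cross product of both relations, filtered by a condition on each combined row."""
    attrs = attrs_r + attrs_s
    rows = set()
    for left, right in product(rows_r, rows_s):
        combined = left + right
        # Expose the combined row as a mapping from column name to value.
        if condition(dict(zip(attrs, combined))):
            rows.add(combined)
    return attrs, rows


# Condition corresponding to Not(Equals("R.a", "S.b"))
attrs, rows = theta_join(
    r_attrs, r_rows, s_attrs, s_rows, lambda row: not row["R.a"] == row["S.b"]
)
```

Only the pairing of `(1, "x")` with `(1, "p")` is dropped, because it is the only combination in which `R.a` equals `S.b`.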

The rows and column names of a relation (result) are accessible using the following attributes:

result.attributes # list of column names (str)
result.rows # set of rows (tuple)
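Because `rows` is a set, its iteration order is not defined, so code that prints or compares results may want to sort first. The sketch below shows how the two attributes fit together by rendering them as a Markdown table. The library provides its own Markdown conversion (see Features); `to_markdown` here is a hypothetical stand-in that only relies on a list of column names and a set of tuples:

```python
def to_markdown(attributes, rows):
    """Render column names and rows as a Markdown table (rows sorted for stable output)."""
    header = "| " + " | ".join(attributes) + " |"
    separator = "| " + " | ".join("---" for _ in attributes) + " |"
    body = [
        "| " + " | ".join(str(value) for value in row) + " |"
        for row in sorted(rows)
    ]
    return "\n".join([header, separator, *body])


table = to_markdown(["R.a", "R.b"], {(4, 5), (1, 2)})
print(table)
# | R.a | R.b |
# | --- | --- |
# | 1 | 2 |
# | 4 | 5 |
```

Sorting assumes the values in each column are mutually comparable; rows mixing, say, numbers and strings in one column would need a custom sort key.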

Best practices:

  • After a join or cross product of two relations, rename every column name that appears in both relations so that each column has a distinct name.
  • After a join, a cross product, or a set operation on two relations, give the resulting relation a new, distinct name.
  • When referencing a column in a comparator, use its full name, e.g. refer to column a of relation R as "R.a" rather than "a".

Developer Notes

A few design choices were made:

  • Internally, the data is stored in a pandas DataFrame. This accelerates the relational algebra operators greatly.
  • In relational algebra, a column a of a relation R can be referred to as either a or R.a. Internally, the column name is always stored using the full name, i.e. R.a. This avoids ambiguities when a column a is present in multiple relations.
  • Relational algebra does not specify how the result of a join (or cross product) of two relations should be named. If a is a column of relation R, then after joining R and S, the names R.a and S.a might both refer to a column a, depending on whether S also has a column a. Generally speaking, joining two relations R and S therefore results internally in a relation named R+S, and the column R.a is then named R+S.a (if there is no column S.a).
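The relationship between short and full column names can be sketched as follows. This is not the library's code: `resolve_column` is a hypothetical helper, and rejecting ambiguous short names with an error is an assumption about reasonable behaviour rather than something documented above:

```python
def resolve_column(name, columns):
    """Map a short or full column name to the full name stored internally."""
    if name in columns:
        return name  # already a full name such as "R.a"
    # rsplit on the last dot keeps relation names like "R+S" intact ("R+S.a" -> "a").
    matches = [col for col in columns if col.rsplit(".", 1)[-1] == name]
    if len(matches) != 1:
        raise KeyError(f"column {name!r} is unknown or ambiguous: {matches}")
    return matches[0]


columns = ["R.a", "R.b", "S.b"]
full_a = resolve_column("a", columns)     # unique short name
full_sb = resolve_column("S.b", columns)  # full name passes through unchanged
```

In this sketch, `resolve_column("b", columns)` raises a `KeyError` because both `R.b` and `S.b` match, which is the kind of ambiguity that full names and the renaming best practices are meant to avoid.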
