RWTH Aachen Computer Science i5/dbis assets for Lecture Datenbanken und Informationssysteme

DBIS Relational Algebra

This library provides a Python implementation of the relational algebra.

Features

  • Create expressions of the relational algebra in Python.
  • Load data from SQLite tables.
  • Evaluate expressions on the data.
  • Convert these expressions to text in LaTeX math mode.
  • Convert a relation / the result of an expression to a Markdown table.

Installation

Install via pip:

pip install dbis-relational-algebra

Usage

Overview of supported operators

  • Cross Product / Cartesian Product (*)
  • Difference (-)
  • Division (/)
  • Intersection (&)
  • Left Semijoin
  • Natural Join
  • Projection
  • Rename
  • Right Semijoin
  • Selection
  • Theta Join
  • Union (|)

The set operators Union, Intersection, and Difference require the relations to be union-compatible.
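
To illustrate what union-compatibility means, here is a minimal sketch in plain Python that does not use this library. It models a relation as a list of attribute names plus a set of row tuples, and it assumes that two relations are union-compatible when they have the same number of columns (stricter definitions also require matching domains). The helper names `union_compatible` and `set_operation` are hypothetical.

```python
def union_compatible(attrs_r, attrs_s):
    """Assumed definition: both relations have the same number of columns."""
    return len(attrs_r) == len(attrs_s)


def set_operation(op, attrs_r, rows_r, attrs_s, rows_s):
    """Apply a set operator to two relations after checking compatibility."""
    if not union_compatible(attrs_r, attrs_s):
        raise ValueError("relations are not union-compatible")
    # The result keeps the attribute names of the left operand (an assumption).
    return attrs_r, op(rows_r, rows_s)


# Two toy relations: (attribute names, set of rows)
R = (["a", "b"], {(1, 2), (3, 4)})
S = (["c", "d"], {(3, 4), (5, 6)})

union = set_operation(set.union, *R, *S)
intersection = set_operation(set.intersection, *R, *S)
difference = set_operation(set.difference, *R, *S)
```

Because rows are stored in sets, duplicates disappear automatically, which matches the set semantics of the relational algebra.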

Formulas

For the Theta Join and the Selection, a formula is used to specify the join or selection condition. These formulas can be created using the following operators:

  • And
  • Or
  • Not
  • Equals
  • GreaterEquals
  • GreaterThan
  • LessEquals
  • LessThan

Each comparator takes two values. At least one of them must be a Python str that references a column of the relation.
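
The rule that a str references a column can be pictured with a toy evaluator. The sketch below is not the library's implementation; the functions `resolve`, `equals`, `greater_than`, `and_`, and `not_` are hypothetical stand-ins that build predicates over a row given as a dict. It assumes every str is a column reference and every other value is a literal.

```python
def resolve(value, row):
    """Treat a str as a column reference; anything else is a literal."""
    return row[value] if isinstance(value, str) else value


def equals(left, right):
    return lambda row: resolve(left, row) == resolve(right, row)


def greater_than(left, right):
    return lambda row: resolve(left, row) > resolve(right, row)


def and_(f, g):
    return lambda row: f(row) and g(row)


def not_(f):
    return lambda row: not f(row)


# One row of a joined relation, keyed by fully qualified column names.
row = {"R.a": 1, "R.b": 2, "S.b": 1}

# (R.a = S.b) AND NOT (R.a > 1)
condition = and_(equals("R.a", "S.b"), not_(greater_than("R.a", 1)))
result = condition(row)
print(result)  # True
```

How string constants are told apart from column references is not covered by this sketch.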

Loading data & Evaluating an expression

To load data, an SQLite connection can be used (recommended). This connection must be passed to the relational algebra expression for the evaluation.
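
As a sketch of how such a connection could be prepared, the following uses only Python's standard `sqlite3` module to build an in-memory database. The table names R and S, their columns, and the sample rows are made up for illustration; the resulting `connection` object is what would be passed as `sql_con` in the evaluation examples in this document.

```python
import sqlite3

# An in-memory database; a file path would work the same way.
connection = sqlite3.connect(":memory:")

# Hypothetical tables R(a, b, c) and S(b, d) with a few sample rows.
connection.execute("CREATE TABLE R (a INTEGER, b INTEGER, c INTEGER)")
connection.executemany(
    "INSERT INTO R VALUES (?, ?, ?)",
    [(1, 2, 3), (4, 5, 6), (7, 8, 9)],
)
connection.execute("CREATE TABLE S (b INTEGER, d INTEGER)")
connection.executemany("INSERT INTO S VALUES (?, ?)", [(2, 10), (5, 20)])
connection.commit()

# Sanity check: read the rows of R back in a fixed order.
rows_r = connection.execute("SELECT a, b, c FROM R ORDER BY a").fetchall()
```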

It is also possible to load a relation with data by hand (not recommended):

relation = Relation(name="R")
relation.add_attributes(["a", "b", "c"])
relation.add_rows([
	[1, 2, 3],
	[4, 5, 6],
	[7, 8, 9],
])

An expression can be created by using the operators and formulas listed above. The expression can then be evaluated on the data:

# Cross Product RxS, see above
expression = Relation("R") * Relation("S")
result = expression.evaluate(sql_con=connection)
# Theta Join on NOT (R.a = S.b), see above
expression = ThetaJoin("R", "S", Not(Equals("R.a", "S.b")))
result = expression.evaluate(sql_con=connection)
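
To make clear what these two expressions compute, here is a library-independent sketch of a cross product and a theta join over sets of tuples. The helper names `cross_product` and `theta_join` and the sample data are hypothetical; the join condition mirrors the negated equality used above.

```python
from itertools import product

# Toy relations: attribute names plus a set of rows.
r_attrs, r_rows = ["R.a", "R.b"], {(1, 2), (3, 4)}
s_attrs, s_rows = ["S.b", "S.c"], {(1, 5), (2, 6)}


def cross_product(attrs_l, rows_l, attrs_r, rows_r):
    """Pair every left row with every right row."""
    attrs = attrs_l + attrs_r
    rows = {left + right for left, right in product(rows_l, rows_r)}
    return attrs, rows


def theta_join(attrs_l, rows_l, attrs_r, rows_r, condition):
    """Cross product followed by a selection on the condition."""
    attrs, rows = cross_product(attrs_l, rows_l, attrs_r, rows_r)
    kept = {row for row in rows if condition(dict(zip(attrs, row)))}
    return attrs, kept


cross_attrs, cross_rows = cross_product(r_attrs, r_rows, s_attrs, s_rows)
join_attrs, join_rows = theta_join(
    r_attrs, r_rows, s_attrs, s_rows,
    lambda t: t["R.a"] != t["S.b"],  # NOT (R.a = S.b)
)
```

With this data the cross product has four rows, and the theta join drops the single row in which R.a equals S.b.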

The rows and column names of a relation (result) are accessible using the following attributes:

result.attributes # list of column names (str)
result.rows # set of rows (tuple)
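
Given a list of column names and a set of row tuples of this shape, a Markdown table can be rendered in a few lines. The function `to_markdown` below is a hypothetical helper written for illustration, not the library's own conversion; it sorts the rows because a set has no defined order, which assumes the values in each column are mutually comparable.

```python
def to_markdown(attributes, rows):
    """Render column names and rows as a Markdown table string."""
    header = "| " + " | ".join(attributes) + " |"
    separator = "| " + " | ".join("---" for _ in attributes) + " |"
    body = [
        "| " + " | ".join(str(value) for value in row) + " |"
        for row in sorted(rows)  # fixed order for reproducible output
    ]
    return "\n".join([header, separator, *body])


attributes = ["R.a", "R.b"]   # stands in for result.attributes
rows = {(3, 4), (1, 2)}       # stands in for result.rows
table = to_markdown(attributes, rows)
print(table)
```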

Best practices:

  • After joining two relations or taking their cross product, always rename columns that appear in both relations to new, distinct names.
  • After joining two relations, taking their cross product, or applying a set operation to them, always give the resulting relation a new, distinct name.
  • When referencing a column in a comparator, use its fully qualified name, e.g. refer to column a of relation R as "R.a" instead of "a".

Developer Notes

A few design choices were made:

  • Internally, the data is stored in a pandas DataFrame. This accelerates the relational algebra operators greatly.
  • In relational algebra, a column a of a relation R can be referred to as either a or R.a. Internally, the column name is always stored in its full form, i.e. R.a. This avoids ambiguities when a column a is present in multiple relations.
  • When two relations are joined (or their cross product is taken), relational algebra does not prescribe a name for the resulting relation. If a is a column of relation R, then in the result of joining R and S both R.a and S.a might refer to this column a, depending on whether S also has a column named a. Therefore, generally speaking, joining two relations R and S internally results in a relation named R+S, and the column R.a is renamed to R+S.a (if there is no column S.a).
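
The naming scheme described in the last point can be sketched as follows. This is an illustration only: `joined_column_names` is a hypothetical helper, and the handling of columns that exist in both relations (keeping their original R. or S. qualifier) is an assumption, since the text above only specifies the case without a name clash.

```python
def joined_column_names(name_l, cols_l, name_r, cols_r):
    """Return the joined relation's name and its fully qualified columns."""
    joined = f"{name_l}+{name_r}"
    shared = set(cols_l) & set(cols_r)
    names = []
    for relation, cols in ((name_l, cols_l), (name_r, cols_r)):
        for col in cols:
            if col in shared:
                # Assumption: clashing columns keep their original qualifier.
                names.append(f"{relation}.{col}")
            else:
                # Unambiguous columns move to the joined relation's name.
                names.append(f"{joined}.{col}")
    return joined, names


relation_name, columns = joined_column_names("R", ["a", "b"], "S", ["b", "c"])
print(relation_name)  # R+S
print(columns)        # ['R+S.a', 'R.b', 'S.b', 'R+S.c']
```

Renaming the result and its clashing columns right after the join, as recommended in the best practices, sidesteps this internal naming altogether.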
