RWTH Aachen Computer Science i5/dbis assets for Lecture Datenbanken und Informationssysteme

DBIS Relational Algebra

This library provides a Python implementation of the relational algebra.

Features

  • Create expressions of the relational algebra in Python.
  • Load data from SQLite tables.
  • Evaluate expressions on the data.
  • Convert these expressions to text in LaTeX math mode.
  • Convert a relation / the result of an expression to a Markdown table.

Installation

Install via pip:

pip install dbis-relational-algebra

Usage

Overview of supported operators

  • Cross Product / Cartesian Product (*)
  • Difference (-)
  • Division (/)
  • Intersection (&)
  • Left Semijoin
  • Natural Join
  • Projection
  • Rename
  • Right Semijoin
  • Selection
  • Theta Join
  • Union (|)

The set operators Union, Intersection, and Difference require the relations to be union-compatible.
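
As a rough illustration of union-compatibility, the following standalone sketch models a relation as a list of attribute names plus a set of row tuples. It does not use this library. The helper names `is_union_compatible` and `union` are hypothetical, and the check assumes the simplest definition (equal number of attributes); the library may apply a stricter rule, for example on attribute names or types.

```python
# Standalone sketch; not the library's implementation.
def is_union_compatible(attrs_r, attrs_s):
    """Assumed definition: both relations have the same number of attributes."""
    return len(attrs_r) == len(attrs_s)


def union(attrs_r, rows_r, attrs_s, rows_s):
    """Set union of two relations; the result keeps the attribute names of R."""
    if not is_union_compatible(attrs_r, attrs_s):
        raise ValueError("relations are not union-compatible")
    return list(attrs_r), set(rows_r) | set(rows_s)


# Two relations with two attributes each; the shared row (3, 4) appears once.
attrs, rows = union(["a", "b"], {(1, 2), (3, 4)}, ["c", "d"], {(3, 4), (5, 6)})
```

Because rows are stored in a set, duplicates collapse automatically, which matches the set semantics of the relational algebra.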

Formulas

For the Theta Join and the Selection, a formula is used to specify the join or selection condition. These formulas can be created using the following operators:

  • And
  • Or
  • Not
  • Equals
  • GreaterEquals
  • GreaterThan
  • LessEquals
  • LessThan

Each comparator takes two values. At least one of them must be a Python str that references a column of the relation.
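
To make the idea of a formula concrete, here is a minimal standalone sketch of how such a condition could be evaluated against a single row. The functions `resolve`, `compare`, `not_`, and `and_` are hypothetical and are not the library's classes. The sketch assumes that every str operand is a column reference and every other operand is a constant.

```python
# Standalone sketch; these helpers are hypothetical, not the library's API.
import operator


def resolve(value, row):
    """Treat a str as a column reference; anything else is a constant."""
    return row[value] if isinstance(value, str) else value


def compare(op, left, right):
    """Build a predicate that compares two operands on a given row."""
    return lambda row: op(resolve(left, row), resolve(right, row))


def not_(formula):
    return lambda row: not formula(row)


def and_(left, right):
    return lambda row: left(row) and right(row)


# NOT (R.a = 4) AND R.b >= 5
formula = and_(not_(compare(operator.eq, "R.a", 4)), compare(operator.ge, "R.b", 5))

rows = [{"R.a": 1, "R.b": 2}, {"R.a": 4, "R.b": 5}, {"R.a": 7, "R.b": 8}]
selected = [row for row in rows if formula(row)]
```

Only the last row satisfies both parts of the condition, so `selected` contains that single row.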

Loading data & Evaluating an expression

To load data, an SQLite connection can be used (recommended). This connection must be passed to the relational algebra expression for the evaluation.
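
A connection of this kind can be prepared with Python's standard `sqlite3` module. The sketch below is only an assumption about a typical setup: the table names `R` and `S`, their columns, and the in-memory database are made up for illustration. How the library maps SQLite tables to relations is not shown here.

```python
import sqlite3

# In-memory database for illustration; a path to a database file works the same way.
connection = sqlite3.connect(":memory:")

# Hypothetical tables R(a, b, c) and S(b, d).
connection.execute("CREATE TABLE R (a INTEGER, b INTEGER, c INTEGER)")
connection.executemany(
    "INSERT INTO R VALUES (?, ?, ?)",
    [(1, 2, 3), (4, 5, 6), (7, 8, 9)],
)
connection.execute("CREATE TABLE S (b INTEGER, d INTEGER)")
connection.executemany("INSERT INTO S VALUES (?, ?)", [(2, 10), (5, 20)])
connection.commit()

# Sanity check with plain SQL before handing the connection to an expression.
r_rows = connection.execute("SELECT a, b, c FROM R ORDER BY a").fetchall()
s_count = connection.execute("SELECT COUNT(*) FROM S").fetchone()[0]
```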

It is also possible to load a relation with data by hand (not recommended):

relation = Relation(name="R")
relation.add_attributes(["a", "b", "c"])
relation.add_rows([
	[1, 2, 3],
	[4, 5, 6],
	[7, 8, 9],
])

An expression can be created by using the operators and formulas listed above. The expression can then be evaluated on the data:

# Cross Product RxS, see above
expression = Relation("R") * Relation("S")
result = expression.evaluate(sql_con=connection)
# Theta Join with condition NOT (R.a = S.b), see above
expression = ThetaJoin("R", "S", Not(Equals("R.a", "S.b")))
result = expression.evaluate(sql_con=connection)
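
For readers who want to see what a theta join computes, the following standalone sketch spells it out as a cross product followed by a filter. It does not use this library. The function `theta_join` and its signature are hypothetical, and prefixing columns with the relation name is an assumption made to keep column names unambiguous.

```python
from itertools import product


def theta_join(r_name, r_attrs, r_rows, s_name, s_attrs, s_rows, condition):
    """Cross product of R and S, keeping only row pairs that satisfy condition."""
    attrs = [f"{r_name}.{a}" for a in r_attrs] + [f"{s_name}.{a}" for a in s_attrs]
    rows = set()
    for r_row, s_row in product(r_rows, s_rows):
        combined = r_row + s_row
        # The condition receives the combined row as a column-name -> value mapping.
        if condition(dict(zip(attrs, combined))):
            rows.add(combined)
    return attrs, rows


# Join condition NOT (R.a = S.b) on two tiny single-column relations.
attrs, rows = theta_join(
    "R", ["a"], {(1,), (2,)},
    "S", ["b"], {(1,), (3,)},
    lambda row: not (row["R.a"] == row["S.b"]),
)
```

Of the four row pairs in the cross product, only the pair with equal values is dropped.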

The rows and column names of a relation (result) are accessible using the following attributes:

result.attributes # list of column names (str)
result.rows # set of rows (tuple)
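
These two attributes are enough to post-process a result by hand. As one example, the sketch below renders column names and rows as a Markdown table. The helper `to_markdown` is hypothetical and written only for illustration; it is not the library's own Markdown conversion, and sorting the rows is an assumption made to get a stable order from a set.

```python
def to_markdown(attributes, rows):
    """Render column names and rows as a Markdown table (rows sorted for stable output)."""
    header = "| " + " | ".join(attributes) + " |"
    separator = "| " + " | ".join("---" for _ in attributes) + " |"
    body = ["| " + " | ".join(str(v) for v in row) + " |" for row in sorted(rows)]
    return "\n".join([header, separator, *body])


# Stand-ins for result.attributes and result.rows.
table = to_markdown(["R.a", "R.b"], {(4, 5), (1, 2)})
print(table)
```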

Best practices:

  • After joining two relations or taking their cross product, always rename columns whose names appear in both relations so that each column has a distinct name.
  • After joining two relations, taking their cross product, or applying a set operation to them, always give the resulting relation a new, distinct name.
  • When referencing a column in a comparator, use its fully qualified name, i.e. refer to column a of relation R as "R.a" instead of "a".

Developer Notes

A few design choices were made:

  • Internally, the data is stored in a pandas DataFrame, which speeds up the relational algebra operators.
  • In relational algebra, a column a of a relation R can be referred to as either a or R.a. Internally, the column name is always stored using the full name, i.e. R.a. This avoids ambiguities when a column a is present in multiple relations.
  • When joining two relations (or taking their cross product), the relational algebra does not prescribe a name for the resulting relation. Consequently, if a is a column of relation R, then after joining R and S both R.a and S.a might refer to a column a (depending on whether S also has a column a). In general, joining two relations R and S internally results in a relation named R+S, and the column R.a is renamed to R+S.a (if there is no column S.a).
