Package for lynguine data oriented architecture interfaces.

Project description

lynguine

The lynguine library is based on data oriented architecture (DOA) principles for managing data. It provides these capabilities to support other libraries such as lamd and referia.

To install use

%pip install lynguine

The software consists of the following principal parts.

Config

First is config, which consists of interface and context.

context defines the Context object, which is used to store information about the context, such as the machine type. On the other hand, interface defines the Interface object that's used for defining inputs and outputs, which specify connections to other 'black box processes'.

A short example

The local context can be loaded using the following command.

import lynguine as ln

ctxt = ln.config.context.Context()

The interface module contains the key structure of lynguine. It specifies incoming and outgoing flows, as well as computational operations. Each flow is specified in the following form.

input:
  source:

Preprocessing can be done with a compute field.

input:
  compute:
    field: ColumnName0
    function: computeFunction
    args:
      arg1: argument1
      arg2: argument2
    row_args:
      arg3: ColumnName1
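
To illustrate how such an entry behaves, here is a minimal plain-Python sketch (not lynguine's actual implementation): fixed `args` are passed to the function as-is, while each `row_args` value names a column whose per-row value is looked up and passed. The `tag` function and column names below are hypothetical.

```python
def apply_compute(rows, field, function, args=None, row_args=None):
    """Add `field` to every row by calling `function` with merged arguments."""
    args = args or {}
    row_args = row_args or {}
    for row in rows:
        # Resolve row_args: each value names a column in the current row.
        resolved = {name: row[column] for name, column in row_args.items()}
        row[field] = function(**args, **resolved)
    return rows

# Hypothetical compute function combining a fixed argument with a row value.
def tag(prefix, value):
    return f"{prefix}:{value}"

rows = [{"ColumnName1": "a"}, {"ColumnName1": "b"}]
apply_compute(rows, field="ColumnName0", function=tag,
              args={"prefix": "item"}, row_args={"value": "ColumnName1"})
print(rows)  # each row gains a ColumnName0 field
```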

Often the data will be stored in another file (csv, excel, yaml etc.), but sometimes it's convenient to store it locally in a field called 'data'. In the next example we do this to illustrate how the compute capability can be used to augment the file. Here two fields are added: the full name (used as an index) and today's date as an access date.

import yaml
from lynguine.config.interface import Interface
from lynguine.assess.data import CustomDataFrame

# Let's assume this is the text stored in the interface file
yaml_text = """input:
  type: local
  index: fullName
  data:
  - familyName: Xing
    givenName: Pei
  - familyName: Venkatasubramanian
    givenName: Siva
  - familyName: Paz Luiz
    givenName: Miguel
  compute:  # compute is used for preprocessing as data is loaded
  - field: fullName # the field fullName is created from this compute command
    function: render_liquid
    args: # keyword arguments to pass to the function
      template: '{{familyName | replace: " ", "-"}}_{{givenName | replace: " ", "-"}}' # The liquid template allows us to combine the names
    row_args: # arguments are taken from the same row
      givenName: givenName 
      familyName: familyName
  - field: accessDate
    function: today"""

interface = Interface(yaml.safe_load(yaml_text))

data = CustomDataFrame.from_flow(interface)
print(data)

This creates a new field fullName, which is then used as the index.
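
To see what the liquid template above evaluates to for each row, here is a plain-Python sketch that mirrors its replace-and-join behaviour for these particular fields (lynguine's render_liquid uses a real liquid engine; this is just an illustration):

```python
def full_name(familyName, givenName):
    # Mirrors '{{familyName | replace: " ", "-"}}_{{givenName | replace: " ", "-"}}'
    return f'{familyName.replace(" ", "-")}_{givenName.replace(" ", "-")}'

print(full_name("Xing", "Pei"))         # → Xing_Pei
print(full_name("Paz Luiz", "Miguel"))  # → Paz-Luiz_Miguel
```

Spaces within a name are replaced by hyphens so the rendered value is safe to use as an index.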

Access

Secondly, the software uses the access, assess, address decomposition, where access is used for accessing data and consists of io and download. io allows for reading from and writing to various file formats such as json, yaml, markdown, csv, xls and bibtex.

download is for accessing resources from the web, such as downloading a specific url.

A short example

Perhaps you would like to create a bibtex file from the PMLR proceedings volume 1, Gaussian processes in practice. In the short example below, we use lynguine to first download the relevant URL, then we load it in and save as bibtex.

import lynguine
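
The lynguine calls for this workflow are not shown above, so as a sketch of the end product only, here is plain Python that formats a proceedings entry as BibTeX (the entry fields and key are illustrative, not taken from the real volume):

```python
def to_bibtex(key, entry):
    """Render a dict of BibTeX fields as an @inproceedings entry."""
    fields = ",\n".join(f"  {name} = {{{value}}}" for name, value in entry.items())
    return f"@inproceedings{{{key},\n{fields}\n}}"

entry = {"title": "An example paper", "author": "A. Author", "year": "2007"}
print(to_bibtex("author07example", entry))
```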

Assess

Assess is about taking the raw data and processing it. Under assess lynguine provides data and compute. The data module provides a CustomDataFrame object that provides access to the data and manipulation capabilities. The compute module wraps various compute capabilities for preprocessing and processing the data.

A short example

import lynguine
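
One way to picture what compute does is as a registry of named functions that preprocessing entries like `function: today` resolve to. The sketch below is plain Python under that assumption, not lynguine's actual registry; the `upper` function is hypothetical.

```python
import datetime

# Hypothetical registry mapping compute names to callables.
compute_registry = {
    "today": lambda: datetime.date.today().isoformat(),
    "upper": lambda value: value.upper(),
}

def run_compute(name, **kwargs):
    """Look up a compute function by name and call it with keyword arguments."""
    return compute_registry[name](**kwargs)

print(run_compute("upper", value="siva"))  # → SIVA
print(run_compute("today"))                # today's date as YYYY-MM-DD
```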

Util

The util module provides various utilities for working with data. They include

  • dataframe for operating on data frames.
  • fake for generating fake data.
  • files for interacting with files.
  • html for working with html.
  • liquid for working with the liquid template language.
  • talk for working with Neil's talk format.
  • tex for working with latex.
  • text for working with text.
  • yaml for working with yaml.
  • misc for other miscellaneous utilities.

A short example

import lynguine

Tests

The tests are stored in the tests subdirectory. They use pytest.

A short example

If you have poetry installed you can run the tests using

poetry run pytest 

Why lynguine?

The name comes from the idea that data oriented architecture is like a set of streams of data, like linguine pasta. In Italian the word also means "little tongues", so there's also a connotation of translation between services.


