
A modular analysis framework


A software library for scientists written by scientists!


Description

The Modular Analysis Framework (MAFw) is a Python tool for running analytical steps in a consistent manner and generating suitable output graphs and tables.

The idea behind MAFw is to offer data scientists a well-defined environment in which to implement complex analytical tasks, so that they can focus on the data analysis without bothering with ancillary things such as database interfaces, job submission and so on.

The core of MAFw is the Processor, the class responsible for performing the analytical task. Processor I/O is based on a close collaboration between a relational database structure and files on disc. In general, a processor gathers the relevant input from one or more DB tables (location of input files, processing parameters…), performs its analytical job and updates a DB output table with the main outcomes, including the location where the output files are saved on disc.

By inheriting from the base Processor class, user-developed processors come with some superpowers, such as the ability to exchange data with the database back-end, display progress to the user, generate output graphs and so on. The scientist's task is limited to implementing the analysis code.
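The division of labour described above can be illustrated with a small, self-contained sketch. It does not use MAFw's actual API: `SketchProcessor`, `MeanProcessor`, the `input_file` and `result` tables and all method names are hypothetical, chosen only to show a base class that handles the database I/O while the subclass implements the analysis. For brevity the input values are stored directly in the database rather than in files on disc.

```python
import sqlite3
from statistics import mean


class SketchProcessor:
    """Hypothetical base class (NOT the MAFw Processor): owns the database I/O."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self.connection = connection

    def get_items(self) -> list[tuple[int, str]]:
        # Gather the input from a DB table: (id, comma-separated values).
        return self.connection.execute(
            "SELECT id, payload FROM input_file ORDER BY id"
        ).fetchall()

    def process(self, payload: str) -> float:
        # The only method a user-developed processor has to implement.
        raise NotImplementedError

    def execute(self) -> None:
        for item_id, payload in self.get_items():
            outcome = self.process(payload)
            # Store the main outcome in the output table.
            self.connection.execute(
                "INSERT INTO result (input_id, value) VALUES (?, ?)",
                (item_id, outcome),
            )
        self.connection.commit()


class MeanProcessor(SketchProcessor):
    """User code: only the analysis itself."""

    def process(self, payload: str) -> float:
        return mean(float(v) for v in payload.split(","))


def demo_processor() -> list[tuple[int, float]]:
    # Build a throw-away in-memory database with two input rows.
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE input_file (id INTEGER PRIMARY KEY, payload TEXT)")
    connection.execute("CREATE TABLE result (input_id INTEGER, value REAL)")
    connection.executemany(
        "INSERT INTO input_file (payload) VALUES (?)", [("1,2,3",), ("10,20",)]
    )
    MeanProcessor(connection).execute()
    rows = connection.execute(
        "SELECT input_id, value FROM result ORDER BY input_id"
    ).fetchall()
    connection.close()
    return rows
```

In this sketch `MeanProcessor` contains no SQL at all; everything related to reading inputs and storing outcomes lives in the base class.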

Once data scientists have created their processor libraries, they can chain them one after the other in a very simple way inside a so-called steering file, and MAFw will take care of running them.

Full documentation of the library API, along with a general description, is available here.

Installation

MAFw can be installed using pip in a separate virtual environment.

D:\mafw>python -m venv mafw-env
D:\mafw>cd mafw-env
D:\mafw\mafw-env>Scripts\activate
(mafw-env) D:\mafw\mafw-env>pip install mafw

MAFw dependencies will be automatically installed by pip.

Usage

The project’s documentation is available here and also as a PDF file.

Contributing

Contributions to the software development are very welcome.

If you want to join the development effort, the best way is to fork or clone this repository on your system and start working.

The development team has adopted hatch for basic tasks. So, once you have downloaded the git repository to your system, open a shell there and type:

D:\mafw> hatch env create dev
D:\mafw> hatch env find dev
C:\path\to\.venv\mafw\KVhWIDtq\dev.py3.11
C:\path\to\.venv\mafw\KVhWIDtq\dev.py3.12
C:\path\to\.venv\mafw\KVhWIDtq\dev.py3.13

to generate the Python environments for development. This command actually creates the whole environment matrix, that is, one environment for each supported Python version. If you intend to work primarily with a single Python version, simply specify it in the create command, for example:

D:\mafw> hatch env create dev.py3.13
D:\mafw> hatch env find dev.py3.13
C:\path\to\.venv\mafw\KVhWIDtq\dev.py3.13

hatch will take care of installing MAFw in development mode with all the required dependencies. Use the output of the find command if you want to add the same virtual environment to your favorite IDE. Once done, you can spawn a shell in the development environment just by typing:

D:\mafw> hatch shell dev.py3.13
(dev.py3-13) D:\mafw>

and from there you can simply run mafw and all other scripts.

MAFw uses pre-commit to ensure high code quality. The pre-commit package is automatically installed into your environment, but it needs to be initialised before first use. So just enter:

(dev.py3-13) D:\mafw> pre-commit install

And now you are really ready to go with your coding!

Before pushing all your commits to the remote branch, we encourage you to run the pre-push tests to be sure that everything still works as expected. You can do this by typing:

D:\mafw> hatch run dev.py3.13:pre-push

if you are not in an activated development shell, or

(dev.py3-13) D:\mafw> hatch run pre-push

if you are already in the dev environment.

Testing

MAFw comes with an extensive unit test suite of more than 1000 test cases for an overall code coverage of 99%.

Tests follow pytest best practices and aim to verify the functionality of each unit taken individually. Given the high level of interoperability of MAFw with other libraries (toml, peewee and seaborn, just to name a few), unit tests rely heavily on patched objects to ensure reproducibility.
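As an illustration of this style, the sketch below tests a function in isolation by patching the file-reading call it depends on. `count_lines` is a hypothetical function invented for the example and is not part of MAFw; the patching itself uses `unittest.mock` from the standard library, and pytest would collect the test function by its `test_` prefix.

```python
from pathlib import Path
from unittest.mock import patch


def count_lines(path: Path) -> int:
    """Hypothetical unit under test: count the non-empty lines of a text file."""
    return sum(1 for line in path.read_text().splitlines() if line.strip())


def test_count_lines_without_touching_the_disk() -> None:
    # Patch Path.read_text so the test is reproducible and needs no real file.
    with patch.object(Path, "read_text", return_value="a\n\nb\nc\n") as fake_read:
        assert count_lines(Path("does_not_exist.txt")) == 3
    # The unit must have read the file exactly once.
    fake_read.assert_called_once()
```

Because the file access is replaced by a mock, the outcome depends only on the logic of `count_lines`, not on the state of the file system.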

Nevertheless, full integration tests are also included in the test suite. These tests cover all relevant aspects of MAFw, including:

  1. Installation of MAFw and of a plugin project in an isolated environment.

  2. Use of MAFw executable to create some data files and analyse them to create a graphical output.

  3. Use of a database to store the collected data.

  4. Check the database trigger functionalities to avoid repeating useless analysis steps, for example when a new file is added, removed or changed.
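The last point can be pictured with the following sketch, which records a checksum for each input file in a database and reports only the files that are new or changed. It uses plain `sqlite3` and `hashlib` rather than MAFw's own trigger mechanism; the table `processed_file` and the functions `files_to_process` and `demo_skip_unchanged` are hypothetical, and the handling of removed files is left out.

```python
import hashlib
import sqlite3
import tempfile
from pathlib import Path


def checksum(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()


def files_to_process(connection: sqlite3.Connection, paths: list[Path]) -> list[Path]:
    """Return the paths that are new or changed and record their checksums."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS processed_file (path TEXT PRIMARY KEY, digest TEXT)"
    )
    pending = []
    for path in paths:
        digest = checksum(path)
        row = connection.execute(
            "SELECT digest FROM processed_file WHERE path = ?", (str(path),)
        ).fetchone()
        if row is None or row[0] != digest:
            # New or modified file: schedule it and remember its checksum.
            pending.append(path)
            connection.execute(
                "INSERT OR REPLACE INTO processed_file (path, digest) VALUES (?, ?)",
                (str(path), digest),
            )
    connection.commit()
    return pending


def demo_skip_unchanged() -> list[int]:
    connection = sqlite3.connect(":memory:")
    with tempfile.TemporaryDirectory() as folder:
        data = Path(folder) / "data.txt"
        data.write_text("first version")
        first = files_to_process(connection, [data])   # new file
        second = files_to_process(connection, [data])  # unchanged file
        data.write_text("second version")
        third = files_to_process(connection, [data])   # changed file
    connection.close()
    return [len(first), len(second), len(third)]
```

In the demo the file is scheduled on the first pass, skipped on the second and scheduled again once its content changes.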

If you plan to collaborate in the development of MAFw, you must include unit tests for your contributions.

As already mentioned, MAFw uses hatch for project management. In the pyproject.toml file, hatch is configured with a matrix of test environments in order to run the whole test suite with each supported Python version (3.11, 3.12 and 3.13).

Running the suite is very easy. Navigate to the folder where you have your local copy of MAFw and type hatch test. Hatch will take care of installing the proper environment and running the tests. If one or more tests fail, the slow integration tests are skipped to save time.

Have a look at the hatch test options, in particular -a, to test over all the environments in the matrix, and -c, to generate coverage data for producing a coverage report.

Authors and acknowledgment

Antonio Bulgheroni, Michael Krachler

License

This software is licensed under EUPL 1.2

Project status

Ready to crunch some data! Open for contributions.
