
Core tools for use on projects by Oreum Industries

Project description

Oreum Core Tools oreum_core

Python License GitHub Release PyPI CI publish code style: ruff code style: interrogate code security: bandit


1. Description and Scope

oreum_core is an ever-evolving package of core tools for use on client projects by Oreum Industries.

  • Provides an essential workflow for data curation, EDA, basic ML using the core scientific Python stack incl. numpy, scipy, matplotlib, seaborn, pandas, scikit-learn, umap-learn
  • Optionally provides an advanced Bayesian modeling workflow in R&D and Production using a leading probabilistic programming stack incl. pymc, pytensor, arviz (do pip install oreum_core[pymc])
  • Optionally enables a generalist black-box ML workflow in R&D using a leading Gradient Boosted Trees stack incl. catboost, xgboost, optuna, shap (do pip install oreum_core[tree])
  • Also includes several utilities for text cleaning, sql scripting, file handling
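Which of the optional stacks above are present in a given environment can be checked at runtime; a minimal sketch using only the stdlib (the package names are those listed above):

```python
import importlib.util

# Optional packages pulled in by the extras described above
# (pymc via `pip install oreum_core[pymc]`, catboost via `[tree]`, etc.)
for pkg in ('pymc', 'arviz', 'catboost', 'xgboost'):
    spec = importlib.util.find_spec(pkg)
    status = 'available' if spec is not None else 'not installed (optional extra)'
    print(f'{pkg}: {status}')
```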

This package is:

  • A work in progress (v0.y.z) and subject to breaking changes that may inconvenience the user
  • Designed solely for ease of use and rapid development by employees of Oreum Industries, and by selected clients under guidance

This package is not:

  • Intended for public usage and will not be supported for public usage
  • Intended for contributions from anyone who is not an employee of Oreum Industries; unsolicited contributions will not be accepted

Notes

  • Project began on 2021-01-01
  • The README.md is MacOS and POSIX oriented
  • See LICENCE.md for licensing and copyright details
  • See pyproject.toml for various package details
  • This package uses a logger named 'oreum_core'; feel free to incorporate or ignore it (see __init__.py for details)
  • Hosting:
    • Source code repo on GitHub
    • Source code release on GitHub
    • Package release on PyPI
  • Implementation:
    • This project is enabled by a modern, open-source, advanced software stack for data curation, statistical analysis and predictive modelling
    • Specifically we use an open-source Python-based suite of software packages, the core of which is often known as the Scientific Python stack, supported by NumFOCUS
    • Once installed (see section 2), see LICENSES_3P.md for full details of all package licences
  • Environments: this project was originally developed on a Macbook Air M2 (Apple Silicon ARM64) running MacOS 15 (Sequoia) using osx-arm64 Accelerate
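The 'oreum_core' logger noted above can be incorporated into an application's own logging setup; a minimal sketch using only the stdlib logging module (the handler and format here are illustrative choices, not prescribed by the package):

```python
import logging

# Attach a handler to the package's named logger (see __init__.py)
logger = logging.getLogger('oreum_core')
logger.setLevel(logging.INFO)
handler = logging.StreamHandler()
handler.setFormatter(
    logging.Formatter('%(asctime)s  %(name)s  %(levelname)s  %(message)s')
)
logger.addHandler(handler)

# Child loggers e.g. logging.getLogger('oreum_core.curate') propagate upward,
# so all package log records flow through the handler attached above
logging.getLogger('oreum_core.curate').info('demo message')
```

To ignore the package's logging instead, leave the named logger unconfigured or set its level to a higher threshold.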

2. Instructions to Create Dev Environment

For local development on MacOS

2.0 Pre-requisite installs via homebrew

  1. Install Homebrew, see instructions at https://brew.sh
  2. Install system-level tools incl. direnv, gcc, git, graphviz, uv:
$> make brew

2.1 Git clone the repo

Assumes system-level tools installed as above:

$> git clone https://github.com/oreum-industries/oreum_core
$> cd oreum_core

Then allow direnv on MacOS to autorun the file .envrc upon opening the directory (via direnv allow)

2.2 Create virtual environment and install dev packages

Notes:

  • We use local .venv/ virtual env via uv
  • Package versions are specified in pyproject.toml and might not be the latest - this aids stability for pymc (usually in a state of development flux)

2.2.1 Create the dev environment

From the dir above oreum_core/ project dir:

$> make -C oreum_core/ dev

This will also create some files to help confirm / diagnose successful installation:

  • dev/install_log/blas_info.txt for the BLAS MKL installation for numpy
  • LICENSES_3P.md details the license for each third-party package used
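The BLAS details captured in blas_info.txt can also be queried directly from an activated environment; a short sketch using numpy's public API (the output format varies by numpy version and platform):

```python
import numpy as np

# Report numpy's build configuration, including which BLAS/LAPACK
# backend it is linked against (e.g. Accelerate on Apple Silicon)
np.show_config()
```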

2.2.2 (Optional best practice) Test successful installation of dev env

From the dir above oreum_core/ project dir:

$> make -C oreum_core/ dev-test

This will also add files dev/install_log/tests_[numpy|scipy].txt, which detail successful installation (or not) of numpy and scipy

2.2.3 (Useful during env install experimentation): To remove the dev env

From the dir above oreum_core/ project dir:

$> make -C oreum_core/ dev-uninstall

2.3 Code Linting & Repo Control

2.3.1 Pre-commit

We use pre-commit to run a suite of automated checks for code linting, quality control, and repo control prior to commit on local development machines.

  • pre-commit is already installed by the make dev command (which itself calls pip install -e .[dev])
  • The pre-commit script will then run on your system upon git commit
  • See this project's .pre-commit-config.yaml for details

2.3.2 Github Actions

We use GitHub Actions (aka GitHub Workflows) to run:

  1. A suite of automated tests for commits received at the origin (i.e. GitHub)
  2. Publishing to PyPI upon creating a GitHub Release

  • See Makefile for the CLI commands that are issued
  • See .github/workflows/* for workflow details

2.3.3 Git LFS

We use Git LFS to store any large files alongside the repo. This can be useful to replicate exact environments during development and/or for automated tests.

  • This requires a local machine install (see Getting Started)
  • See .gitattributes for details

2.4 Configs for Local Development

Some notes to help configure local development environment

2.4.1 Git config ~/.gitconfig

[user]
    name = <YOUR NAME>
    email = <YOUR EMAIL ADDRESS>

2.5 Install VSCode IDE

We strongly recommend using VSCode for all development on local machines, and this is a hard pre-requisite to use the .devcontainer environment (see section 3)

This repo includes relevant lightweight project control and config in:

oreum_core.code-workspace
.vscode/extensions.json
.vscode/settings.json

2.6 Publishing to PyPI

A note for maintainers (Oreum Industries only): to publish to PyPI, ensure the following is present on your local dev machine in the config file ~/.pypirc

[distutils]
index-servers =
   pypi
   testpypi

[pypi]
repository = https://upload.pypi.org/legacy/
username = __token__

[testpypi]
repository = https://test.pypi.org/legacy/
username = __token__

3. Code Standards

Even when writing R&D code, we strive to meet and exceed (even define) best practices for code quality, documentation and reproducibility for modern data science projects.

3.1 Code Linting & Repo Control

We use a suite of automated tools to check and enforce code quality. We indicate the relevant shields at the top of this README. See section 2.3 above for how this is enforced at pre-commit on developer machines and upon PR at the origin as part of our CI process, prior to master branch merge.

These include:

  • ruff - extremely fast standardised linting and formatting, which replaces black, flake8, isort
  • interrogate - ensure complete Python docstrings
  • bandit - test for common Python security issues

We also run a suite of general tests pre-packaged with pre-commit.


Copyright 2025 Oreum FZCO t/a Oreum Industries. All rights reserved. Oreum FZCO, IFZA, Dubai Silicon Oasis, Dubai, UAE, reg. 25515 oreum.io





Download files

Download the file for your platform.

Source Distribution

oreum_core-0.11.11.tar.gz (158.3 kB view details)

Uploaded Source

Built Distribution


oreum_core-0.11.11-py3-none-any.whl (88.7 kB view details)

Uploaded Python 3

File details

Details for the file oreum_core-0.11.11.tar.gz.

File metadata

  • Download URL: oreum_core-0.11.11.tar.gz
  • Upload date:
  • Size: 158.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: python-requests/2.32.5

File hashes

Hashes for oreum_core-0.11.11.tar.gz
Algorithm Hash digest
SHA256 891544818a74d3a9d5e70754888d9b89664e3e7723a46fe5e87d4e37d223f9f3
MD5 dd793f78e27537ecc0531a612d973a40
BLAKE2b-256 2f7d06b45b2233e1e60ec36214ed2e45d757f5b8ae35b42d78ead7c15c97be98
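The published digests above can be checked against a downloaded file before installation; a minimal sketch using the stdlib hashlib (the filename and expected digest are taken from the listing above):

```python
import hashlib

def file_sha256(path: str) -> str:
    """Return the hex SHA256 digest of a file, read in 8 KiB chunks."""
    h = hashlib.sha256()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(8192), b''):
            h.update(chunk)
    return h.hexdigest()

# e.g. compare to the SHA256 published for oreum_core-0.11.11.tar.gz:
# file_sha256('oreum_core-0.11.11.tar.gz') should equal
# '891544818a74d3a9d5e70754888d9b89664e3e7723a46fe5e87d4e37d223f9f3'
```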


File details

Details for the file oreum_core-0.11.11-py3-none-any.whl.

File metadata

  • Download URL: oreum_core-0.11.11-py3-none-any.whl
  • Upload date:
  • Size: 88.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: python-requests/2.32.5

File hashes

Hashes for oreum_core-0.11.11-py3-none-any.whl
Algorithm Hash digest
SHA256 a1544eb58f7b23717ad971d02ec9c46abcb08a04a956ef67a50d74c853599efb
MD5 22c55e7fab0b2860de3ead4b9d0c468c
BLAKE2b-256 77d539ffb3eb358e27d35bbb010cecfbef33715ddea3c073911d6e8d5ad68dd8

