
Utils to process the Regime Shifts DataBase CSV and parquet files

Project description

RSDB utils

Utils to process the Regime Shifts DataBase CSV and parquet files.

Functionalities:

  • Import the database from a CSV or parquet file into a pandas DataFrame
  • Save the database from a pandas DataFrame to a CSV or parquet file
  • Check the database (a DataFrame) against the JSON schema of the Regime Shifts DataBase
  • From the JSON schema, generate a CSV with the list of enum values (tests/list_of_enums_from_json_schema)

Code repository: https://github.com/regimeshifts/rsdb-utils

Licence: GNU General Public License v3 (GPLv3)

Installation

From Python

In a terminal:

pip install rsdb-utils 

From R

In the R terminal:

# use reticulate
library(reticulate)
# install the rsdb-utils library
reticulate::py_install("rsdb-utils")

Usage

Read a database file

Read a database file into a pandas DataFrame:

Python

from rsdb_utils import read_rsdb

df = read_rsdb("my_database.csv")

R

library(reticulate)
rsdb_utils <- import("rsdb_utils")

df <- rsdb_utils$read_rsdb("my_database.csv")

Read and write a database file

Read a parquet database file into a pandas DataFrame and save it as CSV:

Python

from rsdb_utils import read_rsdb, write_rsdb

df = read_rsdb("my_database.parquet")

write_rsdb(df, "my_database.csv")

R

library(reticulate)
rsdb_utils <- import("rsdb_utils")

# call the functions on the imported module object (rsdb_utils)
df <- rsdb_utils$read_rsdb("my_database.parquet")

rsdb_utils$write_rsdb(df, "my_database.csv")

Check a database

Check a database based on the JSON schema of the Regime Shifts Database:

Python

from rsdb_utils import read_rsdb, write_rsdb, check_rsdb

df = read_rsdb("my_database.parquet")

df = check_rsdb(df)

# save the database with the errors info
write_rsdb(df, "my_database.csv")

R

library(reticulate)
rsdb_utils <- import("rsdb_utils")

df <- rsdb_utils$read_rsdb("my_database.parquet")

# check_rsdb takes the DataFrame, as in the Python example
df <- rsdb_utils$check_rsdb(df)

# save the database with the errors info
rsdb_utils$write_rsdb(df, "my_database.csv")
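
To give a feel for what a schema-based check that records "errors info" can look like, here is a minimal sketch using only the Python standard library. It is not the implementation of check_rsdb: the function name check_rows, the "errors" field, the field names, and the schema fragment are all hypothetical, and only the "required" and "enum" keywords are handled.

```python
def check_rows(rows, schema):
    """Return copies of the rows with an added "errors" entry listing violations.

    Hypothetical helper: only the "required" and "enum" schema keywords are checked.
    """
    properties = schema.get("properties", {})
    required = schema.get("required", [])
    checked = []
    for row in rows:
        errors = []
        # required fields must be present and non-empty
        for name in required:
            if row.get(name) in (None, ""):
                errors.append(f"{name}: missing required value")
        # non-empty values must belong to the enum, when one is declared
        for name, rule in properties.items():
            value = row.get(name)
            if value in (None, ""):
                continue
            if "enum" in rule and value not in rule["enum"]:
                errors.append(f"{name}: {value!r} not in allowed values")
        checked.append({**row, "errors": "; ".join(errors)})
    return checked


# hypothetical schema fragment and records, for illustration only
example_schema = {
    "required": ["name"],
    "properties": {
        "name": {"type": "string"},
        "ecosystem_type": {"type": "string", "enum": ["marine", "freshwater"]},
    },
}
example_rows = [
    {"name": "Coral bleaching", "ecosystem_type": "marine"},
    {"name": "", "ecosystem_type": "desert"},
]

for row in check_rows(example_rows, example_schema):
    print(row["errors"] or "ok")
```

Keeping the messages in an extra field alongside each record means the checked data can be saved and reviewed later, which is the workflow the examples above follow with write_rsdb.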

Generate the enums lists

Python

from rsdb_utils import generate_enums_dataframe

df = generate_enums_dataframe()
df.to_csv("list_of_enums_from_json_schema.csv")
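
As an illustration of the idea behind this feature, the sketch below walks a JSON schema and lists every enum value as a CSV row, using only the standard library. It is not how generate_enums_dataframe is implemented: the helper names collect_enums and enums_to_csv, the column names, and the schema fragment are hypothetical, and the real RSDB schema may be structured differently.

```python
import csv
import io


def collect_enums(schema, path=""):
    """Recursively yield (property_path, enum_value) pairs from a JSON schema dict."""
    if not isinstance(schema, dict):
        return
    for value in schema.get("enum", []):
        yield path, value
    # descend into object properties, building a dotted path
    for name, sub in schema.get("properties", {}).items():
        yield from collect_enums(sub, f"{path}.{name}" if path else name)
    # array items share the path of the array property
    if "items" in schema:
        yield from collect_enums(schema["items"], path)


def enums_to_csv(schema):
    """Return the enum list as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["property", "value"])
    writer.writerows(collect_enums(schema))
    return buffer.getvalue()


# hypothetical schema fragment, for illustration only
example_schema = {
    "type": "object",
    "properties": {
        "ecosystem_type": {"type": "string", "enum": ["marine", "freshwater"]},
        "drivers": {
            "type": "array",
            "items": {"type": "string", "enum": ["fishing", "climate"]},
        },
    },
}

print(enums_to_csv(example_schema))
```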

Developer section

Python local installation

It's advised to create a virtual environment to install the required packages during development. You can then "install" this package (rsdb-utils) as a local package to make it easier to import within this local virtual environment.

# create a virtual environment
python3 -m venv .venv
# load the virtual environment
source .venv/bin/activate
# install the requirements
pip install -r requirements.txt
# install rsdb-utils as a local package
pip install -e .

R local installation

You can install the package from source with reticulate in R. Follow the instructions in the "From R" installation section above and replace the line reticulate::py_install("rsdb-utils") with:

reticulate::virtualenv_install("rsdb-utils", "/path/to/source/rsdb-utils")

Note that the editable installation ('pip install -e') doesn't work with reticulate, which means that you will need to reinstall the package each time you modify the Python source code.

Tests

The tests are run in a GitHub action every time a commit is pushed.

The tests are run with pytest. Install it (within the virtual environment) with:

pip install pytest 

And run the tests:

pytest tests/tests.py

Build

The building and publishing are managed by a GitHub action for each new release. Carefully check the version number in pyproject.toml before creating the new tag and the new release.

Warning: once published on PyPI, a version can't be re-uploaded.

Local build

In the development virtual environment, install:

pip install build twine

Then build the package:

python3 -m build

Configure your PyPI access token and publish the package version:

twine upload dist/rsdb_utils-0.1.tar.gz dist/rsdb_utils-0.1-py3-none-any.whl

How to update the JSON schema

To pull the latest schema into the repository, use the following command:

git submodule update --recursive --remote

You can then create a new version of rsdb-utils: edit the version number in pyproject.toml, commit, push and publish a new release of rsdb-utils.

Romain THOMAS 2024
Stockholm Resilience Centre
