Utils to process the Regime Shifts DataBase CSV and parquet files
RSDB utils
Functionalities:
- Import the database from a CSV or parquet file into a pandas DataFrame
- Save the database from a pandas DataFrame to a CSV or parquet file
- Check the database (a DataFrame) against the JSON schema of the Regime Shifts DataBase
- From the JSON schema, generate a CSV with the list of enum values (tests/list_of_enums_from_json_schema)
Code repository: https://github.com/regimeshifts/rsdb-utils
Licence: GNU General Public License v3 (GPLv3)
Installation
From Python
In a terminal:
pip install rsdb-utils
From R
In the R terminal:
# use reticulate
library(reticulate)
# install the rsdb-utils library
reticulate::py_install("rsdb-utils")
Usage
Read a database file
Read a database file to import it in a pandas DataFrame:
Python
from rsdb_utils import read_rsdb
df = read_rsdb("my_database.csv")
R
library(reticulate)
rsdb_utils <- import("rsdb_utils")
df <- rsdb_utils$read_rsdb("my_database.csv")
Read and write a database file
Read a parquet database file into a pandas DataFrame and save it as CSV:
Python
from rsdb_utils import read_rsdb, write_rsdb
df = read_rsdb("my_database.parquet")
write_rsdb(df, "my_database.csv")
R
library(reticulate)
rsdb_utils <- import("rsdb_utils")
df <- rsdb_utils$read_rsdb("my_database.parquet")
rsdb_utils$write_rsdb(df, "my_database.csv")
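The examples above pass only a file name, which suggests that the format is chosen from the file extension. As a rough illustration of that idea (this is not the actual rsdb-utils code; the helper `detect_format` is hypothetical and the extension-based dispatch is an assumption), the selection could be sketched in Python as:

```python
from pathlib import Path

# Hypothetical mapping from file extension to storage format.
# This is an assumption for illustration, not taken from rsdb-utils.
SUPPORTED_FORMATS = {".csv": "csv", ".parquet": "parquet"}


def detect_format(filename: str) -> str:
    """Return "csv" or "parquet" based on the file extension."""
    suffix = Path(filename).suffix.lower()
    try:
        return SUPPORTED_FORMATS[suffix]
    except KeyError:
        raise ValueError(f"Unsupported file extension: {suffix!r}") from None


print(detect_format("my_database.parquet"))
print(detect_format("my_database.csv"))
```

Rejecting unknown extensions early gives a clearer error than letting a parser fail on an unexpected file.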
Check a database
Check a database based on the JSON schema of the Regime Shifts Database:
Python
from rsdb_utils import read_rsdb, write_rsdb, check_rsdb
df = read_rsdb("my_database.parquet")
df = check_rsdb(df)
# save the database with the errors info
write_rsdb(df, "my_database.csv")
R
library(reticulate)
rsdb_utils <- import("rsdb_utils")
df <- rsdb_utils$read_rsdb("my_database.csv")
df <- rsdb_utils$check_rsdb(df)
# save the database with the errors info
rsdb_utils$write_rsdb(df, "my_database.csv")
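The checks themselves are defined by the JSON schema of the Regime Shifts Database. To illustrate the general idea of validating records against a schema, here is a minimal Python sketch using only the standard library. The schema fragment, the field names and the helper `find_errors` are hypothetical, and the real check works on a DataFrame rather than on plain dictionaries:

```python
# Hypothetical schema fragment, for illustration only; the real Regime Shifts
# DataBase schema is not reproduced here.
SCHEMA = {
    "required": ["name", "ecosystem"],
    "properties": {
        "name": {"type": "string"},
        "ecosystem": {"type": "string", "enum": ["marine", "freshwater"]},
    },
}


def find_errors(record: dict, schema: dict) -> list[str]:
    """Return human-readable error messages for one record."""
    errors = []
    # Report required fields that are missing or empty.
    for field in schema.get("required", []):
        if record.get(field) in (None, ""):
            errors.append(f"{field}: missing required value")
    # Report values that are not in the allowed enum list.
    for field, rules in schema.get("properties", {}).items():
        allowed = rules.get("enum")
        value = record.get(field)
        if allowed is not None and value not in (None, "") and value not in allowed:
            errors.append(f"{field}: {value!r} is not an allowed value")
    return errors


print(find_errors({"name": "Kelp collapse", "ecosystem": "forest"}, SCHEMA))
```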
Generate the enums lists
Python
from rsdb_utils import generate_enums_dataframe
df = generate_enums_dataframe()
df.to_csv("list_of_enums_from_json_schema.csv")
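To show what a "list of enum values" can look like, the following Python sketch walks a JSON schema and collects every `enum` it finds. The schema fragment and the helper `collect_enums` are hypothetical and do not reproduce how generate_enums_dataframe is implemented; only the standard library is used:

```python
import csv
import io

# Hypothetical schema fragment, for illustration only.
SCHEMA = {
    "properties": {
        "ecosystem": {"enum": ["marine", "freshwater"]},
        "drivers": {"type": "array", "items": {"enum": ["fishing", "climate"]}},
    }
}


def collect_enums(node, path=""):
    """Yield (path, value) pairs for every enum value found in the schema."""
    if isinstance(node, dict):
        for value in node.get("enum", []):
            yield path, value
        for key, child in node.items():
            if key != "enum":
                yield from collect_enums(child, f"{path}/{key}" if path else key)
    elif isinstance(node, list):
        for child in node:
            yield from collect_enums(child, path)


rows = list(collect_enums(SCHEMA))

# Write the rows as CSV text (an in-memory buffer stands in for a file).
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["path", "value"])
writer.writerows(rows)
print(buffer.getvalue())
```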
Developer section
Python local installation
It's advised to create a virtual environment to install the required packages during development. You can then "install" this package (rsdb-utils) as a local package to make it easier to import within this local virtual environment.
# create a virtual environment
python3 -m venv .venv
# load the virtual environment
source .venv/bin/activate
# install the requirements
pip install -r requirements.txt
# install rsdb-utils as a local package
pip install -e .
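After installing, it can be useful to confirm which version the active environment sees. A small sketch using the standard library's importlib.metadata (the helper name `installed_version` is hypothetical):

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution: str):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Prints the version of rsdb-utils, or None outside the virtual environment.
print(installed_version("rsdb-utils"))
```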
R local installation
You can install the package from source with reticulate in R.
Follow the instructions in "Installation, From R" above,
replacing the line
reticulate::py_install("rsdb-utils")
with:
reticulate::virtualenv_install("rsdb-utils", "/path/to/source/rsdb-utils")
Note that the editable installation ('pip install -e') doesn't work with reticulate, which means that you need to reinstall the package each time you modify the Python source code.
Tests
The tests are run in a GitHub action every time a commit is pushed.
The tests are run with pytest. Install it (within the virtual environment) with:
pip install pytest
And run the tests:
pytest tests/tests.py
Build
The building and publishing are managed by a GitHub action for each new release.
Carefully check the version number in pyproject.toml before creating the new tag and the new release.
Warning: once published on PyPI, a version can't be re-uploaded.
Local build
In the development virtual environment, install:
pip install build twine
Then build the package:
python3 -m build
Configure your PyPI access token and publish the package version:
twine upload dist/rsdb_utils-0.1.tar.gz dist/rsdb_utils-0.1-py3-none-any.whl
How to update the JSON schema
To pull the latest schema into the repository, use the following command:
git submodule update --recursive --remote
You can then create a new version of rsdb-utils: edit the version number in pyproject.toml, commit, push and publish a new release of rsdb-utils.
Romain THOMAS 2024
Stockholm Resilience Centre