Pouya's Python routines. A collection of useful Python routines for everyday and professional life.

Project description

Popyrous

(Pouya's Python Routines) A collection of useful and frequently encountered Python routines for (data) science, research, development, and everyday life.

Author: Pouya P. Niaz (pniaz20@ku.edu.tr , pouya.p.niaz@gmail.com)
Version: 0.0.9
Last Update: July 16, 2023

This is a collection of Python routines for the following purposes:

  • Checking for packages and installing missing ones within scripts, without the need for Jupyter and symbols like "!" and "%".
  • Reading and writing .mat files to and from MathWorks MATLAB software.
  • Building and manipulating time series data using sliding windows, low-pass filtering, etc.
  • Building flexible and easy-to-use datasets for data analysis or machine learning out of structured time series experiments (multiple subjects, conditions, repetitions, etc.).
  • Downloading data/files from the internet and Google Drive, using simple functions.
  • Compressing or extracting Zip files with LZMA, etc., using simple functions.

Install with:

pip install popyrous

1- Intro

This package is a collection of routines I have used widely in my scientific, academic, and engineering life. It holds functionality for data and file manipulation, tools for manipulating time series data, and tools for extracting machine-learning-ready time series datasets from the tabular time series data of structured experiments, i.e., experiments performed with multiple subjects, under multiple conditions, with many repetitions, and so forth.

The contents and applications of this package are described briefly below. However, extensive documentation is provided in the docstrings of all functions and classes in the code, which is where you should look for further information.


2- Contents and Submodules

2-1- matlab

This submodule contains functions for reading and writing data to and from .mat files; a short usage sketch follows the list.

  • type_compatible(typ): Determining whether or not a Python data type is compatible for writing into .mat files.
  • save_workspace(filename, masterdict): Save dictionary holding variables and data into .mat file.
  • load_workspace(filename, dictname): Load contents of .mat file into an (existing or new) dictionary.
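
For example, a minimal sketch of a save/load round trip, assuming the two-argument signatures listed above (the exact keyword arguments and return values are described in the docstrings):

import numpy as np
from popyrous.matlab import save_workspace, load_workspace

# Variables to store; the dictionary keys are assumed to become the MATLAB
# variable names (check the save_workspace docstring).
workspace = {"time": np.arange(0, 1, 0.01), "signal": np.random.rand(100)}
save_workspace("experiment.mat", workspace)   # write variables to experiment.mat

# Load the .mat file back; the second argument is assumed to be the target
# (existing or new) dictionary, per the description above.
restored = {}
load_workspace("experiment.mat", restored)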

2-2- packages

This submodule contains functions for checking which packages are installed in the environment, without having to be in a notebook and running commands with ! or %. You can also check a list of required packages (with or without pinned versions) and install any missing or wrong-versioned packages in the process.

  • get_package_list(): Get list (dictionary with keys being packages and values being versions) of packages in the (conda) environment.
  • check_packages(pkglst, install_missing, **kwargs): Get a list of required packages and see if they are all installed, installing the missing ones in the process.

Example:

from popyrous.packages import check_packages
check_packages(["numpy","scipy","pandas==1.5.2"], install_missing=True, reinstall_wrong_versions=True)

2-3- timeseries

This submodule contains some classes and functions for working easily and efficiently with time series data. You can filter data, pass it through a sliding window, extract data for machine/deep learning, and so on. Also, given the dataframe of a structured time series experiment, where multiple subjects repeated an experiment multiple times under various conditions, you can get their data; preprocess, post-process, and filter it; extract sliding windows; and then keep some subjects, conditions, or trials for training and the rest for testing (for data analysis or machine learning), and so forth.

2-3-1- sliding_window

The sliding_window function takes tabular time series data, extracts sliding windows from it, optionally downsamples or inverts them, and returns the result. Sliding windows of time series data are used for time series modeling, prediction, classification, regression, and forecasting problems.
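
As a plain NumPy illustration of the concept (not the popyrous API; see the sliding_window docstring for its actual arguments), extracting windows of length 5 from a single-channel series looks like this:

import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

x = np.arange(10)                      # a toy single-channel time series
windows = sliding_window_view(x, 5)    # shape (6, 5): one row per window
windows_downsampled = windows[::2]     # keep every other window, shape (3, 5)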

2-3-2- datasets

  • TabularDataset: A class for reading time series data from an array, downsampling, preprocessing, and extracting sliding windows from it.
  • make_squeezed_dataset(hparams, inputs, outputs, **kwargs): Gets inputs/outputs, returns squeezed (2D) sliding window dataset ready to be fed to, e.g., an ANN model.
  • make_unsqueezed_dataset(hparams, inputs, outputs, **kwargs): Gets inputs/outputs, returns unsqueezed (3D) sliding window dataset ready to be fed to, e.g., an LSTM model.
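
The two helpers above differ mainly in output shape. A plain NumPy sketch of that difference (the hparams contents and exact keyword arguments are documented in the docstrings):

import numpy as np

n_windows, window_len, n_channels = 100, 20, 3

# Unsqueezed (3D) dataset: (windows, timesteps, channels), e.g. for an LSTM.
unsqueezed = np.zeros((n_windows, window_len, n_channels))

# Squeezed (2D) dataset: each window flattened into one feature vector,
# (windows, timesteps * channels), e.g. for a plain ANN.
squeezed = unsqueezed.reshape(n_windows, window_len * n_channels)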

2-3-3- experiment

  • TimeseriesExperiment: A class that gets a single dataframe containing the time series data of a series of structured experiments where there are multiple subjects, repetitions and trials. The data can then be processed such that data of each trial is separated and processed individually, some subjects, conditions or trials are kept for training/testing, there is preprocessing before extracting sliding windows, and postprocessing after it, and so on. This class comes in handy when the data of such a structured series of experiments needs to be processed and fed to a machine learning model, for instance.
  • generate_cell_array: A function that is a more concise version of the above class, doing everything in one shot and returning everything together.
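
As a rough pandas illustration of the underlying idea (not the TimeseriesExperiment API; its actual arguments are in the docstring), separating a structured experiment table by subject, condition, and trial and holding some subjects out for testing could look like this:

import pandas as pd

# Hypothetical tabular experiment data, one row per time step.
df = pd.DataFrame({
    "subject":   [1, 1, 2, 2, 3, 3],
    "condition": [1, 1, 1, 1, 2, 2],
    "trial":     [1, 1, 1, 1, 1, 1],
    "sensor":    [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
})

test_subjects = [3]                                   # held-out subjects
train_df = df[~df["subject"].isin(test_subjects)]     # training portion
test_df = df[df["subject"].isin(test_subjects)]       # testing portion

# Each (subject, condition, trial) group is one experiment run that would be
# preprocessed, filtered, and windowed individually before being merged.
for (subj, cond, trial), run in train_df.groupby(["subject", "condition", "trial"]):
    pass  # preprocess, filter, extract sliding windows, ...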

2-3-4- filt

Some functions for low-pass filtering time series data; a plain SciPy sketch of both filters appears after the list.

  • butter_lowpass_filter_forward filters input data with a digital Butterworth low-pass filter, given the sampling and cutoff frequencies and the filter order. This filter is causal and only goes forward in time; it does not see the future, so it can be used in real-time implementations. Because it is causal, it induces a phase shift, so the filtered signal is delayed relative to the raw signal; the lower the cutoff frequency, the longer the delay.
  • butter_lowpass_filter_back_to_back filters input data similarly, but uses filtfilt to run forward and backward over the data, so it looks at both the past and the future. It can only smooth data offline, since it needs access to future samples, but unlike the causal filter above, it introduces no phase shift.
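
A plain SciPy sketch of the same idea (not the popyrous signatures; argument names are in the docstrings), contrasting a causal lfilter pass with a zero-phase filtfilt pass using the same Butterworth design:

import numpy as np
from scipy.signal import butter, lfilter, filtfilt

fs, fc, order = 100.0, 5.0, 4                 # sampling freq, cutoff freq, filter order
b, a = butter(order, fc / (fs / 2))           # low-pass design, normalized cutoff in (0, 1)

t = np.arange(0, 2, 1 / fs)
x = np.sin(2 * np.pi * 1 * t) + 0.3 * np.random.randn(t.size)   # noisy 1 Hz signal

y_causal = lfilter(b, a, x)       # forward-only: real-time capable, has phase lag
y_zero_phase = filtfilt(b, a, x)  # forward-backward: offline only, no phase lag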

2-3-5- metrics

Some metrics used for time-series classification, etc.

  • tsc_metrics: Time-series classification metrics, including accuracy, f1 score, concurrency (transitioning on time) and consistency (not changing prediction in consistent non-transitioning portions of the data)

2-3-6- cwt

Continuous Wavelet Transform (CWT)

  • cwt_for_batch: gets a numpy array of shape, e.g., (batchsize, channels, seqlen) [could be any shape, as long as time is the last dimension] and returns an array of its CWT coefficients. Additionally, it can downsample it and remove the last row and column. Returns a (batchsize, channels, coefs, seqlen) dataset of 2D images.
  • cwt_for_tensor: gets a data tensor of any shape and simply performs CWT on it. Takes the last dimension as time, and adds a dimension to the beginning, containing coefficients.
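
The underlying transform is the standard CWT; here is a minimal PyWavelets sketch of it on a 1D signal, just to illustrate the shapes involved (not the exact cwt_for_batch/cwt_for_tensor signatures):

import numpy as np
import pywt

fs = 100.0
t = np.arange(0, 1, 1 / fs)
x = np.sin(2 * np.pi * 5 * t)                      # 5 Hz test signal, 100 samples

scales = np.arange(1, 33)                          # 32 scales -> 32 coefficient rows
coefs, freqs = pywt.cwt(x, scales, "morl", sampling_period=1 / fs)
# coefs has shape (len(scales), len(x)): the coefficient axis is prepended,
# as described for cwt_for_tensor above; stacking such maps per channel yields
# the (batchsize, channels, coefs, seqlen) "image" dataset of cwt_for_batch.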

2-4- web

This submodule contains some web-related functions for downloading files from the internet or Google Drive, storing them, reading their contents, etc.; an example call appears after the list.

  • download_google_drive_file(shareable_link, output_file): Gets the shareable link of a Google Drive file and downloads it.
  • download(url, filename, **kwargs): Downloads material from the internet and either reads its content or stores it in a file.
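
A short example with placeholder link and file names (both hypothetical; additional keyword arguments are listed in the docstrings):

from popyrous.web import download, download_google_drive_file

# Hypothetical URLs and file names; replace with real ones.
download_google_drive_file(
    "https://drive.google.com/file/d/FILE_ID/view?usp=sharing",  # shareable link
    "dataset.zip",                                               # local output file
)
download("https://example.com/data.csv", "data.csv")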

2-5- zipfiles

This submodule contains some functions for compressing and extracting zip files; a short example follows the list.

  • extract_files(fileName): Extracts everything in the zip file.
  • compress_files(file_name, **kwargs): Compresses files into a zip file. Options for compression method, etc. are provided.
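
A short example with hypothetical file names (which files are included and which compression method is used are controlled by keyword arguments; see the docstrings):

from popyrous.zipfiles import compress_files, extract_files

compress_files("archive.zip")   # compress files into a zip archive (options via kwargs)
extract_files("archive.zip")    # extract everything in the zip file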

2-6- ml

This submodule contains some machine-learning-related code. For now, it just contains a function for pretty-plotting confusion matrices (see credits); a usage sketch follows the list.

  • make_confusion_matrix gets a confusion matrix and some parameters, and pretty plots it.
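
A hedged usage sketch, assuming the function is importable from popyrous.ml and takes the confusion matrix as its first positional argument, as in the credited implementation (styling options are keyword arguments):

import numpy as np
from sklearn.metrics import confusion_matrix
from popyrous.ml import make_confusion_matrix

y_true = np.array([0, 0, 1, 1, 1, 0])
y_pred = np.array([0, 1, 1, 1, 0, 0])

cf = confusion_matrix(y_true, y_pred)   # 2x2 matrix of counts
make_confusion_matrix(cf)               # pretty-plot it (styling via kwargs)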

3- License

This package is released under the MIT license.


4- Credits

Pretty plotting confusion matrix:
Dennis T
https://github.com/DTrimarchi10/confusion_matrix
https://medium.com/@dtuk81/confusion-matrix-visualization-fc31e3f30fea
