This library splits large DataFrames into smaller chunks, which can then be passed to multiprocessing.

Project description

# MultiProcessDivision()

This package splits large pd.DataFrame() or pd.Series() objects into chunks suited to multiprocessing. The core aim is to provide a stable interface for splitting vectorized objects along a specified axis.
The class consists of a single function, divide_df(), which takes the following parameters:

  • The data to split. It must be a pd.DataFrame() or pd.Series() object.
  • The axis along which to split. For pd.DataFrame() objects it can be 0 (index) or 1 (columns); for pd.Series() objects it must be 0.
  • If a pd.Series() object is passed to the function, the "series" parameter must be set to True.
  • The "range_setter" parameter is optional. It defaults to None, in which case the number of logical cores on the executing system is used. To limit the number of cores used for processing, set it to a value smaller than the number of logical cores.
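
The splitting behaviour described by the parameters above can be sketched roughly as follows. This is not the package's own code: `split_frame` and its `n_chunks` argument are hypothetical names chosen for illustration, and the sketch relies only on pandas and NumPy. It assumes the chunk count falls back to the number of logical cores when none is given, mirroring the description of "range_setter".

```python
import os

import numpy as np
import pandas as pd


def split_frame(data, axis=0, n_chunks=None):
    """Hypothetical helper: split a DataFrame or Series into roughly equal chunks."""
    # Assumption: fall back to the number of logical cores when no count is given.
    if n_chunks is None:
        n_chunks = os.cpu_count() or 1
    # A Series has a single axis, so only axis 0 makes sense.
    if isinstance(data, pd.Series) and axis != 0:
        raise ValueError("a Series can only be split along axis 0")
    # Split the integer positions along the chosen axis into n_chunks groups.
    positions = np.array_split(np.arange(data.shape[axis]), n_chunks)
    if axis == 0:
        return [data.iloc[p] for p in positions if len(p)]
    return [data.iloc[:, p] for p in positions if len(p)]


df = pd.DataFrame({"a": range(10), "b": range(10, 20)})
row_chunks = split_frame(df, axis=0, n_chunks=4)  # four row-wise chunks
col_chunks = split_frame(df, axis=1, n_chunks=2)  # one chunk per column
```

Chunks produced this way could then be handed to workers, for example with `multiprocessing.Pool.map`, and the results recombined with `pd.concat`.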

## Questions and Feedback

Please don't hesitate to send me feedback if you use the function in your stack. Improvements are warmly welcome.

This version

1.0

Download files

Source Distribution

MultiProcessDivision-1.0.tar.gz (3.1 kB view details)

Uploaded Source

Built Distribution

MultiProcessDivision-1.0-py3-none-any.whl (3.6 kB view details)

Uploaded Python 3

File details

Details for the file MultiProcessDivision-1.0.tar.gz.

File metadata

  • Download URL: MultiProcessDivision-1.0.tar.gz
  • Upload date:
  • Size: 3.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.0.1 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.60.0 CPython/3.6.12

File hashes

Hashes for MultiProcessDivision-1.0.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | d7d59bcb917ddf68f07cf0162105407c132a35f4b8fecc0e44f436a67169c6f0 |
| MD5 | e09c3c5fd80daac9bae56b93eedfd156 |
| BLAKE2b-256 | a6cd3c07958ec75692a490dc1e9395a4a985f4a60d9d7d21fb6e71f6aa310959 |

File details

Details for the file MultiProcessDivision-1.0-py3-none-any.whl.

File metadata

  • Download URL: MultiProcessDivision-1.0-py3-none-any.whl
  • Upload date:
  • Size: 3.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.0.1 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.60.0 CPython/3.6.12

File hashes

Hashes for MultiProcessDivision-1.0-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | b4c9cfbe5f4b61d8dece9ba2fd81c7b115f93a3ce4afb5f7c7e315c5d0e04113 |
| MD5 | 329260b89f1d5a20c84b932c90d153f4 |
| BLAKE2b-256 | b8980056b26990845f35ef4f67a45ebd85017a5ad6db090b55a6300e560cec47 |
