Data preprocessing and augmentation framework designed for crowd counting datasets and independent of any ML/DL framework. Supports a multitude of simple and advanced transformations, outputs and loaders, all of which can be combined using pipelines.
CCAugmentation
Data preprocessing and augmentation framework designed for working with crowd counting datasets. It supports a multitude of simple and advanced transformations and is based on pipelines, which allow a flexible flow of data between loaders, transformations and outputs. It is independent of any deep learning framework, though it works best with PyTorch.
Current capabilities
Each data preprocessing procedure is defined in the form of a pipeline that consists of a data loader and a list of operations to execute sequentially on the data. Each operation is of one of the following types:
- Transformation - Returns transformed data on output, does not have side effects
- Output - Returns unmodified data on output, has side effects such as writing data to files
- Operation - Performs any other function that does not qualify as either of the types above
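The pipeline idea described above can be illustrated with a minimal, library-independent sketch. It is not the CCAugmentation implementation: the class names `SketchPipeline`, `MapTransformation`, `RecordOutput` and `RepeatOperation` are hypothetical, and the two method names only mirror the ones used in the usage example of this README. Samples are assumed to be (image, density map) pairs stored as nested lists.

```python
class MapTransformation:
    """Transformation: returns modified data, no side effects (hypothetical)."""

    def __init__(self, fn):
        self.fn = fn

    def execute(self, samples):
        for image, density_map in samples:
            yield self.fn(image, density_map)


class RecordOutput:
    """Output: passes data through unchanged, records it as a side effect."""

    def __init__(self):
        self.seen = []

    def execute(self, samples):
        for sample in samples:
            self.seen.append(sample)
            yield sample


class RepeatOperation:
    """Generic operation: emits every incoming sample `times` times."""

    def __init__(self, times):
        self.times = times

    def execute(self, samples):
        for sample in samples:
            for _ in range(self.times):
                yield sample


class SketchPipeline:
    """A loader plus a list of operations that are chained lazily."""

    def __init__(self, loader, operations):
        self.loader = loader
        self.operations = operations

    def execute_generate(self):
        stream = iter(self.loader)
        for operation in self.operations:
            stream = operation.execute(stream)
        return stream

    def execute_collect(self):
        samples = list(self.execute_generate())
        images = [image for image, _ in samples]
        density_maps = [density_map for _, density_map in samples]
        return images, density_maps


def mirror(image, density_map):
    # Flip both arrays horizontally so they stay aligned.
    return [row[::-1] for row in image], [row[::-1] for row in density_map]


# A "loader" here is any iterable of (image, density map) pairs.
loader = [([[1, 2], [3, 4]], [[0.0, 1.0], [0.0, 0.0]])]
recorder = RecordOutput()
pipeline = SketchPipeline(
    loader,
    [RepeatOperation(2), MapTransformation(mirror), recorder],
)
images, dms = pipeline.execute_collect()
print(len(images), len(dms))  # prints: 2 2
```

Chaining generators keeps the flow lazy, so samples are only loaded and processed when the result is consumed, either all at once or one by one.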
Available transformations are:
- Crop
- Scale
- Downscale
- Rotate
- StandardizeSize
- Normalize
- NormalizeDensityMap
- FlipLR
- ToGrayscale
- LambdaTransformation
- Cutout
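In crowd counting, a geometric transformation generally has to be applied to the image and its density map together, since the sum of the density map is the crowd count. The sketch below shows this for a horizontal flip and a crop on plain nested lists. It is an illustration only and not how the library's FlipLR or Crop are implemented; the helper names `flip_lr`, `crop` and `count` are hypothetical.

```python
def flip_lr(image, density_map):
    """Mirror image and density map horizontally, keeping them aligned."""
    return [row[::-1] for row in image], [row[::-1] for row in density_map]


def crop(image, density_map, x0, y0, width, height):
    """Cut the same rectangle out of the image and the density map."""
    cropped_image = [row[x0:x0 + width] for row in image[y0:y0 + height]]
    cropped_map = [row[x0:x0 + width] for row in density_map[y0:y0 + height]]
    return cropped_image, cropped_map


def count(density_map):
    """Crowd count represented by a density map (sum of all its values)."""
    return sum(sum(row) for row in density_map)


image = [[10, 20, 30], [40, 50, 60]]
density_map = [[0.0, 0.5, 0.5], [1.0, 0.0, 0.0]]

flipped_image, flipped_map = flip_lr(image, density_map)
# Flipping moves people around but does not change how many there are.
assert count(flipped_map) == count(density_map)

cropped_image, cropped_map = crop(image, density_map, 0, 0, 2, 2)
# Cropping keeps only the part of the count that lies inside the rectangle.
print(count(density_map), count(cropped_map))  # prints: 2.0 1.5
```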
Available outputs are:
- Demonstrate
- SaveImagesToFiles
- SaveImagesToBinaryFile
- SaveDensityMapsToCSVFiles
- SaveDensityMapsToBinaryFile
Available operations are:
- Duplicate
- Dropout
- RandomArgs
- OptimizeBatch
Available loaders are:
- BasicImageFileLoader
- ImageFileLoader
- BasicGTPointsMatFileLoader
- GTPointsMatFileLoader
- BasicDensityMapCSVFileLoader
- DensityMapCSVFileLoader
- VariableLoader
- ConcatenatingLoader
- CombinedLoader
For more information about a specific component, please refer to the related comments in the code.
How to use
Loading data from the ShanghaiTech dataset and taking crops of 1/4 size:
import CCAugmentation as cca
import CCAugmentation.transformations as ccat

# Pipeline: ShanghaiTech loader followed by a single Crop transformation
train_data_pipeline = cca.Pipeline(
    cca.examples.loading.SHHLoader("/data/ShanghaiTech/", "train", "B"),
    [
        ccat.Crop(None, None, 1/4, 1/4)
    ]
)

# Run the whole pipeline and collect the results
train_img, train_dm = train_data_pipeline.execute_collect()
# you can also use execute_generate() to create a generator

print(len(train_img), len(train_dm))
For more usage examples, please see our experiment environment repository.
Hashes for ccaugmentation-0.1.0-py2.py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 83bcda4455d3dffdb805d0eb891c4b41f349aa02d7acc48441a247a615af8185
MD5 | 06a7b656b5df9f2fc8ba8801093cc085
BLAKE2b-256 | 2bc884ee56cb4327c3434fb71eaca9c34d3ae7f0ceb1866ad66730d4e3477dfa