
A novel and simple framework based on prevalent deep learning frameworks and other image processing libraries. v0.4.19: mutes redundant INFO logging in multi-process mode.

Project description

Nebulae Brochure

A novel and simple framework built on mainstream deep learning frameworks and other image processing libraries. Almost every module can be deployed independently.


Modules Overview

Fuel: easily manage and read the datasets you need, anytime

Toolkit: includes many utilities for better support of nebulae

Astrobase: build network components and assemble them into models


Fuel

FuelGenerator()

Build a FuelGenerator to store data space-efficiently.

  • config: [dict] A dictionary containing all parameters.

  • file_dir: [str] Where your raw data is.

  • file_list: [str] A csv file in which all raw data file names and labels are listed.

  • dtype: [list of str] A list of data types of all columns but the first one in file_list. Valid data types are 'uint8', 'uint16', 'uint32', 'int8', 'int16', 'int32', 'int64', 'float16', 'float32', 'float64', 'str'. Plus, if you prepend a 'v' as the initial character, e.g. 'vuint8', each row in this column is allowed to be saved with variable length.

  • is_seq: [bool] Whether the data is sequential, e.g. video frames. Defaults to False.

An example of file_list.csv is as follows. 'image' and 'label' are the key names of data and labels respectively. Note that the image name is a path relative to file_dir.

image,label
img_1.jpg,2
img_2.jpg,0
...
img_100.jpg,5
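A file_list.csv of this shape can be produced with Python's standard csv module; the file names and labels below are illustrative:

```python
import csv

# Hypothetical sample entries: (image path relative to file_dir, label)
rows = [("img_1.jpg", 2), ("img_2.jpg", 0), ("img_100.jpg", 5)]

with open("file_list.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["image", "label"])  # key names for data and labels
    writer.writerows(rows)
```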

FuelGenerator.generate(dst_path, height, width, channel=3, encode='JPEG', shards=1, keep_exif=True)

  • dst_path: [str] An hdf5/npz file where you want to save the data.
  • height: [int] range between (0, +∞). The height of image data.
  • width: [int] range between (0, +∞). The width of image data.
  • channel: [int] The number of channels of image data. Defaults to 3.
  • encode: [str] The means by which image data is encoded. Valid encoders are 'JPEG' and 'PNG'; 'PNG' is lossless. Defaults to 'JPEG'.
  • shards: [int] How many files you need to split the data into. Defaults to 1.
  • keep_exif: [bool] Whether to keep EXIF information of photos. Defaults to true.
import nebulae
# create a data generator
fg = nebulae.fuel.FuelGenerator(file_dir='/home/file_dir',
                                file_list='file_list.csv',
                                dtype=['vuint8', 'int8'])
# generate compressed data file
fg.generate(dst_path='/home/data/fuel.hdf5',
            channel=3,
            height=224,
            width=224)

FuelGenerator.modify(config=None)

You can edit properties again to generate other files.

fg.modify(height=200, width=200)

Passing a dictionary of changed parameters is equivalent.

config = {'height': 200, 'width': 200}
fg.modify(config=config)

FuelDepot()

Build a FuelDepot on which you can mount datasets.

import nebulae
# create a data depot
fd = nebulae.fuel.FuelDepot()

FuelDepot.load(config, name, batch_size, data_path, data_key, height=0, width=0, channel=3, frame=0, is_encoded=True, if_shuffle=True, rescale=True, resol_ratio=1, complete_last_batch=True, spatial_aug='', p_sa=(0,), theta_sa=(0,), temporal_aug='', p_ta=(0,), theta_ta=(0,))

Mount dataset on your FuelDepot.

  • name: [str] Name of your dataset.
  • batch_size: [int] The size of a mini-batch.
  • data_path: [str] The full path of your data file. It must be a hdf5/npz file.
  • data_key: [str] The key name of data.
  • if_shuffle: [bool] Whether to shuffle data samples every epoch. Defaults to True.
  • is_encoded: [bool] If the stored data has been compressed. Defaults to True.
  • channel: [int] The number of channels of image data. Defaults to 3.
  • height: [int] range between (0, +∞). Height of image data. Defaults to 0.
  • width: [int] range between (0, +∞). Width of image data. Defaults to 0.
  • frame: [int] range between [-1, +∞). The unified number of frames for sequential data. Defaults to 0.
  • rescale: [bool] Whether to rescale values of fetched data to [-1, 1]. Defaults to True.
  • resol_ratio: [float] range between (0, 1] The coefficient of subsampling for lowering image data resolution. Set it as 0.5 to carry out 1/2 subsampling. Defaults to 1.
  • complete_last_batch: [bool] Whether to complete the last batch so that it has samples as many as other batches. Defaults to True.
  • spatial_aug: [comma-separated str] Put the spatial data augmentations you want in a string with comma as separator. Valid augmentations include 'flip', 'crop', 'brightness', 'gamma_contrast' and 'log_contrast', e.g. 'flip,brightness'. Defaults to '', which means no augmentation.
  • p_sa: [tuple of float] range between [0, 1]. The probabilities of taking spatial data augmentations according to the order in spatial_aug. Defaults to (0,).
  • theta_sa: [tuple] The parameters of spatial data augmentations according to the order in spatial_aug. Defaults to (0,).
  • temporal_aug: [comma-separated str] Put the temporal data augmentations you want in a string with comma as separator. Valid augmentations include 'sample'. Make sure to set is_seq as True if you want to enable temporal augmentation. Defaults to '', which means no augmentation.
  • p_ta: [tuple of float] range between [0, 1]. The probabilities of taking temporal data augmentations according to the order in temporal_aug. Defaults to (0,).
  • theta_ta: [tuple] The parameters of temporal data augmentations according to the order in temporal_aug. Defaults to (0,).
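As a rough illustration of what resol_ratio does (plain NumPy, not nebulae's implementation): a ratio of 0.5 keeps every second pixel along each spatial axis, halving the resolution.

```python
import numpy as np

def subsample(img, resol_ratio):
    """Naive nearest-neighbour subsampling: keep every (1/ratio)-th pixel."""
    step = int(round(1 / resol_ratio))
    return img[::step, ::step]

img = np.zeros((224, 224, 3), dtype=np.uint8)
low = subsample(img, 0.5)  # -> shape (112, 112, 3)
```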

All data augmentation approaches are listed as follows:

Data Source: Image

  • flip: empty tuple ()
  • crop: nested tuple of float: ((minimum area ratio, maximum area ratio), (minimum aspect ratio, maximum aspect ratio)) of the cropped area, where aspect ratio is width/height
  • brightness: float, range between (0, 1]: increment/decrement factor on brightness
  • gamma_contrast: float, range between (0, 1]: expansion/shrinkage factor on pixel value domain
  • log_contrast: float, range between (0, 1]: expansion/shrinkage factor on pixel value domain

Data Source: Sequence

  • sampling: positive int, denoted as theta: sample an image every theta frames

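The nested ranges for crop can be read as follows (an illustrative reimplementation, not nebulae's actual code): sample an area ratio and an aspect ratio from their ranges, then derive the crop size from them.

```python
import math
import random

def crop_size(h, w, area_range, aspect_range):
    """Sample a crop (ch, cw) whose relative area and width/height
    ratio fall in the given ranges (illustrative only)."""
    area = random.uniform(*area_range) * h * w
    aspect = random.uniform(*aspect_range)  # aspect = width / height
    cw = int(round(math.sqrt(area * aspect)))
    ch = int(round(math.sqrt(area / aspect)))
    return min(ch, h), min(cw, w)

ch, cw = crop_size(224, 224, (0.5, 1.0), (0.75, 1.33))
```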
fd.load(name='test-img',
        batch_size=4,
        data_key='image',
        data_path='/home/image.hdf5',
        width=200, height=200,
        resol_ratio=0.5,
        spatial_aug='brightness,gamma_contrast',
        p_sa=(0.5, 0.5), theta_sa=(0.2, 1.2))
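rescale=True maps fetched pixel values into [-1, 1]; for uint8 images the mapping presumably amounts to the following (a NumPy sketch, not nebulae's code):

```python
import numpy as np

def rescale_to_unit(img):
    # Map uint8 pixels [0, 255] linearly onto [-1.0, 1.0].
    return img.astype(np.float32) / 127.5 - 1.0

batch = np.array([[0, 128, 255]], dtype=np.uint8)
scaled = rescale_to_unit(batch)  # 0 -> -1.0, 255 -> 1.0
```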

FuelDepot.modify(tank, config=None)

  • tank: [str] Specify the dataset to modify.

You can edit properties to change how batches are fetched and processed.

fd.modify(tank='test-img', name='test', batch_size=2)

Passing a dictionary of changed parameters is equivalent.

config = {'name':'test', 'batch_size':2}
fd.modify(tank='test-img', config=config)

FuelDepot.unload(tank='')

  • tank: [str] Specify the dataset to unmount. Defaults to '', in which case all datasets are unmounted.

Unmount datasets that are no longer needed.

FuelDepot.next(tank)

  • tank: [str] Specify the dataset from which data is fetched.

Return a dictionary containing a batch of data, labels and other information.

FuelDepot.epoch

Attribute: a dictionary containing current epoch of each dataset. Epoch starts from 1.

FuelDepot.MPE

Attribute: a dictionary containing how many iterations there are within an epoch for each dataset.

FuelDepot.volume

Attribute: a dictionary containing the number of data samples in each dataset.
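The three attributes are linked: with complete_last_batch=True, one epoch takes ceil(volume / batch_size) iterations, which is plausibly what MPE reports per dataset. A plain-Python sketch of the bookkeeping (illustrative, not nebulae's internals):

```python
import math

def iterations_per_epoch(volume, batch_size):
    """What FuelDepot.MPE would plausibly report for one dataset:
    the number of mini-batches needed to cover all samples once,
    with the last batch completed by padding."""
    return math.ceil(volume / batch_size)

mpe = iterations_per_epoch(volume=1000, batch_size=4)  # -> 250
```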


Astrobase

Component()

Build a component house in which users can make use of a variety of components and create new ones, either by packing some of them together or from scratch.

OffTheShelf()

Set up a framework within which users can build modules using the core backend. It is especially convenient when you want to port open-source code into nebulae, or when a desired function is difficult to implement otherwise.

import nebulae
import torch
# designate pytorch as core backend
nebulae.Law.CORE = 'pytorch'
# set up a framework
OTS = nebulae.astrobase.OffTheShelf()
# create your own component
class DecisionLayer(OTS):
    def __init__(self, feat_dim, nclass, **kwargs):
        super(DecisionLayer, self).__init__(**kwargs)
        self.feat_dim = feat_dim
        self.linear = torch.nn.Linear(feat_dim, nclass)

    def run(self, x):
        x = x.reshape(-1, self.feat_dim)
        y = self.linear(x)
        return y

COMP = nebulae.astrobase.Component()
# add DecisionLayer to component house
COMP.new('dsl', DecisionLayer, 'x', out_shape=(-1, 128))

N.B. Make sure that '_' is not the first or last character of your argument names.
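The underscore rule above can be checked mechanically; a hypothetical validator (not part of nebulae):

```python
def valid_arg_name(name):
    # Reject names whose first or last character is '_'.
    return bool(name) and not name.startswith("_") and not name.endswith("_")

valid_arg_name("feat_dim")  # True
valid_arg_name("_x")        # False
```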

SpaceDock()

