nebulae

A novel and simple framework based on prevalent DL frameworks and other image processing libraries. v0.1.7: users can generate an hdf5 dataset as several files, since generating a large dataset at once is risky. In addition, a mergeFuel function is provided for merging multiple hdf5 files, and users can remove EXIF data from images while generating the data file by setting keep_exif to False.

# nebulae
### A novel and simple framework based on TensorFlow (tf) and other image processing libraries.
## Modules Overview
### Fuel: easily manage and read the datasets you need at any time
### Toolkit: includes many utilities that support nebulae
---
## Toolkit
### Build a FuelGenerator to store data space-efficiently.
- config: <**dict**> A dictionary containing all parameters.
- file_dir: <**str**> The directory where your raw data is stored.
- file_list: <**str**> A csv file that lists the file names and labels of all raw data.
- dst_path: <**str**> An hdf5/npz file where you want to save the compressed data.
- dtype: <**list** of **str**> A list of data types of all columns but the first one in *file_list*. Valid data types are 'uint8', 'uint16', 'uint32', 'int8', 'int16', 'int32', 'int64', 'float16', 'float32', 'float64', 'str'.
- height: <**int**, range between **(0, +∞)**> The height of image data. Defaults to 224.
- width: <**int**, range between **(0, +∞)**> The width of image data. Defaults to 224.
- channel: <**int**> The number of channels of image data. Defaults to 1.
- encode: <**str**> The means by which image data is encoded. 'PNG' encodes without information loss. Defaults to 'JPEG'.
### An example of file_list.csv is as follows. 'image' and 'label' are the key names of data and labels respectively.
image|label
:----|:---:
img_1.jpg|2
img_2.jpg|0
...|...
img_9.jpg|5
```
import nebulae

fg = nebulae.toolkit.FuelGenerator(file_dir='/file_dir',
                                   file_list='file_list.csv',
                                   dst_path='/file_dir/dst_path.hdf5',
                                   dtype=['uint8', 'int8'],
                                   channel=3,
                                   height=224,
                                   width=224,
                                   encode='jpeg')
```
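A file list like the one in the table above can be written with the Python standard library. This is a minimal sketch and not part of nebulae: the helper name `write_file_list`, the comma delimiter, and the sample entries are assumptions made for illustration, so check which delimiter your nebulae version expects.

```python
import csv
import os
import tempfile


def write_file_list(rows, csv_path, keys=('image', 'label')):
    """Write (file name, label) pairs to a csv file with a header row.

    Hypothetical helper; the comma delimiter is an assumption.
    """
    with open(csv_path, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(keys)    # header: key names of data and labels
        writer.writerows(rows)   # one row per raw data file
    return csv_path


# Made-up sample entries mirroring the table above.
samples = [('img_1.jpg', 2), ('img_2.jpg', 0), ('img_9.jpg', 5)]
csv_path = write_file_list(samples,
                           os.path.join(tempfile.mkdtemp(), 'file_list.csv'))

# Read the file back to check what was written.
with open(csv_path, newline='') as f:
    lines = list(csv.reader(f))
print(lines)
```

Note that the csv module reads every field back as a string, so the labels come back as '2', '0' and '5'.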
### Call generateFuel() to generate compressed data file.
```
fg.generateFuel()
```
### You can edit properties again to generate another file.
```
fg.propertyEdit(height=200, width=200)
fg.generateFuel()
```
### Passing a dictionary of changed parameters is equivalent.
```
config = {'height': 200, 'width': 200}
fg.propertyEdit(config=config)
fg.generateFuel()
```
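One way to picture why the keyword form and the dictionary form are equivalent is to merge both into a single update. The class below is a hypothetical stand-in written for illustration, not nebulae's FuelGenerator; only the parameter names and default values come from the parameter list above.

```python
class PropertyHolder:
    """Hypothetical stand-in showing keyword/dict equivalence."""

    def __init__(self, **kwargs):
        # Defaults taken from the documented parameter list.
        self.param = {'height': 224, 'width': 224,
                      'channel': 1, 'encode': 'JPEG'}
        self.propertyEdit(**kwargs)

    def propertyEdit(self, config=None, **kwargs):
        # Treat a config dict and keyword arguments as the same update.
        updates = dict(config or {})
        updates.update(kwargs)
        for key, value in updates.items():
            if key not in self.param:
                raise KeyError('unknown property: %s' % key)
            self.param[key] = value


a = PropertyHolder(channel=3)
a.propertyEdit(height=200, width=200)

b = PropertyHolder(channel=3)
b.propertyEdit(config={'height': 200, 'width': 200})

print(a.param == b.param)
```

Rejecting unknown keys is a design choice of this sketch; it catches misspelled property names early.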
---
## Fuel
### Build a FuelDepot that allows you to deposit datasets.
```
fd = nebulae.fuel.FuelDepot()
```
### Call loadFuel() to mount a dataset on your FuelDepot.
- name: <**str**> Name of your dataset.
- batch_size: <**int**> The size of mini-batch.
- data_path: <**str**> The full path of your data file. It must be an hdf5/npz file.
- key_data: <**str**> The key name of data.
- if_shuffle: <**bool**> Whether to shuffle data samples every epoch. Defaults to True.
- height: <**int**, range between **(0, +∞)**> Height of image data. Defaults to 0.
- width: <**int**, range between **(0, +∞)**> Width of image data. Defaults to 0.
- resol_ratio: <**float**, range between **(0, 1]**> The subsampling coefficient used to lower the resolution of image data. Set it to 0.5 to carry out 1/2 subsampling. Defaults to 1.
- is_seq: <**bool**> Declare whether this dataset is sequential. Defaults to False.
- spatial_aug: <comma-separated **str**> Put spatial data augmentations you want in a string with comma as separator. Valid augmentations include 'flip', 'brightness', 'gamma_contrast' and 'log_contrast', e.g. 'flip,brightness'. Defaults to '' which means no augmentation.
- p_sa: <**tuple** of **float**, range between **[0, 1]**> The probabilities of taking spatial data augmentations according to the order in *spatial_aug*. Defaults to (0).
- theta_sa: <**tuple**> The parameters of spatial data augmentations according to the order in *spatial_aug*. Defaults to (0).
- temporal_aug: <comma-separated **str**> Put temporal data augmentations you want in a string with comma as separator. Valid augmentations include 'sample', e.g. 'sample'. Make sure to set *is_seq* as True if you want to enable temporal augmentation. Defaults to '' which means no augmentation.
- p_ta: <**tuple** of **float**, range between **[0, 1]**> The probabilities of taking temporal data augmentations according to the order in *temporal_aug*. Defaults to (0).
- theta_ta: <**tuple**> The parameters of temporal data augmentations according to the order in *temporal_aug*. Defaults to (0).
```
fd.loadFuel(name='test-img',
            batch_size=4,
            key_data='image',
            data_path='/Users/Seria/Desktop/nebulae/test/img/image.hdf5',
            width=200, height=200,
            resol_ratio=0.5,
            spatial_aug='brightness,gamma_contrast',
            p_sa=(0.5, 0.5), theta_sa=(0.2, 1.2))
```
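The i-th entries of p_sa and theta_sa belong to the i-th name in spatial_aug. The sketch below illustrates that pairing in plain Python. `parse_augmentations` is a hypothetical helper, not a nebulae function, and its validation rules are assumptions drawn from the parameter descriptions above.

```python
VALID_SPATIAL = ('flip', 'brightness', 'gamma_contrast', 'log_contrast')


def parse_augmentations(aug, p, theta, valid=VALID_SPATIAL):
    """Pair each augmentation name with its probability and parameter."""
    names = [n.strip() for n in aug.split(',') if n.strip()]
    if not names:
        return []  # '' means no augmentation
    if not (len(names) == len(p) == len(theta)):
        raise ValueError('aug, p and theta must have the same length')
    for name, prob in zip(names, p):
        if name not in valid:
            raise ValueError('invalid augmentation: %s' % name)
        if not 0 <= prob <= 1:
            raise ValueError('probability out of [0, 1]: %r' % prob)
    return list(zip(names, p, theta))


plan = parse_augmentations('brightness,gamma_contrast',
                           (0.5, 0.5), (0.2, 1.2))
print(plan)
# [('brightness', 0.5, 0.2), ('gamma_contrast', 0.5, 1.2)]
```

With the values from the loadFuel() call above, 'brightness' is paired with probability 0.5 and parameter 0.2, and 'gamma_contrast' with probability 0.5 and parameter 1.2.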
### You can edit properties to change the way you fetch batch and process data.
```
fd.propertyEdit(dataname='test-img', name='test', batch_size=2)
```
### Passing a dictionary of changed parameters is equivalent.
```
config = {'name':'test', 'batch_size':2}
fd.propertyEdit(dataname='test-img', config=config)
```
### Here are three useful functions:
### stepsPerEpoch() returns how many steps it takes to iterate over all data.
### currentEpoch() returns the epoch you are currently in.
### nextBatch() returns a dictionary containing a batch of data, labels and other information.
### **N.B.** You usually do not need to call these functions explicitly unless you use FuelDepot as an independent tool.
```
for s in range(fd.stepsPerEpoch('test')):
    batch = fd.nextBatch('test')
    print(fd.currentEpoch('test'), batch['label'])
```
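To make the relation between the three functions concrete, here is a toy batch iterator. It is a sketch under stated assumptions, not nebulae's implementation: the class `ToyDepot` is hypothetical, it takes ceil(samples / batch_size) steps per epoch, it serves a shorter final batch, and it counts epochs from 0. FuelDepot may differ on all three points.

```python
import math


class ToyDepot:
    """Hypothetical single-dataset imitation of the FuelDepot batch loop."""

    def __init__(self, labels, batch_size):
        self.labels = list(labels)
        self.batch_size = batch_size
        self.epoch = 0     # assumption: epochs are counted from 0
        self.cursor = 0    # index of the next sample to serve

    def stepsPerEpoch(self):
        # Assumption: the last, possibly smaller, batch counts as a step.
        return math.ceil(len(self.labels) / self.batch_size)

    def currentEpoch(self):
        return self.epoch

    def nextBatch(self):
        if self.cursor >= len(self.labels):
            # The previous epoch is exhausted: start a new one.
            self.epoch += 1
            self.cursor = 0
        batch = self.labels[self.cursor:self.cursor + self.batch_size]
        self.cursor += self.batch_size
        return {'label': batch}


depot = ToyDepot(labels=[2, 0, 1, 4, 5], batch_size=2)
seen = []
for s in range(depot.stepsPerEpoch()):
    batch = depot.nextBatch()
    seen.append((depot.currentEpoch(), batch['label']))
print(seen)
# [(0, [2, 0]), (0, [1, 4]), (0, [5])]
```

Five samples with a batch size of 2 give three steps, and one more call to nextBatch() moves this toy into epoch 1.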
### Call unloadFuel(dataname) to unmount the dataset named "dataname" from your FuelDepot.
```
fd.unloadFuel(name='test')
```

Download files

Source Distribution

nebulae-0.1.7.tar.gz (27.7 kB)

Uploaded Jan 15, 2019 Source

Built Distribution

nebulae-0.1.7-py3-none-any.whl (41.6 kB)

Uploaded Jan 15, 2019 Python 3

Hashes for nebulae-0.1.7.tar.gz

Algorithm|Hash digest
:----|:----
SHA256|`857ea9707a1b591aeb63f9f0ce7ddcd45b1cc7ebb2e6b0d918bf64f2a469d88f`
MD5|`9ebb31d5f294e4b630c890863b4b0c96`
BLAKE2b-256|`37ac86068905a86b0084050f45decb0530ce09d185e5fa539d6adcb4a32ea478`

Hashes for nebulae-0.1.7-py3-none-any.whl

Algorithm|Hash digest
:----|:----
SHA256|`47f544960997c3299f59d48fe1aaa4cdf25c54f0c7c175b5375569bd4e16fba6`
MD5|`ccf644dcdd59c896ac01cc07e55a420d`
BLAKE2b-256|`b701f94e5a7b4d81f0004aff55454a477d32a0b3b3eac32e3fe3e9bbb0e5fb16`
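
A downloaded file can be checked against the digests listed above with Python's hashlib. This is a minimal sketch: the helper names `file_digest_hex` and `verify` and the local file path in the usage comment are assumptions for illustration. BLAKE2b-256 corresponds to `hashlib.blake2b` with `digest_size=32`.

```python
import hashlib


def file_digest_hex(path, algorithm='sha256', chunk_size=1 << 16):
    """Return the hex digest of a file, read in chunks to bound memory use."""
    if algorithm == 'blake2b-256':
        h = hashlib.blake2b(digest_size=32)
    else:
        h = hashlib.new(algorithm)  # e.g. 'sha256' or 'md5'
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            h.update(chunk)
    return h.hexdigest()


def verify(path, expected_hex, algorithm='sha256'):
    """Compare a file's digest with a published value (case-insensitive)."""
    return file_digest_hex(path, algorithm) == expected_hex.lower()


# Usage (assumes the archive was downloaded to the working directory):
# verify('nebulae-0.1.7.tar.gz',
#        '857ea9707a1b591aeb63f9f0ce7ddcd45b1cc7ebb2e6b0d918bf64f2a469d88f')
```

Prefer SHA256 or BLAKE2b-256 for verification, since MD5 is not collision-resistant.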