
DoF - Deep Model Core Output Framework

What is this?

DoF is a highly scalable dataset format that helps deep learning scientists work with third-party and/or sensitive data. DoF provides fast dataset sharing and data security at the same time.

How does it work?

The basic workflow consists of two parts.

Create DoF dataset

  • Collect or find an original dataset.
  • Apply the needed data augmentation and preprocessing.
  • Choose a pre-trained model to use. (Most use cases have established best practices by now.)
  • Replace the final classifier or quantifier part of the architecture with empty layer(s).
  • Forward the whole dataset through the pre-trained model core.
  • Save and publish the outputs in a DoF file.
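The creation steps above can be sketched as follows. This is an illustrative toy, not the DoF API: a fixed random linear layer stands in for a real pre-trained model core with its classifier removed, and an in-memory `.npz` archive stands in for an actual DoF file.

```python
import io
import numpy as np

rng = np.random.default_rng(0)

def frozen_core(batch, weights):
    # One fixed linear layer + ReLU, standing in for the frozen
    # pre-trained model core with its classifier removed.
    return np.maximum(batch @ weights, 0.0)

# "Original" dataset after augmentation/preprocessing:
# 100 samples with 64 raw features each (shapes are illustrative).
raw_data = rng.normal(size=(100, 64))
core_weights = rng.normal(size=(64, 32))  # frozen, never trained

# Forward the whole dataset through the core exactly once.
features = frozen_core(raw_data, core_weights)

# Persist the core outputs; a real DoF file would also record metadata
# about the preprocessing and the model core that produced them.
buffer = io.BytesIO()
np.savez(buffer, features=features)
```

From this point on, the raw inputs are no longer needed: only the 100 x 32 output matrix (plus metadata) is shared.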

Use DoF dataset

  • Load the DoF dataset.
  • Build your own classifier module.
  • Train your classifier module.
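A minimal sketch of the usage phase, again illustrative rather than the real DoF API: the pre-computed core outputs are treated as the training input, and only a small logistic-regression head is trained on top of them.

```python
import numpy as np

rng = np.random.default_rng(1)

# Pre-computed model core outputs, as they would be loaded from a DoF
# dataset (random values here; shapes and labels are illustrative).
features = rng.normal(size=(100, 32))
labels = (features[:, 0] > 0).astype(float)  # toy binary labels

# Small custom classifier head: a single logistic-regression layer.
w = np.zeros(32)
for _ in range(200):
    probs = 1.0 / (1.0 + np.exp(-(features @ w)))
    grad = features.T @ (probs - labels) / len(labels)
    w -= 0.5 * grad  # only the head is trained; the core is never re-run

accuracy = ((features @ w > 0) == (labels > 0.5)).mean()
```

Note that the frozen core never appears in the training loop: every epoch touches only the small head, which is the source of the speedup described below.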

Dataflow

The most significant difference between a normal training process and training with DoF is that the characteristics of the raw input data, the data augmentation, and the preprocessing must be recorded separately for later use. This additional metadata exists at both a common (dataset-wide) level and a unique (per-sample) level.

Advantages of DoF

Secure and Private

Using only a large number of fully connected nodes makes it hard to reconstruct the original input values. Some processing steps, such as pooling layers, destroy parts of the original data irreversibly. Images add a further layer of protection: the stored values do not equal the raw data, since the raw image is transformed and normalized before processing. Large, complex networks and pooling layers therefore offer a degree of security. This can be used to protect personal, sensitive, or health-related data. In Europe, the General Data Protection Regulation (GDPR) places strict limits on the use of data collected from European citizens. DoF helps to transfer and share data across countries without conflicting with the GDPR or other data protection regulations.

Efficient

When working with pre-trained models, a common approach is to simply cut off the classifier and replace it with another one. This is inefficient, because the frozen (not trained) core of the pre-trained model performs the same calculations over and over again in every epoch. With DoF, the output of the frozen part of the model can be saved, so only the trainable layers perform new calculations in each epoch. This saves considerable time, depending on the size ratio between the pre-trained model core and the custom classifier.

In most cases, storing a dataset in DoF takes less space than storing the original dataset itself. The size of the data can also be estimated precisely, since it depends only on the shape of the pre-trained model core's output.
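As a worked version of that estimate (the numbers are illustrative, not DoF defaults): a core emitting a 512-value float32 vector needs the same space per sample no matter how large the raw inputs were.

```python
# Illustrative storage estimate for core outputs stored as float32.
num_samples = 1_000_000
output_dim = 512          # length of the core's output vector
bytes_per_value = 4       # float32

total_bytes = num_samples * output_dim * bytes_per_value
total_gib = total_bytes / 2**30  # roughly 1.9 GiB
```

For high-resolution image datasets, the raw inputs would typically be orders of magnitude larger than this, which is where the storage advantage comes from.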

Requirements

The Python module DoF requires nothing beyond the Standard Library.

What's new in DoF 2.0.0

  • New structure for DoF files
  • New DoF file type: DoFJSON
  • New program structure: core.py, data.py, datamodel.py, file.py, error.py, information.py, services.py, storage.py
  • Use native typing for all input parameters (>= Python 3.7)
  • Use __init__.py
  • Dataset information section can be rewritten
  • DoF is finally final

Download files

Source distribution: dofpy-2.0.1.tar.gz (39.7 kB)

Built distribution: dofpy-2.0.1-py3-none-any.whl (43.4 kB)
