
Generation base dependency



Base functions, generation functions and generic wrappers.

© Marcel Robeer, 2021-2022

Module overview

Module Description
genbase Readable data representations and meta information class.
genbase.data Wrapper functions for working with data.
genbase.decorator Base support for decorators.
genbase.internationalization Internationalization (i18n) support.
genbase.mixin Mixins for seeding (reproducibility) and state machines.
genbase.model Wrapper functions for working with machine learning models.
genbase.ui Extensible user interfaces (UIs) for genbase dependencies.

Installation

Method Instructions
pip Install from PyPI via pip3 install genbase.
Local Clone this repository and install via pip3 install -e . or locally run python3 setup.py install.
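
After either installation method, one quick way to confirm that the package is visible to the current interpreter is to query its metadata. The following is a minimal sketch using only the standard library; the helper name installed_version is hypothetical and not part of genbase.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == '__main__':
    # Prints a version string if genbase is installed, otherwise None
    print(installed_version('genbase'))
```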

Releases

genbase is officially released through PyPI.

See CHANGELOG.md for a full overview of the changes for each version.

Packages using genbase



text_explainability provides a generic architecture from which well-known state-of-the-art explainability approaches for text can be composed. This modular architecture allows components to be swapped out and combined, to quickly develop new types of explainability approaches for (natural language) text, or to improve a plethora of approaches by improving a single module. The text_explainability package is available through PyPI and fully documented at https://marcelrobeer.github.io/text_explainability/.



text_explainability can be extended to also perform sensitivity testing, checking for machine learning model robustness and fairness. The text_sensitivity package is available through PyPI and fully documented at https://marcelrobeer.github.io/text_sensitivity/.


API

genbase

Readable data representations and meta information class.

Class Description
Readable Ensure that a class has a readable representation.
Configurable Adds config support (from_config(), from_json(), from_yaml(), ..., read_json(), ..., to_yaml()) to a class.
MetaInfo Adds type, subtype, callargs and other meta descriptors to a class (subclass of Configurable).

Example:

>>> from genbase import MetaInfo

>>> class ReturnCls(MetaInfo):
...     def __init__(self, value, **kwargs):
...         # super() already binds self, so it is not passed explicitly
...         super().__init__(type='special_test',
...                          subtype='special',
...                          **kwargs)
...         self.value = value
...
...     @property
...     def content(self):
...         return {'value': self.value}

>>> obj = ReturnCls(value=5)
>>> obj.to_config()
{'META': {'type': 'special_test',
          'subtype': 'special'},
 'CONTENT': {'value': 5}}
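
To make the META/CONTENT layout above more concrete, here is a rough standalone sketch of how such a config could be assembled and serialized. It does not use genbase and is not its implementation; MetaInfoSketch, ReturnSketch and the to_json() body are hypothetical stand-ins that only reproduce the dictionary shape shown in the example.

```python
import json
from typing import Optional


class MetaInfoSketch:
    """Hypothetical, simplified stand-in for a MetaInfo-like base class."""

    def __init__(self, type: str, subtype: Optional[str] = None, **kwargs):
        self.type = type
        self.subtype = subtype

    @property
    def content(self) -> dict:
        raise NotImplementedError('Subclasses define what goes under CONTENT')

    def to_config(self) -> dict:
        # Meta descriptors and content are kept in separate top-level keys
        return {'META': {'type': self.type, 'subtype': self.subtype},
                'CONTENT': self.content}

    def to_json(self, **json_kwargs) -> str:
        return json.dumps(self.to_config(), **json_kwargs)


class ReturnSketch(MetaInfoSketch):
    def __init__(self, value, **kwargs):
        super().__init__(type='special_test', subtype='special', **kwargs)
        self.value = value

    @property
    def content(self) -> dict:
        return {'value': self.value}


config = ReturnSketch(value=5).to_config()
```

Because the config is a plain dictionary, it survives a JSON round trip unchanged, which is what makes this kind of layout convenient for storing and reloading results.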

genbase.data

Wrapper functions for working with data.

Function Description
import_data() Import dataset into an instancelib.Environment (containing instances and ground-truth labels).
train_test_split() Split a dataset into training and test data.

Examples: Import from an online .csv file for the BBC News dataset with data in the 'text' column and labels in 'category':

>>> from genbase import import_data
>>> import_data('https://storage.googleapis.com/dataset-uploader/bbc/bbc-text.csv',
...             data_cols='text', label_cols='category')
TextEnvironment()

Convert a pandas DataFrame to instancelib Environment:

>>> from genbase import import_data
>>> import pandas as pd
>>> df = pd.read_csv('./Downloads/bbc-text.csv')
>>> import_data(df, data_cols=['text'], label_cols=['category'])
TextEnvironment()

Download a .zip file of the Drugs.com review dataset and convert each file in the ZIP to an instancelib Environment:

>>> from genbase import import_data
>>> import_data('https://archive.ics.uci.edu/ml/machine-learning-databases/00462/drugsCom_raw.zip',
...             data_cols='review', label_cols='rating')
TextEnvironment(named_providers=['drugsComTest_raw.tsv', 'drugsComTrain_raw.tsv'])

Convert a Hugging Face Dataset (SST-2 in GLUE) to an instancelib Environment:

>>> from genbase import import_data
>>> from datasets import load_dataset
>>> import_data(load_dataset('glue', 'sst2'), data_cols='sentence', label_cols='label')
TextEnvironment(named_providers=['test', 'train', 'validation'])
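
The table above also lists train_test_split(), which has no example here. Since its signature is not shown in this document, the following is only a generic, standard-library illustration of a reproducible train/test split over instance keys. The function split_sketch and its parameters are hypothetical and are not genbase's API.

```python
import random
from typing import Iterable, List, Tuple


def split_sketch(keys: Iterable, train_size: float = 0.8, seed: int = 0) -> Tuple[List, List]:
    """Shuffle keys with a fixed seed and cut them into a train and a test part."""
    if not 0.0 <= train_size <= 1.0:
        raise ValueError('train_size should be a fraction between 0 and 1')
    shuffled = list(keys)
    random.Random(seed).shuffle(shuffled)  # local RNG, global state is untouched
    cut = int(round(len(shuffled) * train_size))
    return shuffled[:cut], shuffled[cut:]


train_keys, test_keys = split_sketch(range(10), train_size=0.7, seed=42)
```

Using a local random.Random instance keeps the split repeatable for a given seed without affecting other code that relies on the global random state.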

genbase.decorator

Base support for decorators.

Decorator Description
@add_callargs Decorator that passes __callargs__ to a function if available. Useful in conjunction with MetaInfo.

Example:

>>> from genbase import MetaInfo, add_callargs

>>> class ReturnCls(MetaInfo):
...     def __init__(self, value, callargs=None, **kwargs):
...         # super() already binds self, so it is not passed explicitly
...         super().__init__(type='special_test',
...                          subtype='special',
...                          callargs=callargs,
...                          **kwargs)
...         self.value = value
...
...     @property
...     def content(self):
...         return {'value': self.value}

>>> @add_callargs
... def example_fn(x: int, y: int, z: int = 5, t='str', **kwargs):
...     callargs = kwargs.pop('__callargs__', None)
...     return ReturnCls(value=x + y + z, callargs=callargs)

>>> example_fn(x=1, y=2).callargs
{'x': 1, 'y': 2, 'z': 5, 't': 'str'}
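
For readers curious how a decorator can capture the call arguments, including defaults, the sketch below shows one possible approach with inspect.signature. It is an illustration under assumptions, not genbase's implementation; add_callargs_sketch is a hypothetical name, and only the '__callargs__' keyword is taken from the example above.

```python
import functools
import inspect


def add_callargs_sketch(fn):
    """Hypothetical decorator that passes the bound call arguments as __callargs__."""
    signature = inspect.signature(fn)

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        bound.apply_defaults()  # fill in defaults such as z=5 and t='str'
        callargs = {
            name: value
            for name, value in bound.arguments.items()
            # leave out the catch-all **kwargs parameter itself
            if signature.parameters[name].kind is not inspect.Parameter.VAR_KEYWORD
        }
        return fn(*args, __callargs__=callargs, **kwargs)

    return wrapper


@add_callargs_sketch
def example_fn(x: int, y: int, z: int = 5, t='str', **kwargs):
    callargs = kwargs.pop('__callargs__', None)
    return x + y + z, callargs


total, callargs = example_fn(x=1, y=2)
```

The decorated function must accept **kwargs (or an explicit __callargs__ parameter) for this sketch to work, which matches the shape of example_fn in the example above.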

genbase.internationalization

Internationalization (i18n) support.

Function Description
get_locale() Get current locale.
set_locale() Set current locale.
translate_list() Get a list based on locale, as defined in the './locale' folder.
translate_string() Get a string based on locale, as defined in the './locale' folder.

Example:

>>> from genbase.internationalization import set_locale, translate_list
>>> set_locale('en')
>>> translate_list('stopwords')
['a', 'an', 'the']

>>> set_locale('nl')
>>> translate_list('stopwords')
['de', 'het', 'een']
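
Conceptually, the lookup above resolves a key against the data of the currently active locale. The sketch below imitates that behavior with an in-memory dictionary instead of the './locale' folder. All names ending in _sketch and the _LOCALE_DATA table are hypothetical and only mirror the example values shown above.

```python
from typing import Dict, List

# Hypothetical in-memory stand-in for the files in the './locale' folder
_LOCALE_DATA: Dict[str, Dict[str, List[str]]] = {
    'en': {'stopwords': ['a', 'an', 'the']},
    'nl': {'stopwords': ['de', 'het', 'een']},
}
_current_locale = 'en'


def set_locale_sketch(locale: str) -> None:
    """Switch the active locale, rejecting locales without data."""
    global _current_locale
    if locale not in _LOCALE_DATA:
        raise ValueError(f'Unknown locale "{locale}"')
    _current_locale = locale


def get_locale_sketch() -> str:
    return _current_locale


def translate_list_sketch(key: str) -> List[str]:
    """Return a copy of the list stored under `key` for the active locale."""
    return list(_LOCALE_DATA[_current_locale][key])


set_locale_sketch('nl')
dutch_stopwords = translate_list_sketch('stopwords')
```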

genbase.mixin

Mixins for seeding (reproducibility) and state machines.

Class Description
SeedMixin Adds handling of ._seed and ._original_seed for reproducibility.
CaseMixin Adds handling of title-, sentence-, upper- and lowercase for random data generation.

Example:

>>> from genbase.mixin import SeedMixin
>>> class RandomCls(SeedMixin):
...     def __init__(self, seed: int = 0):
...         self._seed = self._original_seed = seed

>>> rc = RandomCls(seed=10)
>>> rc.seed
10

>>> rc._seed += 20
>>> rc.seed
30

>>> rc._original_seed
10
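
The point of keeping both ._seed and ._original_seed is that the working seed can change during generation while the starting point stays recoverable. The standalone sketch below illustrates that idea without genbase; SeedSketchMixin and its reset_seed() method are hypothetical names used only for this illustration.

```python
import random


class SeedSketchMixin:
    """Hypothetical mixin mirroring the attribute names used in the example above."""

    @property
    def seed(self) -> int:
        return self._seed

    def reset_seed(self):
        # Return to the seed the object was created with
        self._seed = self._original_seed
        return self


class RandomCls(SeedSketchMixin):
    def __init__(self, seed: int = 0):
        self._seed = self._original_seed = seed

    def sample(self, n: int = 3):
        # A fresh RNG per call makes the output depend only on the current seed
        rng = random.Random(self._seed)
        return [rng.randint(0, 100) for _ in range(n)]


rc = RandomCls(seed=10)
first_sample = rc.sample()
rc._seed += 20
rc.reset_seed()
second_sample = rc.sample()
```

After reset_seed(), sampling reproduces the first result, since the generator is seeded with the original value again.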

genbase.model

Wrapper functions for working with machine learning models.

Function Description
import_model() Import a model with instancelib or instancelib-onnx.

Examples: Make a scikit-learn text classifier and train it on SST-2:

>>> from genbase import import_data, import_model
>>> from datasets import load_dataset
>>> ds = import_data(load_dataset('glue', 'sst2'), data_cols='sentence', label_cols='label')
>>> from sklearn.pipeline import Pipeline
>>> from sklearn.naive_bayes import MultinomialNB
>>> from sklearn.feature_extraction.text import TfidfVectorizer
>>> pipeline = Pipeline([('tfidf', TfidfVectorizer()),
...                      ('clf', MultinomialNB())])
>>> import_model(pipeline, ds, train='train')
SklearnDataClassifier()

Load a pretrained ONNX model with labels 'Bedrijfsnieuws', 'Games' and 'Smartphones':

>>> from genbase import import_model
>>> import_model('data-model.onnx', label_map={0: 'Bedrijfsnieuws', 1: 'Games', 2: 'Smartphones'})
SklearnDataClassifier()
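
The label_map argument in the ONNX example pairs the integer class indices produced by a model with readable label names. As a rough illustration of that mapping only (not of how genbase or instancelib apply it internally), the hypothetical helper decode_prediction below picks the highest-scoring index and looks up its label.

```python
from typing import Dict, Sequence

LABEL_MAP: Dict[int, str] = {0: 'Bedrijfsnieuws', 1: 'Games', 2: 'Smartphones'}


def decode_prediction(scores: Sequence[float], label_map: Dict[int, str]) -> str:
    """Map the index of the highest score to its label name."""
    if len(scores) != len(label_map):
        raise ValueError('Expected one score per label')
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return label_map[best_index]


predicted_label = decode_prediction([0.1, 0.7, 0.2], LABEL_MAP)
```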

genbase.ui

Extensible user interfaces (UIs) for genbase dependencies.

Function Description
get_color() Get color from a matplotlib colorscale.
plot.matplotlib_available() Check if matplotlib is installed.
plot.plotly_available() Check if plotly is installed.
notebook.format_label() Format label as title.
notebook.format_instances() Format multiple instancelib instances.
notebook.is_interactive() Check if the environment is interactive (Jupyter Notebook).

Class Description
plot.ExpressPlot Plotter for plotly.express.
notebook.Render Base class for rendering configs (configuration dictionaries).

Example:

>>> from genbase.ui.notebook import Render
>>> class CustomRender(Render):
...     def __init__(self, *configs):
...         super().__init__(*configs)
...         self.default_title = 'My Custom Explanation'
...         self.main_color = '#ff0000'
...         self.package_link = 'https://git.io/text_explainability'
...
...     def format_title(self, title: str, h: str = 'h1', **renderargs) -> str:
...         return f'<{h} style="color: red;">{title}</{h}>'
...
...     def render_content(self, meta: dict, content: dict, **renderargs):
...         type_ = meta.get('type', '')
...         # Assumed intent: turn e.g. 'x_explanation' into 'X Explanation'
...         return type_.replace('_', ' ').title() if 'explanation' in type_ else type_

>>> from genbase import MetaInfo
>>> class NiceCls(MetaInfo):
...     def __init__(self, **kwargs):
...         super().__init__(renderer=CustomRender, **kwargs)
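
To show what rendering a config dictionary can amount to, the sketch below turns a META/CONTENT config into a small HTML string using only the standard library. It is independent of genbase's Render class; format_title_sketch, render_config_sketch and the chosen HTML layout are assumptions made for illustration.

```python
from html import escape


def format_title_sketch(title: str, h: str = 'h1', color: str = '#ff0000') -> str:
    """Wrap an escaped title in a colored heading tag."""
    return f'<{h} style="color: {color};">{escape(title)}</{h}>'


def render_config_sketch(config: dict, default_title: str = 'My Custom Explanation') -> str:
    """Render the META type as a title and the CONTENT items as a list."""
    meta = config.get('META', {})
    content = config.get('CONTENT', {})
    # Fall back to the default title when no type is available
    title = meta.get('type', '').replace('_', ' ').title() or default_title
    items = ''.join(f'<li>{escape(str(key))}: {escape(str(value))}</li>'
                    for key, value in content.items())
    return f'{format_title_sketch(title)}<ul>{items}</ul>'


html_output = render_config_sketch({'META': {'type': 'special_test', 'subtype': 'special'},
                                    'CONTENT': {'value': 5}})
```

Escaping titles and values before embedding them keeps the output safe to display when config contents come from user data.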
