Overview

This project is a utility for preprocessing, training and extracting entity embeddings with neural networks built on the Keras framework. It is still under construction, so please use it carefully.

Installation

Installation is simple if you already have virtualenv installed on your machine. If you don't, please refer to the official virtualenv documentation.

    pip install entity-embeddings-categorical

Documentation

Besides the docstrings, further documentation can be found here.

Testing

This project is intended to suit most existing needs, so testability is a major concern. Most of the code is heavily tested, and Travis is used as a continuous integration tool to run all the unit tests on every new commit.

Usage

This library can be used in two modes: default and custom. The default mode supports the following target types: regression, binary classification and multiclass classification.

If your data differs from all of these, feel free to use the custom mode, where you can define most of the configuration related to target processing and the output of the neural network.
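To make the three default target types more concrete, here is a minimal sketch of how each one is conventionally paired with an output layer and loss in Keras-style networks. It is an illustration only: `SketchTargetType` and `output_layer_spec` are hypothetical names, not part of this library, and the library's own defaults may differ.

```python
from enum import Enum


class SketchTargetType(Enum):
    """Hypothetical stand-in for the library's TargetType enum."""
    REGRESSION = "regression"
    BINARY_CLASSIFICATION = "binary"
    MULTICLASS_CLASSIFICATION = "multiclass"


def output_layer_spec(target_type, n_classes=None):
    """Return a conventional output-layer setup for a target type.

    These pairings are common practice, assumed here for illustration;
    they are not read from the library's source.
    """
    if target_type is SketchTargetType.REGRESSION:
        # One unbounded output, trained with squared error.
        return {"units": 1, "activation": "linear",
                "loss": "mean_squared_error"}
    if target_type is SketchTargetType.BINARY_CLASSIFICATION:
        # One probability in [0, 1].
        return {"units": 1, "activation": "sigmoid",
                "loss": "binary_crossentropy"}
    if target_type is SketchTargetType.MULTICLASS_CLASSIFICATION:
        if n_classes is None or n_classes < 2:
            raise ValueError("multiclass targets need n_classes >= 2")
        # One probability per class, summing to 1.
        return {"units": n_classes, "activation": "softmax",
                "loss": "categorical_crossentropy"}
    raise ValueError(f"unsupported target type: {target_type!r}")


binary_spec = output_layer_spec(SketchTargetType.BINARY_CLASSIFICATION)
multiclass_spec = output_layer_spec(
    SketchTargetType.MULTICLASS_CLASSIFICATION, n_classes=3)
```

A target that fits none of these pairings is the kind of case the custom mode is meant for.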

Default mode

The default mode is straightforward: you only need to provide a few parameters to the Config object.

For example, to create a simple embedding network that reads from the file sales_last_semester.csv, where the target column is named total_sales, the desired output is a binary classification and the training ratio is 0.9, the Python script would look like this:

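    # Assumed import path; adjust it to match the installed package layout.
    from entity_embeddings import Config, Embedder, TargetType

    # Describe the data source, the target column and the kind of output.
    config = Config.make_default_config(csv_path='sales_last_semester.csv',
                                        target_name='total_sales',
                                        target_type=TargetType.BINARY_CLASSIFICATION,
                                        train_ratio=0.9)

    # Build and train the embedding network from the configuration.
    embedder = Embedder(config)
    embedder.perform_embedding()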

Pretty simple, huh?

A working example of default mode can be found here as a Python script.
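As a rough illustration of what a training ratio of 0.9 means, the sketch below splits a list of rows so that 90% are used for training and the rest are held out. The helper `split_by_ratio` is hypothetical and only demonstrates the idea; the library's own splitting logic (for instance, whether it shuffles first) may differ.

```python
def split_by_ratio(rows, train_ratio):
    """Split rows into a training part and a held-out part.

    Hypothetical helper for illustration; not part of the library.
    """
    if not 0.0 < train_ratio < 1.0:
        raise ValueError("train_ratio must be strictly between 0 and 1")
    # Number of rows that go into the training part.
    cut = round(len(rows) * train_ratio)
    return rows[:cut], rows[cut:]


rows = list(range(10))
train_rows, held_out_rows = split_by_ratio(rows, 0.9)
```

With ten rows and a ratio of 0.9, nine rows end up in the training part and one is held out.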

Custom mode

If you intend to customize the output of the neural network, or the way the target variables are processed, you need to specify this when creating the configuration object. You can do so by creating classes that extend TargetProcessor and ModelAssembler.

A working example of custom configuration mode can be found here.
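The subclassing pattern can be sketched as follows. This is not the library's actual interface: `SketchTargetProcessor` is a hypothetical stand-in defined locally, and the method name `process_target` is an assumption, so check the docstrings of the real TargetProcessor and ModelAssembler before adapting it.

```python
import math
from abc import ABC, abstractmethod


class SketchTargetProcessor(ABC):
    """Hypothetical stand-in base class; the real TargetProcessor
    may expose a different interface."""

    @abstractmethod
    def process_target(self, y):
        """Turn raw target values into the values the network trains on."""


class LogTargetProcessor(SketchTargetProcessor):
    """Example customization: train on log(1 + y) to compress a skewed target."""

    def process_target(self, y):
        return [math.log1p(value) for value in y]


processed = LogTargetProcessor().process_target([0.0, 9.0, 99.0])
```

A custom model assembler would follow the same pattern: subclass the base class and override the methods that build the output of the network.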

Visualization

Once you have finished training your model, you can use the visualization_utils module to create visualizations of the generated weights as well as of your model's accuracy.

Below are some examples created for the Rossmann dataset:

Weights for store id embedding

Troubleshooting

If you run into any issue with the project, or have further questions, do not hesitate to open an issue here on GitHub.

Contributions

Contributions are really welcome, so feel free to open a pull request :-)

TODO

  • Allow using a pandas DataFrame instead of the CSV file path.
