
A framework for embedding evaluation automation and visualization.

Project description

EmbEval is a framework for evaluating an arbitrary number of word embeddings on an arbitrary number of tasks, in parallel.

To make the results easier to interpret, EmbEval uses graphs to visualize the performance of the different types of embeddings on each task.

Getting Started

Installation

Install embeval with pip:

pip3 install embeval

Usage (Command Line)

embeval --help
    Usage: embeval [OPTIONS] COMMAND [ARGS]...

    Options:
        --help  Show this message and exit.

    Commands:
        semantic-similarity

embeval semantic-similarity --help
    Usage: embeval semantic-similarity [OPTIONS] EMBEDDING_DIR TESTSET_DIR

    Options:
        --workers INTEGER               Number of worker processes to use.
        --output_path TEXT              Path to write output files to.
        --output_format [text|graph|both]
        --help                          Show this message and exit.

embeval semantic-similarity --output_path output/ embeddings/ testsets/
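
This README does not say how the semantic-similarity score is computed. A common convention for this kind of task, assumed here, is to take the cosine similarity of each word pair and report the Spearman rank correlation against human ratings. The sketch below shows that convention in plain Python; the function names, the in-memory dictionary of vectors and the (word, word, score) test-set layout are all hypothetical and are not taken from embeval itself.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def evaluate(embeddings, testset):
    """Score one embedding on one test set, skipping out-of-vocabulary pairs."""
    predicted, gold = [], []
    for word_a, word_b, human_score in testset:
        if word_a in embeddings and word_b in embeddings:
            predicted.append(cosine(embeddings[word_a], embeddings[word_b]))
            gold.append(human_score)
    return spearman(predicted, gold)


# Toy data (made up for illustration only).
toy_embeddings = {
    "cat": [1.0, 0.0],
    "dog": [0.9, 0.1],
    "car": [0.0, 1.0],
    "truck": [0.2, 0.8],
}
toy_testset = [
    ("cat", "dog", 9.0),
    ("car", "truck", 8.5),
    ("cat", "car", 1.0),
    ("dog", "truck", 2.0),
]
rho = evaluate(toy_embeddings, toy_testset)
print(f"Spearman correlation: {rho:.3f}")
```

In practice a library routine such as `scipy.stats.spearmanr` would normally replace the hand-written rank code; it is spelled out here only to keep the sketch dependency-free.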

Using/Extending EmbEval

To extend the code with tasks not provided in the current implementation (contributions are most welcome), five concepts must be implemented:

  • Command (See Semantic Similarity Command) – This makes your task available in the CLI and controls the flow of execution when it is invoked. Click is used as the CLI package. The entry point of an extended application must import the main cli object and register all the available commands (See main).

  • Processing Pipeline (See generics and Semantic Similarity Pipeline) – This is where the producer, processor and consumer that execute tasks are implemented. The implementation uses the pseq library and its methodology.

  • Store (See Semantic Similarity Store) – A simple object that keeps track of the evaluation results obtained during the processing pipeline.

  • Task (See Semantic Similarity Task) – A task object that encapsulates the information shared across the pipeline, such as paths to files.

  • Visualization (See text visualization) – Defines a method of visualization.
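
As a rough illustration of how a task, a store and a producer/processor/consumer pipeline can fit together, here is a minimal sketch using only the standard library. It does not reproduce the pseq API or embeval's actual classes: `EvalTask`, `ResultStore`, `produce`, `process` and `run_pipeline` are hypothetical names, the scoring step is a placeholder, and a thread pool stands in for the worker processes mentioned in the CLI help.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from itertools import product


@dataclass(frozen=True)
class EvalTask:
    """Hypothetical task: the file paths one unit of work needs."""
    embedding_path: str
    testset_path: str


@dataclass
class ResultStore:
    """Hypothetical store: one score per (embedding, test set) pair."""
    scores: dict = field(default_factory=dict)

    def add(self, task, score):
        self.scores[(task.embedding_path, task.testset_path)] = score


def produce(embedding_paths, testset_paths):
    """Producer: yield one task per embedding/test-set combination."""
    for embedding_path, testset_path in product(embedding_paths, testset_paths):
        yield EvalTask(embedding_path, testset_path)


def process(task):
    """Processor: placeholder score; a real task would load files and evaluate."""
    score = float(len(task.embedding_path) + len(task.testset_path))
    return task, score


def run_pipeline(embedding_paths, testset_paths, workers=2):
    """Consumer: collect every processed result into the store."""
    store = ResultStore()
    tasks = produce(embedding_paths, testset_paths)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for task, score in pool.map(process, tasks):
            store.add(task, score)
    return store


# Example run with made-up file names.
store = run_pipeline(["glove.txt", "w2v.txt"], ["simlex.csv"])
for (embedding_path, testset_path), score in sorted(store.scores.items()):
    print(f"{embedding_path} on {testset_path}: {score}")
```

Keeping the task immutable and routing all writes through the store means the processor stays a pure function of its task, which is what lets the work be spread over several workers.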

Plans

  • ☐ Finish Semantic Similarity visualization.

  • ☐ Integrate GLUE tasks via jiant framework.

License

Distributed under the GPL-3.0 License. See the LICENSE file for details.

