Utilities for managing NLP models and for processing text-related data at Wellcome Data Labs
WellcomeML utils
This package contains common utility functions for usual tasks at Wellcome Data Labs. In particular:
modules | description |
---|---|
io | manipulating data, in and out of S3, and processing |
ml | wrappers for processing texts, vectorisers and classifiers |
spacy | common utils for converting data from and to spacy/prodigy format |
mis/viz | any other utils, including Wellcome colour palettes |
For more in-depth information, see the /examples folder and the release notes.
1. Quickstart
Installing from a release wheel: download the wheel from AWS and pip install it:
pip install wellcomeml-2020.1.0-py3-none-any.whl
This will install the "vanilla" package. To install the deep-learning functionality (torch/transformers/spacy-transformers), add the deep-learning extra:
pip install wellcomeml-2020.1.0-py3-none-any.whl[deep-learning]
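Some shells (zsh in particular) treat the square brackets as a glob pattern and fail with "no matches found" unless the argument is quoted. A minimal sketch of building a quoted requirement string follows; the helper name wheel_requirement is hypothetical and not part of this package.

```shell
#!/bin/sh
# Hypothetical helper (not part of wellcomeml): build a pip requirement
# from a wheel filename and an optional extras name.
wheel_requirement() {
    if [ -n "${2:-}" ]; then
        printf '%s[%s]\n' "$1" "$2"
    else
        printf '%s\n' "$1"
    fi
}

# Usage (assumes the wheel is in the current directory); the double quotes
# stop the shell from interpreting the brackets:
#   pip install "$(wheel_requirement wellcomeml-2020.1.0-py3-none-any.whl deep-learning)"
```

Quoting the argument directly, as in pip install "wellcomeml-2020.1.0-py3-none-any.whl[deep-learning]", has the same effect.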
2. Development
2.1 Build local virtualenv
make
2.2 Build the wheel (and upload to AWS S3)
After making changes, in order to build a new wheel, run:
make dist
2.3 (Optional) Installing from other locations
pip3 install <relative path to this folder>
2.4 Transformers
Some experimental features (currently wellcomeml.ml.SemanticEquivalenceClassifier) require a version of transformers that is not compatible with spacy-transformers. To develop those features:
export WELLCOMEML_ENV=development_transformers
pip install -r requirements_transformers.txt --upgrade
On OSX, if you get a message complaining about the Rust compiler, install and initialise it with:
brew install rustup
rustup-init
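To find out whether the Rust compiler is already available before installing the requirements, a check along these lines may help. This is a sketch only; the helper name have_cmd is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the named command is on the PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd rustc; then
    echo "rustc found: $(rustc --version)"
else
    echo "rustc not found; install it before running pip install"
fi
```

After running rustup-init, a new shell session may be needed before rustc appears on the PATH.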
3. Example usage of some modules
Examples can be found in the subfolder examples.
Built Distributions
Hashes for wellcomeml-2020.5.0-py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | 3f2418ddbbaddafa90171466ee9aa96a94a66cfd19980ad1e564b59958bbf63f |
MD5 | 139eff7bb542e06d0677f26228ff1cf0 |
BLAKE2b-256 | 4c4049699c4442a0ede08f6d2f3a5e0441a2115f1b046b7ea52394a59b2ba787 |
Hashes for wellcomeml-2020.5.0-py2.py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | 279d2b7de62ab1f78405545db6aed087fb9d70860fe6729180c7d79a08ebce76 |
MD5 | d09c4ff9f4217109361956dedaada82d |
BLAKE2b-256 | 28cac3c261abb2b2fb4503f6aaa839a7a0c3af5e5833fe92042156c536c75361 |
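A downloaded wheel can be checked against the SHA256 digests listed above before installing it. The following is a minimal sketch, assuming either sha256sum (common on Linux) or shasum (shipped with macOS) is available; the function names sha256_of and verify_wheel are hypothetical.

```shell
#!/bin/sh
# Print the SHA256 hex digest of a file, using whichever tool is installed.
sha256_of() {
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum "$1" | awk '{print $1}'
    else
        shasum -a 256 "$1" | awk '{print $1}'
    fi
}

# Compare a file's digest with the expected value; return non-zero on mismatch.
verify_wheel() {
    file="$1"
    expected="$2"
    actual="$(sha256_of "$file")"
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file" >&2
        return 1
    fi
}

# Usage (assumes the wheel has been downloaded to the current directory):
#   verify_wheel wellcomeml-2020.5.0-py3-none-any.whl \
#       3f2418ddbbaddafa90171466ee9aa96a94a66cfd19980ad1e564b59958bbf63f
```

SHA256 is used here rather than MD5 because MD5 is no longer considered collision-resistant.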