Skip to main content

Utilities for training and working with nlp models in pytorch

Project description

xt-nlp

Description

This repo contains common NLP pre/post processing functions, loss functions, metrics, and helper functions.

Installation

From PyPI:

pip install xt-nlp

From source:

git clone https://github.com/XtractTech/xt-nlp.git
pip install ./xt-nlp

Usage

See specific help on a class or function using help. E.g., help(SESLoss).

Defining SES Metrics and Loss

from xt_nlp.metrics import SESF1
from xt_nlp.metrics import SESLoss

eval_metrics = {
   'f1': SESF1(threshold=0.8)
}
loss_fn = SESLoss()

Read BRAT annotations for sequence extraction into data loader

from xt_nlp.utils import get_brat_examples, split_examples, get_features, build_ses_dataloader

# tokenizer = 
# max_sequence_length = 
# doc_stride =
# class_dict = Dictionary mapping classname ==> list of classes to group into this class
# classes = 
# batch_size = 
# workers = 

examples = get_brat_examples(
    datadir='./data/datadir',
    classes=classes
)

train_examples, val_examples = split_examples(examples, train_prop=.9, seed=4000)

train_features = get_features(
    examples=train_examples, 
    tokenizer=tokenizer, 
    all_ans_types=classes, 
    max_seq_len=max_sequence_length,
    doc_stride=doc_stride,
    mode='train'
)

train_loader = build_ses_dataloader(
    train_features, 
    classes, 
    class_dict, 
    batch_size=batch_size,
    workers=workers,
    max_seq_length=max_sequence_length,
    shuffle=True
)

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

xt-nlp-0.2.3.tar.gz (18.4 kB view details)

Uploaded Source

Built Distribution

xt_nlp-0.2.3-py3-none-any.whl (20.8 kB view details)

Uploaded Python 3

File details

Details for the file xt-nlp-0.2.3.tar.gz.

File metadata

  • Download URL: xt-nlp-0.2.3.tar.gz
  • Upload date:
  • Size: 18.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/44.0.0.post20200106 requests-toolbelt/0.9.1 tqdm/4.41.1 CPython/3.7.6

File hashes

Hashes for xt-nlp-0.2.3.tar.gz
Algorithm Hash digest
SHA256 ab38582195ffd15b498c33073da7aa9814f924102aa69865af821e19355537ff
MD5 ddcef97526641d1f639c78fd86064049
BLAKE2b-256 955f50d0bdb93321c40c5295ac8591c590f3133eaecaed1290d56e469d64043f

See more details on using hashes here.

File details

Details for the file xt_nlp-0.2.3-py3-none-any.whl.

File metadata

  • Download URL: xt_nlp-0.2.3-py3-none-any.whl
  • Upload date:
  • Size: 20.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/44.0.0.post20200106 requests-toolbelt/0.9.1 tqdm/4.41.1 CPython/3.7.6

File hashes

Hashes for xt_nlp-0.2.3-py3-none-any.whl
Algorithm Hash digest
SHA256 d919875d5d2639e70500ec8f3891cc72fca0b2d9cd24e2345f1547409c442527
MD5 a3ff8aa2a7add180f1fcf44f60c50b8f
BLAKE2b-256 1ee4caa1374fcf47bc3998a0bc490fb6acc1da1e8d92c2b03ffe28b0c60fc563

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page