Utilities for training and working with NLP models in PyTorch

xt-nlp

Description

This repo contains common NLP pre- and post-processing functions, loss functions, metrics, and helper functions.

Installation

From PyPI:

pip install xt-nlp

From source:

git clone https://github.com/XtractTech/xt-nlp.git
pip install ./xt-nlp

Usage

Use Python's built-in help to see the documentation for a specific class or function, e.g., help(SESLoss).
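Besides the interactive help output, the signature and docstring summary can be read programmatically. The sketch below uses only the standard library; describe is a hypothetical helper (not part of xt-nlp), and json.dumps is a stand-in so the snippet runs without the package installed.

```python
import inspect
import json


def describe(obj):
    """Return the call signature and the first docstring line of a callable."""
    signature = str(inspect.signature(obj))
    doc = inspect.getdoc(obj) or ""
    summary = doc.splitlines()[0] if doc else ""
    return signature, summary


# json.dumps is only a stand-in here; with xt-nlp installed,
# pass SESLoss or SESF1 instead.
signature, summary = describe(json.dumps)
print(signature)
print(summary)
```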

Defining SES Metrics and Loss

from xt_nlp.metrics import SESF1
from xt_nlp.metrics import SESLoss

eval_metrics = {
    'f1': SESF1(threshold=0.8)
}
loss_fn = SESLoss()
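A dictionary of named metrics like eval_metrics is typically consumed by calling each entry on the same predictions and targets. The sketch below shows that pattern without depending on xt-nlp. ThresholdF1 and evaluate are hypothetical stand-ins: the call signature of SESF1 and the meaning of its threshold are not documented here, so this example assumes a metric that binarizes scores at the threshold and computes a plain F1.

```python
class ThresholdF1:
    """Hypothetical stand-in metric (not SESF1): binarize scores, then compute F1."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold

    def __call__(self, scores, targets):
        # A score counts as a positive prediction when it reaches the threshold.
        preds = [score >= self.threshold for score in scores]
        tp = sum(p and t for p, t in zip(preds, targets))
        fp = sum(p and not t for p, t in zip(preds, targets))
        fn = sum(t and not p for p, t in zip(preds, targets))
        denom = 2 * tp + fp + fn
        return 2 * tp / denom if denom else 0.0


def evaluate(metrics, scores, targets):
    """Apply every named metric to the same scores and targets."""
    return {name: metric(scores, targets) for name, metric in metrics.items()}


# Same dictionary shape as eval_metrics above, with the stand-in metric.
eval_metrics = {'f1': ThresholdF1(threshold=0.8)}
scores = [0.95, 0.85, 0.40, 0.90, 0.10]
targets = [True, False, True, True, False]
results = evaluate(eval_metrics, scores, targets)
print(results)
```

Keeping metrics in a dictionary lets an evaluation loop report any number of them by name without changing the loop itself.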

Read BRAT annotations for sequence extraction into a data loader

from xt_nlp.utils import get_brat_examples, split_examples, get_features, build_ses_dataloader

# Define the following before running the code below:
# tokenizer = 
# max_sequence_length = 
# doc_stride = 
# class_dict = dictionary mapping each class name to the list of classes grouped into that class
# classes = 
# batch_size = 
# workers = 

examples = get_brat_examples(
    datadir='./data/datadir',
    classes=classes
)

train_examples, val_examples = split_examples(examples, train_prop=.9, seed=4000)

train_features = get_features(
    examples=train_examples, 
    tokenizer=tokenizer, 
    all_ans_types=classes, 
    max_seq_len=max_sequence_length,
    doc_stride=doc_stride,
    mode='train'
)

train_loader = build_ses_dataloader(
    train_features, 
    classes, 
    class_dict, 
    batch_size=batch_size,
    workers=workers,
    max_seq_length=max_sequence_length,
    shuffle=True
)
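The max_seq_len and doc_stride arguments suggest that long documents are split into overlapping windows. The sketch below illustrates that idea under the assumption that doc_stride is the step between consecutive window starts, as in common BERT-style preprocessing. sliding_windows is a hypothetical helper, not the get_features implementation, which may also reserve room for special tokens.

```python
def sliding_windows(tokens, max_len, stride):
    """Split tokens into windows of at most max_len, one starting every stride tokens."""
    if max_len <= 0 or stride <= 0:
        raise ValueError("max_len and stride must be positive")
    windows = []
    start = 0
    while True:
        # Record the window's start offset so spans can be mapped back to the document.
        windows.append((start, tokens[start:start + max_len]))
        if start + max_len >= len(tokens):
            break  # this window already reaches the end of the document
        start += stride
    return windows


tokens = [f"tok{i}" for i in range(10)]
windows = sliding_windows(tokens, max_len=4, stride=3)
for start, window in windows:
    print(start, window)
```

With a stride smaller than the window length, neighbouring windows overlap, so an annotation that straddles a window boundary still appears whole in at least one window as long as it is no longer than the overlap plus one token.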
