
Deep nlp library


NLP Deep Learning Framework

This is a deep learning framework for classification and seq2seq tasks.

Installation

pip install deep-nlp

Example Project

Structure

├── data              --> contains the training and validation data
│   ├── train.csv     --> training dataset
│   └── val.csv       --> validation dataset
├── Experiment.py     --> contains the model and training logic
└── dataset.py        --> contains the Dataset object

dataset.py

from torch.utils.data import Dataset
import pandas as pd

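class ExampleDataset(Dataset):

  def __init__(self, split: str):
    # split names the CSV file to load (train or val in the structure above);
    # the files live in the data/ directory
    self.data = pd.read_csv(f'data/{split}.csv')

  def __len__(self):
    return len(self.data)

  def __getitem__(self, idx):
    # returns one row; batch_fn unpacks each row into (source, target)
    return self.data.iloc[idx]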
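
The example never states what train.csv and val.csv contain. Since batch_fn in Experiment.py unpacks each row into a source and a target and turns the targets into a tensor, a two-column layout with a text and an integer label seems likely. The sketch below illustrates that assumed layout using only the standard library; the column names "text" and "label" and the sample rows are hypothetical and not part of deep-nlp.

```python
import csv
import io

# Hypothetical contents of data/train.csv: a text column and an integer label column.
# The column names and the rows are assumptions made for illustration only.
CSV_TEXT = "text,label\nI loved this film,1\nNot worth watching,0\n"

def read_rows(csv_text):
    """Parse CSV text into a list of (text, label) pairs."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row["text"], int(row["label"])) for row in reader]

rows = read_rows(CSV_TEXT)
print(len(rows))   # number of examples
print(rows[0])     # first (text, label) pair
```

With a file shaped like this, each item returned by the dataset holds exactly two values, which is what the unpacking in batch_fn relies on.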

Experiment.py

from deep_nlp import Experiment, unpack
from dataset import ExampleDataset
from transformers import DistilBertTokenizerFast, DistilBertForSequenceClassification
import torch

class ClassificationExperiment(Experiment):

  def get_tokenizer(self):
    tokenizer = DistilBertTokenizerFast.from_pretrained('distilbert-base-uncased')
    return tokenizer

  def get_model(self):
    model = DistilBertForSequenceClassification.from_pretrained('distilbert-base-uncased')
    return model

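  def batch_fn(self, batch):
    # split the batch of (source, target) rows into two tuples
    source, target = zip(*batch)
    # return_tensors expects a framework name ('pt' for PyTorch), not a boolean
    source_inp = self.tokenizer(list(source), padding=True, return_tensors='pt')
    target = torch.tensor(target)
    return unpack(source_inp, target)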

def run_experiment():
    experiment = ClassificationExperiment(
        80,  # batch size
        20,  # number of epochs
        ExampleDataset,
        gpus=-1,  # use all available gpus
        lr=2.65e-5,
        weight_decay=4e-3,
        name='example_run'  # name for mlflow
    )
    experiment.run()

if __name__ == '__main__':
    run_experiment()
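
To make the batching step in batch_fn easier to follow, here is a minimal, dependency-free sketch of the same idea: transpose a batch of (text, label) pairs with zip(*batch), then pad every sequence to the longest one in the batch, which is what padding=True asks the tokenizer to do. The collate function, its whitespace "tokenizer" and the pad_id parameter are hypothetical stand-ins, not the deep-nlp or transformers implementation.

```python
def collate(batch, pad_id=0):
    """Toy collate step: unzip the batch, encode the texts, pad to the longest."""
    # Transpose [(text, label), ...] into (texts, labels), as zip(*batch) does in batch_fn.
    texts, labels = zip(*batch)

    # Hypothetical stand-in for a tokenizer: give each new lowercase word the next integer id.
    vocab = {}
    encoded = [
        [vocab.setdefault(word, len(vocab) + 1) for word in text.lower().split()]
        for text in texts
    ]

    # Pad every sequence to the length of the longest one in this batch.
    longest = max(len(ids) for ids in encoded)
    input_ids = [ids + [pad_id] * (longest - len(ids)) for ids in encoded]
    # The attention mask marks real tokens with 1 and padding with 0.
    attention_mask = [[1] * len(ids) + [0] * (longest - len(ids)) for ids in encoded]
    return input_ids, attention_mask, list(labels)

batch = [("a good movie", 1), ("bad", 0)]
input_ids, attention_mask, labels = collate(batch)
print(input_ids)       # [[1, 2, 3], [4, 0, 0]]
print(attention_mask)  # [[1, 1, 1], [1, 0, 0]]
print(labels)          # [1, 0]
```

Padding per batch rather than to a fixed global length keeps short batches small, which is presumably why the example passes padding=True instead of a fixed maximum length.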
