DeepZensols Natural Language Processing

Deep learning utility library for natural language processing that aids in feature engineering and embedding layers from the paper A Deep Learning Natural Language Processing Framework for Experimentation and Reproducibility.

Features:

  • Configurable layers with little to no need to write code.
  • Natural language specific layers:
  • NLP specific vectorizers that generate zensols deeplearn encoded and decoded batched tensors for spaCy parsed features, dependency tree features, overlapping text features and others.
  • Embedding layers, provided as batched tensors, and other linguistic vectorized features that are easily swapped at runtime.
  • Support for token, document and embedding level vectorized features.
  • Transformer word piece to linguistic token mapping.
  • Two fully documented reference models provided as both command line applications and Jupyter notebooks.
  • Command line support for training, testing, debugging, and creating predictions.
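The word piece to linguistic token mapping listed above can be illustrated with a minimal, library-independent sketch. It is not this library's implementation. It assumes BERT-style word pieces in which a continuation piece carries a `##` prefix, and it assumes the pieces use the same casing as the tokens. The function name `map_word_pieces` is hypothetical.

```python
from typing import List


def map_word_pieces(tokens: List[str], pieces: List[str],
                    prefix: str = "##") -> List[List[int]]:
    """Return, for each token, the indexes of the word pieces that spell it.

    Assumption: concatenating the pieces (minus the continuation prefix)
    reproduces the tokens in order.
    """
    mapping: List[List[int]] = []
    pix = 0  # index of the next unconsumed word piece
    for token in tokens:
        spelled = ""
        indexes: List[int] = []
        # Consume pieces until they cover the length of the token.
        while pix < len(pieces) and len(spelled) < len(token):
            piece = pieces[pix]
            # Only continuation pieces have the prefix stripped.
            if indexes and piece.startswith(prefix):
                piece = piece[len(prefix):]
            spelled += piece
            indexes.append(pix)
            pix += 1
        if spelled != token:
            raise ValueError(
                f"word pieces {spelled!r} do not spell token {token!r}")
        mapping.append(indexes)
    return mapping


tokens = ["unaffable", "dogs"]
pieces = ["un", "##aff", "##able", "dogs"]
print(map_word_pieces(tokens, pieces))  # [[0, 1, 2], [3]]
```

With such a mapping, a vector for a linguistic token can be built from the vectors of its word pieces, for example by averaging them.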

Documentation

Obtaining

The easiest way to install the command line program is via the pip installer:

pip3 install zensols.deepnlp

Binaries are also available on PyPI.
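To confirm that the installation succeeded, the installed version can be queried with the standard library. This is a small sketch that assumes the distribution is registered under the same name used with pip (`zensols.deepnlp`). The helper `installed_version` is a hypothetical name and not part of this library.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Assumption: the distribution name matches the name given to pip.
    version = installed_version("zensols.deepnlp")
    if version is None:
        print("zensols.deepnlp is not installed")
    else:
        print(f"zensols.deepnlp {version}")
```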

Usage

The API can be used as is by manually configuring each component. However, this library (like any Zensols API) was designed to be instantiated with inversion of control using resource libraries.
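For readers new to inversion of control, the idea is that a configuration file, rather than application code, decides which classes are created and with what parameters. The following is a generic sketch of that pattern using only the Python standard library. It is not the Zensols API: the `ConfigFactory` class, the `class_name` option and the `wrapper` section are hypothetical names chosen for illustration, and `textwrap.TextWrapper` merely stands in for a configurable component.

```python
import ast
import configparser
import importlib
from typing import Any

# Hypothetical configuration: each section describes one object to create.
CONFIG = """
[wrapper]
class_name = textwrap.TextWrapper
width = 12
break_long_words = False
"""


class ConfigFactory:
    """Create objects from INI sections instead of hard-coded constructors."""

    def __init__(self, config_text: str):
        self.parser = configparser.ConfigParser()
        self.parser.read_string(config_text)

    def create(self, section: str) -> Any:
        params = dict(self.parser[section])
        # Resolve the fully qualified class name given in the configuration.
        module_name, class_name = params.pop("class_name").rsplit(".", 1)
        cls = getattr(importlib.import_module(module_name), class_name)
        # Option values are stored as text; parse them as Python literals.
        kwargs = {name: ast.literal_eval(value)
                  for name, value in params.items()}
        return cls(**kwargs)


factory = ConfigFactory(CONFIG)
wrapper = factory.create("wrapper")
print(wrapper.wrap("inversion of control in a nutshell"))
```

Swapping the component then means editing the configuration section rather than the code that calls `create`.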

Component

Components and out of the box models are available with little to no coding. However, this simple example that uses the library's components is recommended for starters. The example is a command line application that in-lines a simple configuration needed to create deep learning NLP components.

Similarly, this example is also a command line application, but it uses a masked language model to fill in words.

Reference Models

If you're in a rush, you can dive right into the Clickbate Text Classification reference model, which is a working project that uses this library. However, you will end up reading up on the zensols deeplearn library either before or during the tutorial.

The usage of this library is explained in terms of the reference models:

The unit test cases are also a good resource for the more detailed programming integration with various parts of the library.

Attribution

This project, or reference model code, uses:

Corpora used include:

Citation

If you use this project in your research, please use the following BibTeX entry:

@inproceedings{landes-etal-2023-deepzensols,
    title = "{D}eep{Z}ensols: A Deep Learning Natural Language Processing Framework for Experimentation and Reproducibility",
    author = "Landes, Paul  and
      Di Eugenio, Barbara  and
      Caragea, Cornelia",
    editor = "Tan, Liling  and
      Milajevs, Dmitrijs  and
      Chauhan, Geeticka  and
      Gwinnup, Jeremy  and
      Rippeth, Elijah",
    booktitle = "Proceedings of the 3rd Workshop for Natural Language Processing Open Source Software (NLP-OSS 2023)",
    month = dec,
    year = "2023",
    address = "Singapore, Singapore",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.nlposs-1.16",
    pages = "141--146"
}

Community

Please star this repository and let me know how and where you use this API. Contributions in the form of pull requests, feedback, and any other input are welcome.

Changelog

An extensive changelog is available here.

License

MIT License

Copyright (c) 2020 - 2026 Paul Landes
