
Natural language processing utilities and examples for the book Natural Language Processing in Action (nlpia) 2nd Edition by Hobson Lane and Maria Dyshel.

Project description

nlpia2


Official code repository for the book Natural Language Processing in Action, 2nd Edition by Maria Dyshel and Hobson Lane at Tangible AI for Manning Publications. It would not have happened without the generous work of contributing authors.

To get the most out of this repository, you need to do two things.

  1. Clone the repository to your local machine if you want to execute the code locally or want local access to the data (recommended).
  2. Create an environment that has all the helpful/needed modules for Natural Language Processing In Action, 2nd Edition.

Clone the Repository

If you're currently viewing this file on GitLab and want the data and code available locally on your machine, clone this repository to your local machine. Navigate to your preferred directory to house the local clone (for example, your local git directory) and execute:

git clone git@gitlab.com:prosocialai/nlpia2
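
If you don't have SSH keys set up on GitLab, cloning over HTTPS should also work (a minimal alternative; it assumes the repository is publicly readable):

# HTTPS clone of the same repository
git clone https://gitlab.com/prosocialai/nlpia2.git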

Create a Conda Environment

To use the various packages in vogue with today's advanced NLP referenced in the NLPIA 2nd Edition book, such as PyTorch and spaCy, you need to install them in a conda environment. To avoid potential conflicts between those packages (and their dependencies) and your other Python projects, it is good practice to create and activate a new conda environment.

Here's how we did that for this book.

  1. Make sure you have Anaconda3 installed. Make sure you can run conda from within a bash shell (terminal). The conda --version command should say something like conda 4.10.3.

  2. Update conda itself. The conda package manages all other packages, so keep it current. Your base environment is most likely called base, so execute conda update -n base -c defaults conda to bring that package up to date. Even if base is not the activated environment at the moment, this command will update the conda package in the base environment, so the next time you use the conda command, in any environment, the system will use the updated package.

  3. Create a new environment and install the variety of modules needed in NLPIA 2nd Edition.

There are two ways to do that.

Use the script already provided in the repository (nlpia2/src/nlpia2/scripts/conda_install.sh)

If you have cloned the repository, as instructed above, you already have a script that will do this work. From the directory housing the repository, run cd nlpia2/src/nlpia2/scripts/ and from there run bash conda_install.sh.
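
For convenience, the same two commands as a copy-pasteable block:

cd nlpia2/src/nlpia2/scripts/
bash conda_install.sh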

Or manually execute portions of the script as follows

First, create a new environment (or activate it if it exists)

# create a new environment named "nlpia2" if one doesn't already exist:
conda activate nlpia2 \
    || conda create -n nlpia2 -y 'python==3.9.7' \
    && conda activate nlpia2
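
If conda activate complains that your shell has not been configured to use conda, a one-time initialization of your shell usually fixes it (a general conda fix, not specific to this book):

conda init bash    # then close and reopen your terminal (or source ~/.bashrc)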

Once that completes, install all of nlpia2's conda dependencies if they aren't already installed:

conda install -c defaults -c huggingface -c pytorch -c conda-forge -y \
    emoji \
    ffmpeg \
    glcontext \
    graphviz \
    huggingface_hub \
    jupyter \
    lxml \
    manimpango \
    nltk \
    pyglet \
    pylatex \
    pyrr \
    pyopengl \
    pytest \
    pytorch \
    regex \
    seaborn \
    scipy \
    scikit-learn \
    sentence-transformers \
    statsmodels \
    spacy \
    torchtext \
    transformers \
    wikipedia \
    xmltodict

Finally, install via pip any packages not available through conda channels. In such scenarios it is generally better practice to apply all pip installs after all conda installs. Furthermore, to ensure pip installs into the Python interpreter used by the conda environment, activate the environment and invoke pip with python -m pip rather than pip or pip3.

conda activate nlpia2
python -m pip install manim manimgl
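
As an optional sanity check (a quick sketch, not part of the official setup), confirm that the core libraries import inside the new environment, and fetch a small English spaCy pipeline if you plan to run the spaCy examples (an extra download, not listed above):

# should print version numbers without raising ImportError
python -c "import torch, spacy, transformers, nltk; print(torch.__version__, spacy.__version__)"
# optional: small English pipeline for the spaCy examples
python -m spacy download en_core_web_sm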

Ready, Set, Go!

Congratulations! You now have the nlpia2 repository cloned, which gives you local access to all the data and scripts needed for the NLPIA Second Edition book, and you have created a powerful environment to use. When you're ready to type or execute code, check whether this environment is activated. If not, activate it by executing:

conda activate nlpia2

And off you go to tackle some serious natural language processing and make the world a better place for all.

Run a Jupyter notebook server within Docker: jupyter-repo2docker --editable .
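
repo2docker is not in the conda package list above, so you will likely need to install it first (a hedged sketch; it assumes Docker is installed and running):

python -m pip install jupyter-repo2docker
jupyter-repo2docker --editable .    # build the image and launch a Jupyter server from the repo root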

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
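
If you only need the Python package itself (without the full repository of data and scripts), installing from PyPI is usually simpler than downloading these files by hand:

python -m pip install nlpia2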

Source Distribution

nlpia2-0.0.20.tar.gz (21.3 MB)

Uploaded Source

Built Distribution

nlpia2-0.0.20-py3-none-any.whl (21.7 MB)

Uploaded Python 3

File details

Details for the file nlpia2-0.0.20.tar.gz.

File metadata

  • Download URL: nlpia2-0.0.20.tar.gz
  • Upload date:
  • Size: 21.3 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.2.2 CPython/3.9.7 Linux/5.15.0-53-generic

File hashes

Hashes for nlpia2-0.0.20.tar.gz
Algorithm Hash digest
SHA256 4241f371f9c3ae69efc0b1d50d24b431f5564cc6829d9c479058771ed1783131
MD5 666028d204af346f7c5e72b6729206f8
BLAKE2b-256 2bc0a4a403b3333d34087a80ae3b1b0ea828c3a12eb98b72a1e11dd5fdd94dee

See more details on using hashes here.
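
For example, one way to check a downloaded source distribution against the SHA256 digest listed above (assuming a shell with sha256sum, such as most Linux systems):

sha256sum nlpia2-0.0.20.tar.gz    # compare the output with the SHA256 value above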

Provenance

File details

Details for the file nlpia2-0.0.20-py3-none-any.whl.

File metadata

  • Download URL: nlpia2-0.0.20-py3-none-any.whl
  • Upload date:
  • Size: 21.7 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.2.2 CPython/3.9.7 Linux/5.15.0-53-generic

File hashes

Hashes for nlpia2-0.0.20-py3-none-any.whl
Algorithm Hash digest
SHA256 33b5323c91ae72975e2f4868dd53d571cfd815c3ffb61c4920c4b2e0c87c3e9c
MD5 8a5c976787800ef283f20467c20ff9fe
BLAKE2b-256 403690bfe606eb43c8a73e05f6c2dce21a59f506be24d2868be6df11f5aec327

See more details on using hashes here.

Provenance
