
Interactive NLP course labs for Jupyter, Colab, and Deepnote

The course website is located here. Lecture materials, assignments, and quizzes can be accessed at that link. You will need an API key to submit notebooks; it will be provided to you via email.

This site contains Jupyter notebooks, data, and other code artifacts associated with this course.

Choosing a Notebook Environment

Most work will not require the use of GPUs. You can probably get away with not using them at all, unless you have a particular desire to do so.
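
If you are unsure whether your current environment has a GPU at all, a quick check like the following may help. This is a minimal sketch, assuming PyTorch is the library you would use the GPU with; the function name gpu_available is made up for illustration and is not part of this package.

```python
import importlib.util


def gpu_available() -> bool:
    """Return True only if PyTorch is installed and reports a CUDA device."""
    if importlib.util.find_spec("torch") is None:
        # PyTorch is not installed, so there is nothing to query.
        return False
    import torch

    return bool(torch.cuda.is_available())


print("GPU available:", gpu_available())
```

A result of False is fine for most labs, since most work does not need a GPU.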

Google Colab - single notebook experience

Google Colab may suit you if you:

  • prefer working within a single notebook
  • are already comfortable with Google Colab
  • don’t mind re-installing dependencies on restart
  • need access to GPUs

Deepnote

Deepnote may suit you if you want:

  • easy installation and more persistent dependencies
  • a large number of system integrations
  • DataFrame charts, interactive widgets, dashboards, and app deployment
  • real-time collaboration

You will need to create a free account and then request an education plan. To use GPUs or higher-performance machines, you must add a payment method, but you do not need to upgrade the plan.

All students will be given links to Deepnote for labs.

Local JupyterLab / Notebook

Local Jupyter may suit you if you are already comfortable with Jupyter in your local environment and you:

  • want full control of your machine and environment
  • want persistent dependencies
  • don’t mind managing your environment yourself

The downside is that there is no GPU access unless you know how to set up something like a remote Modal function that uses a GPU.

Installation

For Students (Google Colab)

To use Colab and submit for credit:

  • Download a notebook from GitHub
  • Upload a local copy of the notebook to Colab
  • Save a copy in Drive
  • Ensure the file name matches the variable NOTEBOOK_NAME in the section “Submit Notebook for Credit”.

Saving to Drive and matching the filename are only required if you are submitting for credit.

You will need to add SUBMIT_API_KEY to your environment variables.
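
Before submitting, it can be useful to confirm that the key is actually visible to your notebook. The sketch below only reads the SUBMIT_API_KEY environment variable named above; the helper get_submit_api_key is hypothetical and is not the package's own submission code.

```python
import os


def get_submit_api_key(env=None) -> str:
    """Return SUBMIT_API_KEY with surrounding whitespace removed, or raise."""
    env = os.environ if env is None else env
    key = env.get("SUBMIT_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "SUBMIT_API_KEY is not set; add it to your environment variables."
        )
    return key


# Example use inside a notebook cell:
# key = get_submit_api_key()
```

Failing early with a clear message is usually easier to debug than a rejected submission later.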

For Deepnote

Every week, a new link to a Deepnote project will be posted. The first time you click a link, you will be asked to log in or sign up to see the project. If you sign up, you’ll get a free 14-day trial of the Team plan, and from there you can request the education plan.

  • When the project opens, click Duplicate (top right).
  • This creates your own private copy of the lab.
  • You will need to add SUBMIT_API_KEY to your environment variables.

For Local Development

If you want to run notebooks locally:

# Clone the repository
git clone https://github.com/su-dataAI/data401-nlp.git
cd data401-nlp

# If you don't have uv, install it first:
#   curl -LsSf https://astral.sh/uv/install.sh | sh   (macOS/Linux)
#   pip install uv                                     (fallback)

# Create a virtual environment using uv (requires Python 3.11+)
# If you want to use Python 3.13+, you will need to upgrade torch to torch>=2.1,<2.6
uv venv --python 3.11

# Activate the virtual environment
# On macOS/Linux:
source .venv/bin/activate
# On Windows:
.venv\Scripts\activate

# Install with all dependencies
uv pip install -e ".[dev,all]"

# Download spaCy model
python -m spacy download en_core_web_sm

# Start Jupyter Lab
jupyter lab

# Add a .env file (in the repository root or the nbs folder)

You will need to git pull when each new lab is posted.
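
For the .env file mentioned above, the following is a minimal sketch of how KEY=VALUE lines could be loaded into the environment using only the standard library. It assumes the file holds entries such as SUBMIT_API_KEY=...; the function load_env_file is hypothetical and is not how the package's own helpers are necessarily implemented.

```python
import os
from pathlib import Path


def load_env_file(path) -> dict:
    """Parse KEY=VALUE lines from a .env file.

    Returns the parsed pairs. Variables already present in os.environ
    are left unchanged; only missing ones are added.
    """
    loaded = {}
    env_path = Path(path)
    if not env_path.is_file():
        return loaded
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip("'\"")  # drop optional surrounding quotes
        os.environ.setdefault(key, value)
        loaded[key] = value
    return loaded


# Example: try the repository root first, then the nbs folder.
# for candidate in (".env", "nbs/.env"):
#     if load_env_file(candidate):
#         break
```

Keep the .env file out of version control, since it contains your personal API key.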

Installation Options

The package supports flexible installation based on your needs:

# Minimal installation (core utilities only)
pip install data401-nlp

# With NLP tools (spaCy, NLTK)
# (quotes keep shells such as zsh from interpreting the brackets)
pip install "data401-nlp[nlp]"

# With transformers and PyTorch
pip install "data401-nlp[transformers]"

# With API support (FastAPI, Pydantic)
pip install "data401-nlp[api]"

# Everything (recommended for students)
pip install "data401-nlp[all]"
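
After installing, you may want to confirm which optional dependencies are importable. The sketch below assumes each extra provides the libraries named in the comments above (for example, that the nlp extra provides spacy and nltk); the mapping EXTRA_MODULES and the function missing_modules are illustrative and are not read from the package metadata.

```python
import importlib.util

# Assumed mapping from each extra to the import names it is described as providing.
EXTRA_MODULES = {
    "nlp": ["spacy", "nltk"],
    "transformers": ["transformers", "torch"],
    "api": ["fastapi", "pydantic"],
}


def missing_modules(extra: str) -> list:
    """Return the import names for this extra that cannot be found."""
    return [
        name
        for name in EXTRA_MODULES[extra]
        if importlib.util.find_spec(name) is None
    ]


for extra in EXTRA_MODULES:
    missing = missing_modules(extra)
    status = "ok" if not missing else "missing " + ", ".join(missing)
    print(f"[{extra}] {status}")
```

If an extra shows missing modules, re-running the matching pip install command above should resolve it.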

Platform Support

✅ Google Colab
✅ Deepnote
✅ Jupyter Lab
✅ Local Python 3.11+
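
To see which of these platforms a notebook is running on, a heuristic such as the following can be used. It is a sketch under stated assumptions: Colab is detected by the google.colab module already being loaded, and Deepnote by the presence of a DEEPNOTE_PROJECT_ID environment variable (an assumption about Deepnote's runtime, not something this README states). The function names are hypothetical.

```python
import os
import sys


def detect_environment(modules=None, env=None) -> str:
    """Guess the notebook platform: 'colab', 'deepnote', or 'local'."""
    modules = sys.modules if modules is None else modules
    env = os.environ if env is None else env
    if "google.colab" in modules:
        return "colab"
    if "DEEPNOTE_PROJECT_ID" in env:  # assumed Deepnote marker
        return "deepnote"
    return "local"


def python_supported(version=None) -> bool:
    """Check the interpreter against the Python 3.11+ requirement."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= (3, 11)


print("Environment:", detect_environment())
print("Python 3.11+:", python_supported())
```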

Helper Modules

The package includes several helper modules to make your NLP work easier:

  • data401_nlp.helpers.env - Environment detection and API key loading
  • data401_nlp.helpers.spacy - Automatic spaCy model management
  • data401_nlp.helpers.submit - Assignment submission utilities
  • data401_nlp.helpers.llm - LLM integration helpers

The helper libraries may be updated as the course proceeds.
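
The helper modules' own functions are not documented here, so the following does not call them. As a rough illustration of the kind of check that spaCy model management involves, this sketch tests whether a model package is installed, relying on the fact that spaCy models are installed as ordinary Python packages. The function spacy_model_installed is hypothetical.

```python
import importlib.util


def spacy_model_installed(name: str = "en_core_web_sm") -> bool:
    """Return True if the named spaCy model package can be found."""
    return importlib.util.find_spec(name) is not None


if not spacy_model_installed():
    print("Model not found. Run: python -m spacy download en_core_web_sm")
```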

Contents

Lab | Deepnote | GitHub
Intro (Jan 15) | Open in Deepnote | Open in GitHub
EDA (Jan 27) | Open in Deepnote | Open in GitHub
Regex (Feb 3) | Open in Deepnote | Open in GitHub
EDA (Feb 3; optional part 2) | Open in Deepnote | Open in GitHub
