JupyterLab extension that allows reading of DOCX, PPTX, and RTF documents

JupyterLab Document Reader Extension

Brought to you by KOLOMOLO.

A JupyterLab extension for viewing Microsoft Word documents (DOCX, DOC), PowerPoint presentations (PPTX, PPT), and Rich Text Format (RTF) files directly in JupyterLab. The extension converts each document to PDF on the fly when you open it, so nothing persistent is written to disk.

Features

  • View DOCX, DOC, RTF, PPTX, and PPT files directly in JupyterLab
  • Automatic conversion to PDF for display (no temporary files created in your workspace)
  • Native PDF rendering in the browser
  • PowerPoint support with text, images, and tables rendered from slides
  • Unicode support with automatic font detection for international characters (Polish, German, French, etc.)
  • Read-only mode to prevent accidental modifications
  • Clean, integrated interface matching JupyterLab's design

Architecture

This extension consists of:

  • Python server extension: Handles document-to-PDF conversion using pure Python libraries (python-docx, python-pptx, reportlab, Pillow)
  • TypeScript frontend extension: Provides the document viewer widget and file type registration

Requirements

  • JupyterLab >= 4.0.0
  • Python >= 3.9
  • No external system dependencies required (pure Python solution)

Install

Install the extension with pip:

pip install jupyterlab_doc_reader_extension

All required Python dependencies (python-docx, python-pptx, reportlab, Pillow) will be installed automatically.
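
To confirm that the dependencies listed above actually landed in the active environment, a small check along these lines may help. This is a sketch rather than part of the extension: the helper name installed_versions is made up for illustration, and the distribution names are taken from this README.

```python
import sys
from importlib import metadata

# Distribution names as listed in this README.
REQUIRED = ["python-docx", "python-pptx", "reportlab", "Pillow"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # The README asks for Python >= 3.9.
    print("Python >= 3.9:", sys.version_info >= (3, 9))
    for name, version in installed_versions(REQUIRED).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Run it with the same Python interpreter that runs JupyterLab; a different environment will report different results.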

Usage

Once installed, simply click on any .docx, .doc, .rtf, .pptx, or .ppt file in the JupyterLab file browser. The extension will automatically:

  1. Convert the document to PDF on the server
  2. Stream the PDF to your browser
  3. Display it in a dedicated viewer tab

No temporary files are created in your workspace: the conversion happens in memory on the server side.
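
As a rough illustration of what "in memory" means here, the sketch below reads the paragraph text of a DOCX from raw bytes without writing anything to disk. It is not the extension's code: the extension is described above as using python-docx and reportlab, while this example uses only the standard library, covers only the reading half (no PDF rendering), and the function name docx_paragraphs is hypothetical. It relies on the fact that a DOCX file is a ZIP archive whose main body lives in word/document.xml.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_paragraphs(data: bytes) -> list[str]:
    """Return the text of each paragraph in a DOCX given as raw bytes."""
    # BytesIO lets zipfile treat the bytes as a file, so no temp file is needed.
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # A paragraph's text is split across one or more <w:t> run elements.
        text = "".join(node.text or "" for node in para.iter(W_NS + "t"))
        paragraphs.append(text)
    return paragraphs
```

A server handler could pass the uploaded or on-disk file's bytes to a function like this and build its response from the result, keeping the whole round trip free of intermediate files.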

Uninstall

To remove the extension, execute:

pip uninstall jupyterlab_doc_reader_extension

Troubleshoot

If you can see the frontend extension but it is not working, check that the server extension is enabled:

jupyter server extension list

If the server extension is installed and enabled but you do not see the frontend extension, check that the frontend extension is installed:

jupyter labextension list
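
Both checks can be scripted. The sketch below runs the two commands above and reports whether the extension name appears anywhere in their output. It makes no assumption about the exact layout of that output, so it only tells you whether the extension is mentioned, not whether it is enabled or healthy; the helper names are hypothetical.

```python
import subprocess

EXTENSION = "jupyterlab_doc_reader_extension"


def mentions_extension(listing: str, name: str = EXTENSION) -> bool:
    """True if any line of a `jupyter ... list` output mentions the extension."""
    return any(name in line for line in listing.splitlines())


def run_listing(args):
    """Run a listing command and return stdout and stderr combined.

    Both streams are captured because the jupyter commands may write their
    listing to either one. Returns an empty string if the command is missing.
    """
    try:
        result = subprocess.run(args, capture_output=True, text=True, check=False)
    except FileNotFoundError:
        return ""
    return result.stdout + result.stderr


if __name__ == "__main__":
    for args in (["jupyter", "server", "extension", "list"],
                 ["jupyter", "labextension", "list"]):
        found = mentions_extension(run_listing(args))
        print(" ".join(args), "->", "listed" if found else "NOT listed")
```

If either command reports "NOT listed", read the full command output by hand before reinstalling.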

Contributing

Development install

Note: You will need NodeJS to build the extension package.

The jlpm command is JupyterLab's pinned version of yarn that is installed with JupyterLab. You may use yarn or npm in lieu of jlpm below.

# Clone the repo to your local environment
# Change directory to the jupyterlab_doc_reader_extension directory
# Install package in development mode
pip install -e ".[test]"
# Link your development version of the extension with JupyterLab
jupyter labextension develop . --overwrite
# Server extension must be manually enabled in develop mode
jupyter server extension enable jupyterlab_doc_reader_extension
# Rebuild extension TypeScript source after making changes
jlpm build

You can watch the source directory and run JupyterLab at the same time in different terminals to watch for changes in the extension's source and automatically rebuild the extension.

# Watch the source directory in one terminal, automatically rebuilding when needed
jlpm watch
# Run JupyterLab in another terminal
jupyter lab

With the watch command running, every saved change will immediately be built locally and available in your running JupyterLab. Refresh JupyterLab to load the change in your browser (you may need to wait several seconds for the extension to be rebuilt).

By default, the jlpm build command generates the source maps for this extension to make it easier to debug using the browser dev tools. To also generate source maps for the JupyterLab core extensions, you can run the following command:

jupyter lab build --minimize=False

Development uninstall

# Server extension must be manually disabled in develop mode
jupyter server extension disable jupyterlab_doc_reader_extension
pip uninstall jupyterlab_doc_reader_extension

In development mode, you will also need to remove the symlink created by the jupyter labextension develop command. Run jupyter labextension list to find where the labextensions folder is located, then remove the symlink named jupyterlab_doc_reader_extension within that folder.
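
The symlink cleanup can also be done from Python. The following is a minimal sketch with hypothetical helper names; it assumes you pass in the labextensions folder (or folders) reported by jupyter labextension list, and it only touches entries that are symlinks, so a regular installed copy is left alone.

```python
from pathlib import Path

EXTENSION = "jupyterlab_doc_reader_extension"


def find_dev_symlinks(labextension_dirs, name=EXTENSION):
    """Return symlinks called `name` found directly inside the given folders."""
    links = []
    for folder in labextension_dirs:
        candidate = Path(folder) / name
        if candidate.is_symlink():
            links.append(candidate)
    return links


def remove_dev_symlinks(labextension_dirs, name=EXTENSION):
    """Unlink each matching symlink; the link target itself is left untouched."""
    removed = find_dev_symlinks(labextension_dirs, name)
    for link in removed:
        link.unlink()
    return removed
```

Calling find_dev_symlinks first and inspecting the result before calling remove_dev_symlinks is a sensible precaution.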

Testing the extension

Server tests

This extension uses pytest for Python code testing.

Install test dependencies (needed only once):

pip install -e ".[test]"
# Each time you install the Python package, you need to restore the front-end extension link
jupyter labextension develop . --overwrite

To run the server tests:

pytest -vv -r ap --cov jupyterlab_doc_reader_extension

Frontend tests

This extension uses Jest for JavaScript code testing.

To run the frontend tests:

jlpm
jlpm test

Integration tests

This extension uses Playwright for the integration tests (also known as user-level tests). More precisely, the JupyterLab helper Galata is used to test the extension in JupyterLab.

More information is provided in the ui-tests README.

Packaging the extension

See RELEASE
