JupyterLab extension that allows reading of DOCX, PPTX, and RTF documents

JupyterLab Document Reader Extension

Brought to you by KOLOMOLO.

A JupyterLab extension for viewing Microsoft Word documents (DOCX, DOC), PowerPoint presentations (PPTX, PPT), and Rich Text Format (RTF) files directly in JupyterLab. The extension converts each document to PDF on the fly for viewing, without creating persistent files.

Features

  • View DOCX, DOC, RTF, PPTX, and PPT files directly in JupyterLab
  • Automatic conversion to PDF for display (no temporary files created in your workspace)
  • Native PDF rendering in the browser
  • PowerPoint support with text, images, and tables rendered from slides
  • Unicode support with automatic font detection for international characters (Polish, German, French, etc.)
  • Read-only mode to prevent accidental modifications
  • Clean, integrated interface matching JupyterLab's design

Architecture

This extension consists of:

  • Python server extension: Handles document-to-PDF conversion using pure Python libraries (python-docx, python-pptx, reportlab, Pillow)
  • TypeScript frontend extension: Provides the document viewer widget and file type registration

Requirements

  • JupyterLab >= 4.0.0
  • Python >= 3.9
  • No external system dependencies required (pure Python solution)

Install

Simply install the extension with pip:

pip install jupyterlab_doc_reader_extension

All required Python dependencies (python-docx, python-pptx, reportlab, Pillow) will be installed automatically.
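To confirm that the dependencies listed above were actually installed, a short check with the standard library's importlib.metadata can help. This is only a sketch: the helper name installed_versions is hypothetical and not part of the extension, and the distribution names are taken from the list above.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


# Distribution names as listed in this README.
DEPENDENCIES = ["python-docx", "python-pptx", "reportlab", "Pillow"]

for dist, version in installed_versions(DEPENDENCIES).items():
    print(f"{dist}: {version or 'NOT INSTALLED'}")
```

Run it with the same Python interpreter that runs JupyterLab, otherwise it may report on a different environment.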

Usage

Once installed, simply click on any .docx, .doc, .rtf, .pptx, or .ppt file in the JupyterLab file browser. The extension will automatically:

  1. Convert the document to PDF on the server
  2. Stream the PDF to your browser
  3. Display it in a dedicated viewer tab

No temporary files are created in your workspace; the conversion happens in memory on the server side.
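The flow above can be illustrated with a minimal, standard-library-only sketch. It is not the extension's actual code: the names SUPPORTED_SUFFIXES, is_supported, and iter_chunks are hypothetical, and the PDF bytes are assumed to have already been produced by a converter. It only shows how a file can be matched by extension and how an in-memory buffer can be streamed in chunks without touching the disk.

```python
import io
from pathlib import PurePosixPath

# File extensions this README says the viewer handles.
SUPPORTED_SUFFIXES = {".docx", ".doc", ".rtf", ".pptx", ".ppt"}


def is_supported(path: str) -> bool:
    """Return True if the path has one of the supported extensions (case-insensitive)."""
    return PurePosixPath(path).suffix.lower() in SUPPORTED_SUFFIXES


def iter_chunks(pdf_bytes: bytes, chunk_size: int = 64 * 1024):
    """Yield the PDF in fixed-size chunks from an in-memory buffer (no temp file)."""
    buffer = io.BytesIO(pdf_bytes)
    while chunk := buffer.read(chunk_size):
        yield chunk
```

A server handler could write each chunk to the HTTP response as it is yielded; how the extension itself does this is not described here.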

Uninstall

To remove the extension, execute:

pip uninstall jupyterlab_doc_reader_extension

Troubleshoot

If you are seeing the frontend extension, but it is not working, check that the server extension is enabled:

jupyter server extension list

If the server extension is installed and enabled, but you are not seeing the frontend extension, check that the frontend extension is installed:

jupyter labextension list
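If you want to script the server extension check, for example in a CI job, the following sketch runs the command and searches its output. It assumes the listing prints the extension name and the word "enabled" on the same line, which may vary between Jupyter Server versions; the function names are hypothetical.

```python
import subprocess

EXTENSION = "jupyterlab_doc_reader_extension"


def server_extension_enabled(listing: str, name: str = EXTENSION) -> bool:
    """Return True if some line of the listing mentions `name` together with 'enabled'."""
    # Substring matching is used because the status word may be wrapped in
    # terminal colour codes. Note that "disabled" does not contain "enabled".
    return any(name in line and "enabled" in line for line in listing.splitlines())


def read_server_extension_listing() -> str:
    """Run `jupyter server extension list` and return everything it printed."""
    result = subprocess.run(
        ["jupyter", "server", "extension", "list"],
        capture_output=True,
        text=True,
        check=False,
    )
    # The listing may be written to stderr, so both streams are combined.
    return result.stdout + result.stderr
```

Calling server_extension_enabled(read_server_extension_listing()) in the environment that runs JupyterLab gives a quick yes/no answer.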

Contributing

Development install

Note: You will need NodeJS to build the extension package.

The jlpm command is JupyterLab's pinned version of yarn that is installed with JupyterLab. You may use yarn or npm in lieu of jlpm below.

# Clone the repo to your local environment
# Change directory to the jupyterlab_doc_reader_extension directory
# Install package in development mode
pip install -e ".[test]"
# Link your development version of the extension with JupyterLab
jupyter labextension develop . --overwrite
# Server extension must be manually installed in develop mode
jupyter server extension enable jupyterlab_doc_reader_extension
# Rebuild extension TypeScript source after making changes
jlpm build

You can watch the source directory and run JupyterLab at the same time in different terminals to watch for changes in the extension's source and automatically rebuild the extension.

# Watch the source directory in one terminal, automatically rebuilding when needed
jlpm watch
# Run JupyterLab in another terminal
jupyter lab

With the watch command running, every saved change will immediately be built locally and available in your running JupyterLab. Refresh JupyterLab to load the change in your browser (you may need to wait several seconds for the extension to be rebuilt).

By default, the jlpm build command generates the source maps for this extension to make it easier to debug using the browser dev tools. To also generate source maps for the JupyterLab core extensions, you can run the following command:

jupyter lab build --minimize=False

Development uninstall

# Server extension must be manually disabled in develop mode
jupyter server extension disable jupyterlab_doc_reader_extension
pip uninstall jupyterlab_doc_reader_extension

In development mode, you will also need to remove the symlink created by the jupyter labextension develop command. To find its location, run jupyter labextension list to see where the labextensions folder is located, then remove the symlink named jupyterlab_doc_reader_extension within that folder.
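The symlink removal can also be done from Python. The sketch below is an illustration only: remove_dev_symlink is a hypothetical helper, and you must pass it the labextensions folder reported by jupyter labextension list. It deletes the entry only if it really is a symlink, so a regular installed copy is left alone.

```python
from pathlib import Path


def remove_dev_symlink(labextensions_dir: str,
                       name: str = "jupyterlab_doc_reader_extension") -> bool:
    """Remove the development symlink if present; return True if it was removed."""
    link = Path(labextensions_dir) / name
    if link.is_symlink():
        # unlink() removes the link itself, not the directory it points to.
        link.unlink()
        return True
    return False
```

The function returns False when nothing was removed, which makes it safe to call more than once.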

Testing the extension

Server tests

This extension uses pytest for Python code testing.

Install test dependencies (needed only once):

pip install -e ".[test]"
# Each time you install the Python package, you need to restore the front-end extension link
jupyter labextension develop . --overwrite

To run the server tests, execute:

pytest -vv -r ap --cov jupyterlab_doc_reader_extension

Frontend tests

This extension uses Jest for JavaScript code testing.

To run the frontend tests, execute:

jlpm
jlpm test

Integration tests

This extension uses Playwright for the integration tests (aka user level tests). More precisely, the JupyterLab helper Galata is used to handle testing the extension in JupyterLab.

More information is provided in the ui-tests README.

Packaging the extension

See RELEASE
