JupyterLab extension for reading DOCX documents

JupyterLab Document Reader Extension

A JupyterLab extension for viewing Microsoft Word documents (DOCX, DOC) and Rich Text Format (RTF) files directly in JupyterLab. The extension converts each document to PDF on the fly for display, without creating persistent files.

Features

  • View DOCX, DOC, and RTF files directly in JupyterLab
  • Automatic conversion to PDF for display (no temporary files created in your workspace)
  • Native PDF rendering in the browser
  • Unicode support with automatic font detection for international characters (Polish, German, French, etc.)
  • Read-only mode to prevent accidental modifications
  • Clean, integrated interface matching JupyterLab's design

Architecture

This extension consists of:

  • Python server extension: Handles document-to-PDF conversion using pure Python libraries (python-docx + reportlab)
  • TypeScript frontend extension: Provides the document viewer widget and file type registration

Requirements

  • JupyterLab >= 4.0.0
  • Python >= 3.9
  • No external system dependencies required (pure Python solution)
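
A quick way to confirm that an environment meets these minimums is a short Python check. The sketch below is illustrative only and not part of the extension: the helper name meets_minimum is hypothetical, and the version parsing is deliberately simple (it reads only the leading numeric components, so a pre-release such as 4.0.0rc1 is treated as 4.0.0).

```python
import re
import sys
from importlib import metadata


def meets_minimum(version: str, minimum: tuple) -> bool:
    """Compare the leading numeric components of a version string with a minimum."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        return False
    parts = tuple(int(p) for p in match.group().split("."))
    # Pad short versions so that "4.0" compares equal to (4, 0, 0).
    parts += (0,) * (len(minimum) - len(parts))
    return parts >= minimum


if __name__ == "__main__":
    print("Python >= 3.9:", sys.version_info >= (3, 9))
    try:
        lab_version = metadata.version("jupyterlab")
    except metadata.PackageNotFoundError:
        print("JupyterLab is not installed in this environment")
    else:
        print("JupyterLab >= 4.0.0:", meets_minimum(lab_version, (4, 0, 0)))
```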

Install

Simply install the extension with pip:

pip install jupyterlab_doc_reader_extension

All required Python dependencies (python-docx, reportlab) will be installed automatically.
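
To confirm that the dependencies were pulled in, you can check that their import packages are discoverable. This is a minimal sketch rather than anything shipped with the extension: missing_modules is a hypothetical helper, and it relies on python-docx being importable as docx and reportlab as reportlab.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # python-docx installs the import package "docx".
    missing = missing_modules(["docx", "reportlab"])
    print("All dependencies found" if not missing else f"Missing: {missing}")
```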

Usage

Once installed, simply click on any .docx, .doc, or .rtf file in the JupyterLab file browser. The extension will automatically:

  1. Convert the document to PDF on the server
  2. Stream the PDF to your browser
  3. Display it in a dedicated viewer tab

No temporary files are created in your workspace - the conversion happens in memory on the server side.
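
The flow described above can be sketched in a few lines of Python. This is an illustration of in-memory conversion in general, not the extension's actual code: the function names are hypothetical, and placeholder_writer is a stand-in that writes only a PDF header where a real converter (for example one built on python-docx and reportlab) would write a complete document.

```python
import io
from pathlib import PurePosixPath
from typing import BinaryIO, Callable

# Extensions named in the feature list.
SUPPORTED_SUFFIXES = {".docx", ".doc", ".rtf"}


def is_supported(filename: str) -> bool:
    """Return True if the file extension is one the viewer handles."""
    return PurePosixPath(filename).suffix.lower() in SUPPORTED_SUFFIXES


def convert_in_memory(
    source: bytes, write_pdf: Callable[[BinaryIO, BinaryIO], None]
) -> bytes:
    """Run a converter against in-memory buffers and return the PDF bytes."""
    pdf_buffer = io.BytesIO()
    write_pdf(io.BytesIO(source), pdf_buffer)
    return pdf_buffer.getvalue()


def placeholder_writer(src: BinaryIO, dst: BinaryIO) -> None:
    """Stand-in converter (hypothetical): emits a PDF header, not a valid PDF."""
    src.read()
    dst.write(b"%PDF-1.4\n")


if __name__ == "__main__":
    if is_supported("report.docx"):
        pdf_bytes = convert_in_memory(b"raw document bytes", placeholder_writer)
        print(len(pdf_bytes), "bytes ready to stream as application/pdf")
```

Because both the input and the output live in io.BytesIO buffers, nothing touches the file system, which matches the no-temporary-files behavior described above.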

Uninstall

To remove the extension, execute:

pip uninstall jupyterlab_doc_reader_extension

Troubleshoot

If the frontend extension appears but does not work, check that the server extension is enabled:

jupyter server extension list

If the server extension is installed and enabled, but you are not seeing the frontend extension, check that the frontend extension is installed:

jupyter labextension list
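
Both checks can be scripted. The sketch below is an assumption-laden convenience, not part of the extension: the helper names are hypothetical, it only tests whether the extension name appears in each listing (it does not parse the enabled or disabled status), and it combines stdout and stderr because the jupyter list commands may write their listing to either stream.

```python
import subprocess

EXTENSION = "jupyterlab_doc_reader_extension"


def mentions_extension(listing: str, name: str = EXTENSION) -> bool:
    """Return True if any line of a listing mentions the extension name."""
    return any(name in line for line in listing.splitlines())


def run_listing(command):
    """Run a jupyter list command and return stdout and stderr combined."""
    result = subprocess.run(command, capture_output=True, text=True, check=False)
    return result.stdout + result.stderr


if __name__ == "__main__":
    for command in (
        ["jupyter", "server", "extension", "list"],
        ["jupyter", "labextension", "list"],
    ):
        found = mentions_extension(run_listing(command))
        print(" ".join(command), "->", "listed" if found else "not listed")
```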

Contributing

Development install

Note: You will need NodeJS to build the extension package.

The jlpm command is JupyterLab's pinned version of yarn that is installed with JupyterLab. You may use yarn or npm in lieu of jlpm below.

# Clone the repo to your local environment
# Change directory to the jupyterlab_doc_reader_extension directory
# Install package in development mode
pip install -e ".[test]"
# Link your development version of the extension with JupyterLab
jupyter labextension develop . --overwrite
# Server extension must be manually installed in develop mode
jupyter server extension enable jupyterlab_doc_reader_extension
# Rebuild extension TypeScript source after making changes
jlpm build

You can watch the source directory and run JupyterLab at the same time in different terminals to watch for changes in the extension's source and automatically rebuild the extension.

# Watch the source directory in one terminal, automatically rebuilding when needed
jlpm watch
# Run JupyterLab in another terminal
jupyter lab

With the watch command running, every saved change will immediately be built locally and available in your running JupyterLab. Refresh JupyterLab to load the change in your browser (you may need to wait several seconds for the extension to be rebuilt).

By default, the jlpm build command generates the source maps for this extension to make it easier to debug using the browser dev tools. To also generate source maps for the JupyterLab core extensions, you can run the following command:

jupyter lab build --minimize=False

Development uninstall

# Server extension must be manually disabled in develop mode
jupyter server extension disable jupyterlab_doc_reader_extension
pip uninstall jupyterlab_doc_reader_extension

In development mode, you will also need to remove the symlink created by the jupyter labextension develop command. Run jupyter labextension list to find where the labextensions folder is located, then remove the symlink named jupyterlab_doc_reader_extension within that folder.
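
If you prefer to check for the symlink from Python, something like the following may help. It is a sketch under assumptions: find_dev_link is a hypothetical helper, and the default folder shown (share/jupyter/labextensions under sys.prefix) is only a common location, so confirm the real path with jupyter labextension list before removing anything.

```python
import sys
from pathlib import Path
from typing import Optional

EXTENSION = "jupyterlab_doc_reader_extension"


def find_dev_link(labextensions_dir: Path, name: str = EXTENSION) -> Optional[Path]:
    """Return the entry for the extension if it is a symlink, else None."""
    candidate = labextensions_dir / name
    return candidate if candidate.is_symlink() else None


if __name__ == "__main__":
    # Assumed default location; confirm with `jupyter labextension list`.
    default_dir = Path(sys.prefix) / "share" / "jupyter" / "labextensions"
    link = find_dev_link(default_dir)
    if link is None:
        print("No development symlink found in", default_dir)
    else:
        print("Development symlink:", link)  # remove it with link.unlink()
```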

Testing the extension

Server tests

This extension uses pytest for Python code testing.

Install test dependencies (needed only once):

pip install -e ".[test]"
# Each time you install the Python package, you need to restore the front-end extension link
jupyter labextension develop . --overwrite

To execute them, run:

pytest -vv -r ap --cov jupyterlab_doc_reader_extension

Frontend tests

This extension uses Jest for JavaScript code testing.

To run them, execute:

jlpm
jlpm test

Integration tests

This extension uses Playwright for the integration tests (also known as user-level tests). More precisely, the JupyterLab helper Galata is used to test the extension in JupyterLab.

More information is provided in the ui-tests README.

Packaging the extension

See RELEASE

Download files

Source Distribution

jupyterlab_doc_reader_extension-1.0.1.tar.gz (537.1 kB, source)

Built Distribution

jupyterlab_doc_reader_extension-1.0.1-py3-none-any.whl (24.7 kB, Python 3)

File hashes

Hashes for jupyterlab_doc_reader_extension-1.0.1.tar.gz
Algorithm Hash digest
SHA256 41b3cdff9d9a3483ba8233e08aed7530227280f531951fd47d7eaf5cf86b3474
MD5 2ac09ed4f5a5576b80d4d2fa5886b5a1
BLAKE2b-256 9ec457cf7210eecdf3d02ce4187fcf121ead06f1d1c074249109b7b37e653d98

Hashes for jupyterlab_doc_reader_extension-1.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 f6d60ab5bd837a01ff8f6e7c700cd17d1879a193ad007f7d2432d851ffe393e8
MD5 94faf251182d77abe213ad167271289f
BLAKE2b-256 463d46338a0cf3b00e5f9485d6faa41ebae10f6f4821279ea8d00836f22518d3
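
The SHA256 digests listed above can be compared against a downloaded file with Python's hashlib. This is a minimal sketch: sha256_of is a hypothetical helper name, and the local file path is an assumption about where you saved the download.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Hypothetical local path to the downloaded wheel.
    wheel = Path("jupyterlab_doc_reader_extension-1.0.1-py3-none-any.whl")
    expected = "f6d60ab5bd837a01ff8f6e7c700cd17d1879a193ad007f7d2432d851ffe393e8"
    if wheel.exists():
        print("match" if sha256_of(wheel) == expected else "MISMATCH")
    else:
        print("Download the wheel first:", wheel)
```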
