
Streamlit component for PDF visualisation and manipulation

Project description

streamlit-pdf-viewer

Streamlit component that allows the visualisation and enrichment of PDF documents. Tested on Chrome and Firefox. You can see an application in action here.

Work in progress

The project is in early development, and new contributors are welcome.

If the PDF is not shown with version 0.0.8, please use 0.0.7.

Getting started

pip install streamlit-pdf-viewer

In your Streamlit application, use it as follows:

import streamlit as st
from streamlit_pdf_viewer import pdf_viewer

pdf_viewer("str, path or bytes")
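A common case is displaying a PDF that the user uploads. The sketch below assumes Streamlit's st.file_uploader and passes the uploaded bytes to pdf_viewer. The names looks_like_pdf and show_uploaded_pdf are hypothetical helpers introduced for illustration; they are not part of the component.

```python
def looks_like_pdf(data: bytes) -> bool:
    """Return True if the data starts with the PDF header marker '%PDF-'."""
    return data.startswith(b"%PDF-")


def show_uploaded_pdf() -> None:
    """Let the user upload a PDF and display it with the component."""
    # Imported here so that looks_like_pdf can be used without Streamlit installed.
    import streamlit as st
    from streamlit_pdf_viewer import pdf_viewer

    uploaded = st.file_uploader("Upload a PDF", type="pdf")
    if uploaded is None:
        return  # nothing uploaded yet

    data = uploaded.getvalue()  # raw bytes of the uploaded file
    if looks_like_pdf(data):
        pdf_viewer(data)
    else:
        st.error("The uploaded file does not look like a PDF.")
```

Calling show_uploaded_pdf() at the top level of a script started with streamlit run should be enough to try it out.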

Options

Params

The following table lists the parameters that can be passed to the pdf_viewer function:

name description
input The source of the PDF file. Accepts a file path, URL, or binary data.
width Width of the PDF viewer in pixels. It defaults to 700 pixels.
height Height of the PDF viewer in pixels. If not provided, the viewer shows the whole content.
annotations A list of annotations to be overlaid on the PDF. Format is described here.
pages_vertical_spacing The vertical space (in pixels) between each page of the PDF. Defaults to 2 pixels.
annotation_outline_size Size of the outline around each annotation in pixels. Defaults to 1 pixel.
rendering Type of rendering: unwrap (default), legacy_iframe, or legacy_embed. The default, unwrap, renders the PDF document with pdf.js and supports the visualisation of annotations. legacy_iframe and legacy_embed use the legacy approach of injecting the document into an <iframe> or <embed> element, respectively. These methods enable the default PDF viewer of Firefox/Chrome/Edge, which offers additional features that this component does not implement yet. NOTE: Annotations are ignored by both legacy_iframe and legacy_embed.
pages_to_render Filter the rendering to a specific set of pages. By default all pages are rendered.
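As a rough illustration of how these parameters combine, the sketch below collects them into keyword arguments before calling pdf_viewer. The helpers build_viewer_options and show_pdf are hypothetical, and the sketch assumes that pages_to_render takes a list of page numbers, which the table above does not state explicitly.

```python
from typing import Any, Dict, List, Optional

# Rendering modes listed in the parameter table.
RENDERING_MODES = ("unwrap", "legacy_iframe", "legacy_embed")


def build_viewer_options(
    width: int = 700,
    height: Optional[int] = None,
    rendering: str = "unwrap",
    annotations: Optional[List[Dict[str, Any]]] = None,
    pages_to_render: Optional[List[int]] = None,
) -> Dict[str, Any]:
    """Collect keyword arguments for pdf_viewer (hypothetical helper)."""
    if rendering not in RENDERING_MODES:
        raise ValueError(f"Unknown rendering mode: {rendering!r}")

    options: Dict[str, Any] = {"width": width, "rendering": rendering}
    if height is not None:
        options["height"] = height
    # Annotations are ignored by the legacy modes, so only pass them for unwrap.
    if annotations and rendering == "unwrap":
        options["annotations"] = annotations
    # Assumption: pages_to_render is a list of page numbers.
    if pages_to_render:
        options["pages_to_render"] = list(pages_to_render)
    return options


def show_pdf(source, **options) -> None:
    """Display the PDF; imported lazily so the helper above works without Streamlit."""
    from streamlit_pdf_viewer import pdf_viewer

    pdf_viewer(source, **options)
```

Inside a Streamlit script this could be used as show_pdf("paper.pdf", **build_viewer_options(height=800, pages_to_render=[1, 2])), where paper.pdf is a placeholder path.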

Annotation format

The annotation format is derived from Grobid's coordinate format, which describes annotations as a list of "bounding boxes". Each annotation is a dictionary of six elements: page, x, y, height, width and color. The page, x and y values identify the top-left corner of the box, while height and width give its size. The color follows the HTML/CSS convention.

Here is an example:

[
   {
      "page": 1,
      "x": 220,
      "y": 155,
      "height": 22,
      "width": 65,
      "color": "red"
   },
   [...]
]

The example shown in our screenshot can be found here.
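The format described above can be built and checked programmatically. The following is a minimal sketch: make_annotation and check_annotations are hypothetical helpers, not functions of the component, and the second box uses made-up coordinates.

```python
from typing import Any, Dict, List

# The six elements each annotation is expected to carry.
REQUIRED_KEYS = {"page", "x", "y", "height", "width", "color"}


def make_annotation(page: int, x: float, y: float, width: float,
                    height: float, color: str = "red") -> Dict[str, Any]:
    """Build one bounding box; (x, y) is the top-left corner on the given page."""
    return {"page": page, "x": x, "y": y,
            "height": height, "width": width, "color": color}


def check_annotations(annotations: List[Dict[str, Any]]) -> None:
    """Raise ValueError if any annotation lacks one of the six elements."""
    for index, annotation in enumerate(annotations):
        missing = REQUIRED_KEYS - set(annotation)
        if missing:
            raise ValueError(f"Annotation {index} is missing: {sorted(missing)}")


annotations = [
    # Same values as the example above.
    make_annotation(page=1, x=220, y=155, width=65, height=22),
    # Illustrative second box with a CSS hex color.
    make_annotation(page=1, x=220, y=190, width=65, height=22, color="#00ff00"),
]
check_annotations(annotations)
```

The resulting list can then be passed to pdf_viewer through the annotations parameter.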

Developers notes

Environment

  • Python >= 3.8
  • Node.js >= 16
  • Streamlit >= 1.28.2

Configure environment for development

First, make sure that _RELEASE = False in streamlit_pdf_viewer/__init__.py. To run the component in development mode, use the following commands:

streamlit run streamlit_pdf_viewer/__init__.py

cd frontend
npm run serve

These commands start the Streamlit application and serve the Node.js component. Run the first command from the repository root and the second from the frontend directory, each in its own terminal.
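For readers unfamiliar with the _RELEASE flag, the sketch below shows how such a flag typically selects between a development server and a static build when a Streamlit component is declared. It is not the content of streamlit_pdf_viewer/__init__.py: the port 3001, the frontend/dist build directory and the helper names are assumptions made for illustration.

```python
import os
from typing import Dict

_RELEASE = False  # switch to True before building and installing the package


def component_source(release: bool, package_dir: str) -> Dict[str, str]:
    """Return the keyword argument that tells Streamlit where the frontend lives."""
    if release:
        # Release mode: serve the static files produced by `npm run build`.
        # Assumption: the build output is written to frontend/dist.
        return {"path": os.path.join(package_dir, "frontend", "dist")}
    # Development mode: connect to the server started by `npm run serve`.
    # Assumption: the dev server listens on port 3001.
    return {"url": "http://localhost:3001"}


def declare_viewer_component():
    """Declare the component; Streamlit is imported lazily."""
    import streamlit.components.v1 as components

    package_dir = os.path.dirname(os.path.abspath(__file__))
    return components.declare_component(
        "streamlit_pdf_viewer", **component_source(_RELEASE, package_dir)
    )
```

With a switch of this kind, a component left in development mode fails to load unless the frontend server is running, which is why the flag has to be checked before each workflow.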

Integrate into a Streamlit application

  1. Build the frontend part:

    cd frontend
    export NODE_OPTIONS=--openssl-legacy-provider
    npm run build 
    
  2. Make sure that _RELEASE = True in streamlit_pdf_viewer/__init__.py.

  3. Move to your Streamlit application and run:

    pip install -e {path of component}
    

Release

bump-my-version bump patch   # or: minor, or major
git push
git push --tags 
