Accessing Bluebeam Revu PDF Data

pymkup

pymkup is a Python library for viewing markup lists and property data in PDFs created by Bluebeam Revu.

About

This is an unofficial, reverse-engineered API for accessing data in PDFs authored with Bluebeam Revu. Once a PDF is loaded, its markup lists, spaces, and page labels can be extracted. The project is in very early development and is being developed independently.

Installation

Use the package manager pip to install pymkup.

pip install pymkup

Usage

from pymkup import pymkup
x = pymkup("link to your pdf")

x.check_BB() # Checks if the document was authored by Revu
x.get_page_labels() # Returns page labels/"Page X" formats
x.markup_space() # A list of all spaces in a given markup.
x.get_columns() # Returns master column/property fields list on all annotations
x.spaces_hierarchy() # Generates a spaces tree.
x.spaces_hierarchy(output="dictionary") # Generates a spaces dictionary three levels deep.
x.spaces_hierarchy(output="hierarchy") # Generates a spaces hierarchy for use in the columns list.
x.csv_export() # Exports a CSV file with columns in default order.
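
The calls above can be combined into a small guard that refuses to read a PDF that was not authored by Revu. This is only a sketch: `load_revu_summary` is a hypothetical helper, not part of pymkup, and it assumes that `check_BB()` returns a truthy value for Revu-authored files.

```python
def load_revu_summary(doc):
    """Collect basic data from a pymkup-like object, refusing non-Revu PDFs."""
    # Assumption: check_BB() returns a truthy value for Revu-authored PDFs.
    if not doc.check_BB():
        raise ValueError("PDF does not appear to be authored by Bluebeam Revu")
    return {
        "page_labels": doc.get_page_labels(),
        "columns": doc.get_columns(),
    }

# Usage with the real library (not run here):
# from pymkup import pymkup
# summary = load_revu_summary(pymkup("link to your pdf"))
```

Because the helper only relies on the three methods listed above, it can be exercised with a stand-in object before pointing it at a real file.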

CSV export with custom columns example

First, you should identify the columns that are accessible in your file:

x.get_columns().values()

Second, review the extended columns that can also be added:

['Space', 'Page Number', 'Page Label']

Lastly, you can build the custom columns that you want in your CSV:

columns = ['Subject', 'Label', 'Date', 'PK', 'Space']
x.csv_export(column_list=columns)
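
Before exporting, it may help to check the custom list against the columns that actually exist. The sketch below is one possible way to do that. `build_column_list` is a hypothetical helper, not part of pymkup, and the `available` list stands in for the result of `x.get_columns().values()`.

```python
# Extended columns listed in this README.
EXTENDED_COLUMNS = ['Space', 'Page Number', 'Page Label']

def build_column_list(available, wanted, extended=EXTENDED_COLUMNS):
    """Return `wanted` unchanged if every name is a known column."""
    allowed = set(available) | set(extended)
    unknown = [name for name in wanted if name not in allowed]
    if unknown:
        raise ValueError(f"Unknown columns: {unknown}")
    return list(wanted)

# Stand-in for list(x.get_columns().values()); real files may differ.
available = ['Subject', 'Label', 'Date', 'PK']
columns = build_column_list(available, ['Subject', 'Label', 'Date', 'PK', 'Space'])

# With a loaded document:
# x.csv_export(column_list=columns)
```

A misspelled column name then fails early with a `ValueError` instead of surfacing during the export.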

Example output of spaces tree ("test4.pdf")

test4
├── A101
│   ├── Level 1
│   │   └── Area A
│   │       ├── Room 101
│   │       ├── Room 102
│   │       └── Room 103
│   └── Sub Level
└── Page 2
    └── Level 1
        └── Area B
            ├── Room 151
            │   └── Sub-Room
            └── Room 152
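
To show how nested spaces map onto the tree above, here is a minimal pure-Python sketch that renders a nested dictionary in the same layout. The `spaces` dictionary and `render_tree` function are illustrative assumptions. They are not the data structure pymkup returns, and pymkup itself uses treelib for this step.

```python
def render_tree(root, children):
    """Return the lines of an ASCII tree for a nested {name: children} dict."""
    lines = [root]

    def walk(node, prefix):
        items = list(node.items())
        for index, (name, sub) in enumerate(items):
            last = index == len(items) - 1
            # The last child gets a corner; earlier children get a tee.
            lines.append(prefix + ("└── " if last else "├── ") + name)
            # Children of a last child are indented with spaces, others with a bar.
            walk(sub, prefix + ("    " if last else "│   "))

    walk(children, "")
    return lines

# Hypothetical nested spaces matching the example output above.
spaces = {
    "A101": {
        "Level 1": {"Area A": {"Room 101": {}, "Room 102": {}, "Room 103": {}}},
        "Sub Level": {},
    },
    "Page 2": {
        "Level 1": {"Area B": {"Room 151": {"Sub-Room": {}}, "Room 152": {}}},
    },
}

lines = render_tree("test4", spaces)
print("\n".join(lines))
```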

Requirements

  • pdfrw is a Python library for reading PDF data and performing some other manipulation. As of 2021, its author has not been updating it or accepting pull requests. This may change, but the library is still very functional.
  • treelib is a library to create ASCII hierarchy trees in the spaces_tree() function.

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

License

MIT
