Expansion to the unstructured package, adding support for image extraction.

Unstructured Expanded

The unstructured_expanded library is a wrapper around the unstructured open source library to add image-extraction capabilities to the API.

Its sole purpose is to provide a more complete API for the unstructured library, because the maintainers of the open source project have chosen to lock image extraction for office documents behind a paywall.

Quick-Start

This library is meant to be used in conjunction with the unstructured library.

Each version of this library matches the version of the unstructured library it is based on.

# Install the variant of unstructured with everything you need support for
pip install "unstructured[all-docs]"

# Install the unstructured_expanded library on top of it
pip install unstructured_expanded
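
Because each release is stated to track the unstructured version it is based on, one quick sanity check after installing is to compare the two installed versions. The following is a minimal sketch using only the Python standard library. The helper names (installed_version, release_tuple, same_release) are hypothetical and not part of either package, and the sketch assumes that "matching" means the leading numeric release components (such as 0.16.5) are equal.

```python
import re
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def release_tuple(version_string: str) -> Tuple[int, ...]:
    """Extract the leading numeric components, e.g. '0.16.5.post1' -> (0, 16, 5)."""
    match = re.match(r"\d+(?:\.\d+)*", version_string)
    if match is None:
        raise ValueError(f"not a numeric version: {version_string!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def same_release(a: str, b: str) -> bool:
    """Assumption: two versions 'match' when their numeric release parts are equal."""
    return release_tuple(a) == release_tuple(b)


if __name__ == "__main__":
    base = installed_version("unstructured")
    expanded = installed_version("unstructured_expanded")
    if base is None or expanded is None:
        print("One or both packages are not installed.")
    elif same_release(base, expanded):
        print(f"Versions match: {base}")
    else:
        print(f"Version mismatch: unstructured={base}, unstructured_expanded={expanded}")
```

If the check reports a mismatch, pinning both packages to the same version number in the pip commands above is one way to bring them back in line.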

License

See the licensing information in the LICENSE file.

Citation

If you use this library in your research, please include a citation:

@misc{unstructured_expanded,
  title={Unstructured_expanded: A Python Library for Extracting Text and Images from Documents using the unstructured API.},
  author={Kogan, Isaac},
  year={2024},
  url={https://github.com/isaackogan/unstructured_expanded}
}
