An extension of the unstructured package that adds support for image extraction.

Project description

Unstructured Expanded

The unstructured_expanded library wraps the open-source unstructured library and adds image-extraction capabilities to its API.

Its sole purpose is to provide a more complete API for unstructured, since the maintainers of the open-source project have chosen to put image extraction for office documents behind a paywall.

Quick-Start

This library is meant to be installed alongside the unstructured library, not as a replacement for it.

Each version of this library carries the same version number as the unstructured release it is based on.

# Install the variant of unstructured with everything you need support for
pip install "unstructured[all-docs]"

# Install the unstructured_expanded library on top of it
pip install unstructured_expanded
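
Since versions of this library track the unstructured version they are based on, it can be useful to confirm after installation that the two packages line up. The following is a minimal sketch, not part of either library: the helper names installed_version and versions_aligned are hypothetical, and it assumes that "equivalent" means the two version strings are identical.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def versions_aligned(base: Optional[str], wrapper: Optional[str]) -> bool:
    """True only when both versions are known and the strings are identical.

    Assumption: matching releases share the exact same version string.
    """
    return base is not None and base == wrapper


if __name__ == "__main__":
    base = installed_version("unstructured")
    wrapper = installed_version("unstructured_expanded")
    print(f"unstructured: {base}")
    print(f"unstructured_expanded: {wrapper}")
    print(f"aligned: {versions_aligned(base, wrapper)}")
```

If the check reports a mismatch, pinning both packages to the same version number in the pip commands above is the likely remedy.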

License

See the licensing information in the LICENSE file.

Citation

If you use this library in your research, please include a citation:

@misc{unstructured_expanded,
  title={Unstructured_expanded: A Python Library for Extracting Text and Images from Documents using the unstructured API.},
  author={Kogan, Isaac},
  year={2024},
  url={https://github.com/isaackogan/unstructured_expanded}
}

Download files

Source Distribution

unstructured_expanded-0.17.2.tar.gz (7.6 kB, source)

Built Distribution

unstructured_expanded-0.17.2-py3-none-any.whl (9.4 kB, Python 3)

File details

Details for the file unstructured_expanded-0.17.2.tar.gz.

File metadata

  • Download URL: unstructured_expanded-0.17.2.tar.gz
  • Upload date:
  • Size: 7.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.8

File hashes

Hashes for unstructured_expanded-0.17.2.tar.gz
Algorithm    Hash digest
SHA256       0ee9f106d1672685261931e88984630eea00b8da4dae333fb0f3db150526f6ab
MD5          b5070150d0e58cab9e71b1bf9ed407f2
BLAKE2b-256  e556e3d4e8b3d3b3df7852f8a2a888ebd5b2d5f3241d0da8dbb63147a3045175
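
A downloaded file can be checked against the SHA256 digest listed above before it is installed. The sketch below uses only the Python standard library. The helper names sha256_of and digest_matches are hypothetical, and the script assumes the source distribution was saved to the current directory under its original file name.

```python
import hashlib
from pathlib import Path

# SHA256 digest of unstructured_expanded-0.17.2.tar.gz, copied from the listing above.
EXPECTED_SHA256 = "0ee9f106d1672685261931e88984630eea00b8da4dae333fb0f3db150526f6ab"


def sha256_of(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def digest_matches(path, expected_hex):
    """Compare the file's digest with the expected one, ignoring letter case."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Assumed download location; adjust to wherever the file was saved.
    archive = Path("unstructured_expanded-0.17.2.tar.gz")
    if archive.exists():
        print("digest matches:", digest_matches(archive, EXPECTED_SHA256))
    else:
        print(f"{archive} not found; download it first")
```

The same helpers apply to the wheel by substituting its file name and its SHA256 digest.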

File details

Details for the file unstructured_expanded-0.17.2-py3-none-any.whl.

File hashes

Hashes for unstructured_expanded-0.17.2-py3-none-any.whl
Algorithm    Hash digest
SHA256       f5a8fc33458fc39829d83b805fde61a77a17ded3097c56249b04df88431658e9
MD5          b5679dcaaa5174e0fe216b1972f41c10
BLAKE2b-256  01b41821c5d17338a50de2b09db990dac287bf10f3ffae7ce27d22f47f704293
