An expansion of the unstructured package that adds support for image extraction.

Unstructured Expanded

The unstructured_expanded library is a wrapper around the unstructured open source library to add image-extraction capabilities to the API.

Its only purpose is to provide a more complete API for the unstructured library, since the maintainers of the open source project have chosen to lock image extraction for office documents behind a paywall.

Quick-Start

This library is meant to be used in conjunction with the unstructured library.

Each version of this library carries the same version number as the unstructured release it is based on.

# Install the variant of unstructured with everything you need support for
pip install "unstructured[all-docs]"

# Install the unstructured_expanded library on top of it
pip install unstructured_expanded
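
Since the two packages are meant to share a version number, it can be useful to confirm after installation that the installed versions actually line up. The following is a minimal sketch using only the standard library's importlib.metadata. The helper names installed_version and versions_match are hypothetical and are not part of either library, and the sketch assumes the distributions are registered under the names used in the pip commands above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name, get_version=version):
    """Return the installed version string of a distribution, or None if it is missing."""
    try:
        return get_version(dist_name)
    except PackageNotFoundError:
        return None


def versions_match(base="unstructured", wrapper="unstructured_expanded", get_version=version):
    """Return True only when both distributions are installed with identical version strings."""
    # Hypothetical helper: the distribution names are assumed from the pip commands.
    base_version = installed_version(base, get_version)
    wrapper_version = installed_version(wrapper, get_version)
    return base_version is not None and base_version == wrapper_version


if __name__ == "__main__":
    print("unstructured:", installed_version("unstructured"))
    print("unstructured_expanded:", installed_version("unstructured_expanded"))
    print("versions match:", versions_match())
```

If the check reports a mismatch, pinning both packages to the same version in the pip install commands is one way to bring them back in line.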

License

See the licensing information in the LICENSE file.

Citation

If you use this library in your research, please include a citation:

@misc{unstructured_expanded,
  title={Unstructured_expanded: A Python Library for Extracting Text and Images from Documents using the unstructured API.},
  author={Kogan, Isaac},
  year={2024},
  url={https://github.com/isaackogan/unstructured_expanded}
}

Download files

Source Distribution

unstructured_expanded-0.16.11.tar.gz (7.4 kB)

Built Distribution

unstructured_expanded-0.16.11-py3-none-any.whl (9.3 kB)
