An extension to the unstructured package that adds support for image extraction.

Unstructured Expanded

The unstructured_expanded library is a wrapper around the unstructured open source library to add image-extraction capabilities to the API.

Its sole purpose is to provide a more complete API for the unstructured library, because the maintainers of the open source project have chosen to put image extraction for office documents behind a paywall.

Quick-Start

This library is meant to be used in conjunction with the unstructured library.

Version numbers of this library match the version of the unstructured library they are based on.

# Install the variant of unstructured with everything you need support for
pip install "unstructured[all-docs]"

# Install the unstructured_expanded library on top of it
pip install unstructured_expanded
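
Because versions of the two libraries are meant to line up, it can be useful to check the installed pair after installing. The following is a minimal sketch, not part of either library: the helper names `base_release` and `versions_aligned` are hypothetical, and it assumes that a post-release suffix such as `.post1` belongs to this wrapper alone and can be ignored when comparing.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def base_release(version_string: str) -> str:
    """Return the leading numeric release, e.g. '0.16.11.post1' -> '0.16.11'."""
    match = re.match(r"\d+(?:\.\d+)*", version_string)
    if match is None:
        raise ValueError(f"unrecognised version string: {version_string!r}")
    return match.group(0)


def versions_aligned(
    base: str = "unstructured",
    wrapper: str = "unstructured_expanded",
) -> bool:
    """True when both distributions are installed and share a base release.

    Assumption: suffixes such as '.post1' are ignored for the comparison.
    """
    try:
        return base_release(version(base)) == base_release(version(wrapper))
    except PackageNotFoundError:
        # One of the two distributions is not installed.
        return False


if __name__ == "__main__":
    print("versions aligned:", versions_aligned())
```

If the check reports a mismatch, reinstalling both packages at matching versions is the likely fix.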

License

See the licensing information in the LICENSE file.

Citation

If you use this library in your research, please include a citation:

@misc{unstructured_expanded,
  title={Unstructured_expanded: A Python Library for Extracting Text and Images from Documents using the unstructured API.},
  author={Kogan, Isaac},
  year={2024},
  url={https://github.com/isaackogan/unstructured_expanded}
}
