A basic document parsing and loading utility.
Project description
This release is a placeholder until the project is ready, which is expected in the near future.
The docp project is a CPython library for extracting text from binary documents (e.g. PDF, DOCX) into Python objects. These objects can be used across various applications, ranging from simple plain-text extraction to loading the text into a Chroma database for LLM use.
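Since the library's own interface is not yet documented, the following is only a minimal sketch of the general idea of extracting text from a binary document into a Python object. It does not use or represent the docp API: the `ExtractedDocument` class and the `extract_docx` function are hypothetical names chosen for illustration, and the sketch relies solely on the standard library and on the fact that a DOCX file is a ZIP archive whose body text lives in `word/document.xml`.

```python
import zipfile
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# WordprocessingML namespace used by the elements in word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


@dataclass
class ExtractedDocument:
    """Hypothetical container for extracted text (not a docp class)."""
    source: str
    paragraphs: list

    @property
    def text(self) -> str:
        # Join the paragraphs into a single plain-text string.
        return "\n".join(self.paragraphs)


def extract_docx(source, name="<memory>"):
    """Read paragraph text from a DOCX path or binary file-like object."""
    with zipfile.ZipFile(source) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is split across one or more <w:t> runs.
        text = "".join(t.text or "" for t in para.iter(f"{{{W_NS}}}t"))
        if text:
            paragraphs.append(text)
    return ExtractedDocument(source=name, paragraphs=paragraphs)
```

An object like this could then be passed on to whatever consumes the text, for example a plain-text writer or a vector database loader.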
Installation
Coming soon ...
Toolset
Coming soon ...
Using the Library
Coming soon ...
Additional Information
Coming soon ...
Download files
File details
Details for the file docp-0.1.0b1.tar.gz.
File metadata
- Download URL: docp-0.1.0b1.tar.gz
- Upload date:
- Size: 44.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.12.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7d4f55a43e76b4cb6c2d2fb1f1a29276ea464f89455b601ede8e0d5d30178618 |
| MD5 | 35071b7ca8b4b3ede57a64bbfb93bdf4 |
| BLAKE2b-256 | 3d23cdfaaf28ceb6e32302fb8809893edbcb250b9ef04a70ae7f99ffa2d58536 |
File details
Details for the file docp-0.1.0b1-py3-none-any.whl.
File metadata
- Download URL: docp-0.1.0b1-py3-none-any.whl
- Upload date:
- Size: 37.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.12.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3bb9233ffbb13c945513ff14ea26b12d76d385c3099d440bae91d3f0cecd57fb |
| MD5 | 39b87b25b5a859e418d71feaead5ce0d |
| BLAKE2b-256 | 6a349cd863b02da2eea437989c4cadd279f3798a323ba452df65071d67741828 |
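The digests listed above can be used to check a downloaded file before installing it. The following is a minimal sketch using only the standard library; the helper names `sha256_of` and `verify_downloads` are illustrative, and the sketch assumes the files were saved under their published names in a single directory. The SHA256 values are copied from the file hash listings above.

```python
import hashlib
from pathlib import Path

# SHA256 digests as published for the two release files.
EXPECTED_SHA256 = {
    "docp-0.1.0b1.tar.gz":
        "7d4f55a43e76b4cb6c2d2fb1f1a29276ea464f89455b601ede8e0d5d30178618",
    "docp-0.1.0b1-py3-none-any.whl":
        "3bb9233ffbb13c945513ff14ea26b12d76d385c3099d440bae91d3f0cecd57fb",
}


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each release file found in `directory` to True/False.

    Files that are not present are skipped rather than reported.
    """
    results = {}
    for filename, expected in EXPECTED_SHA256.items():
        path = Path(directory) / filename
        if path.is_file():
            results[filename] = sha256_of(path) == expected
    return results
```

Calling `verify_downloads("downloads")` returns a dictionary such as `{"docp-0.1.0b1.tar.gz": True}` when the file in that (hypothetical) directory matches its published digest. A `False` value means the file should not be installed.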