Everything to Markdown.
E2M: Everything to Markdown
E2M is a versatile tool that converts a wide range of file types into Markdown format.
Supported File Types
- doc
- docx
- epub
- html
- htm
- url
- pdf
- pptx
- mp3
- m4a
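To illustrate how the file types above relate to the parser sections in the config template, here is a minimal sketch that maps a file name or URL to a parser key. The function `guess_parser_key` and the mapping table are hypothetical and not part of E2M. Routing `htm` to `html_parser` and `mp3`/`m4a` to `voice_parser` is an assumption based on the parser names in the config template, and `pdf` is included because the usage example and the config template both reference it.

```python
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlparse

# Hypothetical mapping from file extension to config section name.
# The section names come from the config template; the pairing is assumed.
EXTENSION_TO_PARSER = {
    "doc": "doc_parser",
    "docx": "docx_parser",
    "epub": "epub_parser",
    "html": "html_parser",
    "htm": "html_parser",
    "pdf": "pdf_parser",
    "pptx": "pptx_parser",
    "mp3": "voice_parser",
    "m4a": "voice_parser",
}


def guess_parser_key(source: str) -> Optional[str]:
    """Return the config section that would plausibly handle `source`, or None."""
    # Treat anything with an http(s) scheme as a URL input.
    if urlparse(source).scheme in ("http", "https"):
        return "url_parser"
    # Otherwise fall back to a case-insensitive extension lookup.
    suffix = PurePosixPath(source).suffix.lower().lstrip(".")
    return EXTENSION_TO_PARSER.get(suffix)
```

A check like this can be run before calling the parser to skip files that have no matching section in the config.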
Installation
To install E2M, use pip:
pip install wisup_e2m
Usage
Here's a simple example demonstrating how to use E2M:
from wisup_e2m import E2MParser
# Initialize the parser with your configuration file
ep = E2MParser.from_config("config.yaml")
# Parse the desired file
data = ep.parse(file_name="/path/to/file.pdf")
# Print the parsed data as a dictionary
print(data.to_dict())
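When several files need converting, the same two calls from the example above (`parse(file_name=...)` and `to_dict()`) can be wrapped in a loop. The helper `parse_many` below is a hypothetical sketch, not part of E2M. It accepts any object exposing those two calls, and because the exceptions E2M raises are not documented here, it catches `Exception` broadly and records the message.

```python
from typing import Any, Dict, Iterable


def parse_many(parser: Any, paths: Iterable[str]) -> Dict[str, Any]:
    """Parse each path with `parser` and collect the results keyed by path."""
    results: Dict[str, Any] = {}
    for path in paths:
        try:
            # Same calls as in the usage example above.
            data = parser.parse(file_name=path)
            results[path] = data.to_dict()
        except Exception as exc:  # assumption: failures surface as exceptions
            # Record the failure and keep going with the remaining files.
            results[path] = {"error": str(exc)}
    return results
```

With E2M installed, this could be called as `parse_many(E2MParser.from_config("config.yaml"), ["/path/to/a.pdf", "/path/to/b.docx"])`.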
Config Template
parsers:
  doc_parser:
    engine: "unstructured"
    langs: ["en", "zh"]
  docx_parser:
    engine: "unstructured"
    langs: ["en", "zh"]
  epub_parser:
    engine: "unstructured"
    langs: ["en", "zh"]
  html_parser:
    engine: "unstructured"
    langs: ["en", "zh"]
  url_parser:
    engine: "jina"
    langs: ["en", "zh"]
  pdf_parser:
    engine: "marker"
    langs: ["en", "zh"]
  pptx_parser:
    engine: "unstructured"
    langs: ["en", "zh"]
  voice_parser:
    # option 1: use openai whisper api
    # engine: "openai_whisper_api"
    # api_base: "https://api.openai.com/v1"
    # api_key: "your_api_key"
    # model: "whisper"
    # option 2: use local whisper model
    engine: "openai_whisper_local"
    model: "large" # available models: https://github.com/openai/whisper#available-models-and-languages
converter:
  text_converter:
    engine: "litellm"
    model: "deepseek/deepseek-chat"
    api_key: "your_api_key"
    # base_url: ""
  image_converter:
    engine: "litellm"
    model: "gpt-4o-mini"
    api_key: "your_api_key"
    # base_url: ""
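Since the template contains `api_key` fields, one option is to fill them in from environment variables instead of committing real keys to the file. The sketch below renders only the converter section using the standard library. The environment variable names `E2M_TEXT_API_KEY` and `E2M_IMAGE_API_KEY` and the function `render_converter_config` are hypothetical, and the nesting shown is an assumption about how the template is indented.

```python
import os
from string import Template
from typing import Mapping

# Converter section of the config template, with placeholders for the keys.
# The two-space nesting is assumed; adjust it to match your config file.
CONVERTER_TEMPLATE = Template("""\
converter:
  text_converter:
    engine: "litellm"
    model: "deepseek/deepseek-chat"
    api_key: "$text_api_key"
  image_converter:
    engine: "litellm"
    model: "gpt-4o-mini"
    api_key: "$image_api_key"
""")


def render_converter_config(env: Mapping[str, str] = os.environ) -> str:
    """Fill the API keys from `env`; raises KeyError if a variable is missing."""
    return CONVERTER_TEMPLATE.substitute(
        text_api_key=env["E2M_TEXT_API_KEY"],      # hypothetical variable name
        image_api_key=env["E2M_IMAGE_API_KEY"],    # hypothetical variable name
    )
```

The returned text can be appended to the parsers section and written to `config.yaml` before calling `E2MParser.from_config("config.yaml")`.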
License
This project is licensed under the MIT License. See the LICENSE file for details.
Contact
For any questions or inquiries, please open an issue on GitHub or contact us at team@wisup.ai.