Everything to Markdown.

E2M: Everything to Markdown

E2M converts a wide range of file types, including office documents, e-books, web pages, PDFs, and audio recordings, into Markdown.

Supported File Types

  • doc
  • docx
  • epub
  • html
  • htm
  • url
  • pdf
  • pptx
  • mp3
  • m4a
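
As a rough illustration, the list above can be turned into a pre-flight check before inputs are handed to E2M. The helper name `is_supported` and the rule that any `http://` or `https://` string counts as the `url` type are assumptions made for this sketch; neither is part of the E2M API.

```python
from pathlib import PurePath

# File extensions taken from the "Supported File Types" list above.
SUPPORTED_EXTENSIONS = {"doc", "docx", "epub", "html", "htm", "pdf", "pptx", "mp3", "m4a"}


def is_supported(source: str) -> bool:
    """Return True if `source` looks like one of the listed input types.

    Assumption: any http(s) address is treated as the "url" type.
    """
    if source.lower().startswith(("http://", "https://")):
        return True
    # PurePath.suffix includes the leading dot, e.g. ".pdf", so strip it.
    extension = PurePath(source).suffix.lower().lstrip(".")
    return extension in SUPPORTED_EXTENSIONS
```

For example, `is_supported("/path/to/file.pdf")` is true, while a path without a listed extension, such as `notes.txt`, is rejected.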

Installation

To install E2M, use pip:

pip install wisup_e2m
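
To confirm that the install succeeded, one option is to query the package metadata from Python. This is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical, and it assumes the distribution name matches the `pip install` name above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str = "wisup_e2m") -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None
```

Calling `installed_version()` returns a version string when the package is present and `None` otherwise.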

Usage

Here's a simple example demonstrating how to use E2M:

from wisup_e2m import E2MParser

# Initialize the parser with your configuration file
ep = E2MParser.from_config("config.yaml")

# Parse the desired file
data = ep.parse(file_name="/path/to/file.pdf")

# Print the parsed data as a dictionary
print(data.to_dict())
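
When several files need converting, the same two calls can be wrapped in a loop. The sketch below is not part of E2M: `parse_many` is a hypothetical helper that relies only on the `parse(file_name=...)` and `to_dict()` calls shown above, and it assumes a failed parse raises an exception.

```python
from typing import Any, Dict, Iterable, Tuple


def parse_many(parser: Any, file_names: Iterable[str]) -> Tuple[Dict[str, dict], Dict[str, str]]:
    """Parse each file with `parser`, collecting results and failures separately.

    `parser` is expected to behave like the `ep` object above: it exposes
    `parse(file_name=...)`, and the returned object exposes `to_dict()`.
    """
    results: Dict[str, dict] = {}
    errors: Dict[str, str] = {}
    for file_name in file_names:
        try:
            results[file_name] = parser.parse(file_name=file_name).to_dict()
        except Exception as exc:  # keep going; one bad file should not stop the batch
            errors[file_name] = f"{type(exc).__name__}: {exc}"
    return results, errors
```

With the parser from the example above, this could be called as `results, errors = parse_many(ep, ["/path/to/a.pdf", "/path/to/b.docx"])`.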

Config Template

parsers:
  doc_parser:
    engine: "unstructured"
    langs: ["en", "zh"]
  docx_parser:
    engine: "unstructured"
    langs: ["en", "zh"]
  epub_parser:
    engine: "unstructured"
    langs: ["en", "zh"]
  html_parser:
    engine: "unstructured"
    langs: ["en", "zh"]
  url_parser:
    engine: "jina"
    langs: ["en", "zh"]
  pdf_parser:
    engine: "marker"
    langs: ["en", "zh"]
  pptx_parser:
    engine: "unstructured"
    langs: ["en", "zh"]
  voice_parser:
    # option 1: use openai whisper api
    # engine: "openai_whisper_api"
    # api_base: "https://api.openai.com/v1"
    # api_key: "your_api_key"
    # model: "whisper"

    # option 2: use local whisper model
    engine: "openai_whisper_local"
    model: "large" # available models: https://github.com/openai/whisper#available-models-and-languages

converter:
  text_converter:
    engine: "litellm"
    model: "deepseek/deepseek-chat"
    api_key: "your_api_key"
    # base_url: ""
  image_converter:
    engine: "litellm"
    model: "gpt-4o-mini"
    api_key: "your_api_key"
    # base_url: ""
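
The template ships with `"your_api_key"` placeholders, and it is easy to leave one in place by accident. The following sketch walks a configuration that has already been loaded into a Python dictionary (for instance with a YAML library, which is an assumption here) and reports every setting that still holds the placeholder. The helper name `find_placeholders` is hypothetical and not part of E2M.

```python
from typing import Any, List

PLACEHOLDER = "your_api_key"  # the placeholder string used in the template above


def find_placeholders(config: Any, prefix: str = "") -> List[str]:
    """Return dotted paths of settings still set to the template placeholder."""
    found: List[str] = []
    if isinstance(config, dict):
        for key, value in config.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            found.extend(find_placeholders(value, path))
    elif isinstance(config, list):
        for index, value in enumerate(config):
            found.extend(find_placeholders(value, f"{prefix}[{index}]"))
    elif config == PLACEHOLDER:
        found.append(prefix)
    return found
```

An empty result means no placeholder keys remain; otherwise each entry, such as `converter.text_converter.api_key`, names a setting to fill in.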

License

This project is licensed under the MIT License. See the LICENSE file for details.

Contact

For any questions or inquiries, please open an issue on GitHub or contact us at team@wisup.ai.
