E2M: Everything to Markdown

E2M converts documents, web pages, and audio files into Markdown.

Supported File Types

  • doc
  • docx
  • epub
  • html
  • htm
  • url
  • pdf
  • pptx
  • mp3
  • m4a
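
As a rough illustration of how the list above might be used, the sketch below checks whether an input looks like a supported type before it is handed to E2M. The helper name `detect_input_type` and the rule that anything starting with `http://` or `https://` counts as `url` are assumptions made for illustration; neither is part of the E2M API.

```python
from pathlib import Path
from typing import Optional

# Extensions copied from the supported file types list above.
SUPPORTED_EXTENSIONS = {
    "doc", "docx", "epub", "html", "htm", "pdf", "pptx", "mp3", "m4a",
}


def detect_input_type(source: str) -> Optional[str]:
    """Return the supported type for `source`, or None if unsupported.

    Hypothetical helper: not part of wisup_e2m.
    """
    # Assumption: web addresses correspond to the "url" entry in the list.
    if source.lower().startswith(("http://", "https://")):
        return "url"
    # Compare the file extension case-insensitively, without the leading dot.
    ext = Path(source).suffix.lower().lstrip(".")
    return ext if ext in SUPPORTED_EXTENSIONS else None
```

A caller could skip or report any input for which this returns `None` instead of passing it to the parser.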

Installation

To install E2M, use pip:

pip install wisup_e2m

Usage

Here's a simple example demonstrating how to use E2M:

from wisup_e2m import E2MParser

# Initialize the parser with your configuration file
ep = E2MParser.from_config("config.yaml")

# Parse the desired file
data = ep.parse(file_name="/path/to/file.pdf")

# Print the parsed data as a dictionary
print(data.to_dict())
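
To convert several files in one run, the pattern above can be wrapped in a small loop. The following is a minimal sketch: `convert_many` is a hypothetical helper, and it takes the parse function as an argument so that it relies only on the `parse(file_name=...)` and `to_dict()` calls shown above. Catching a broad `Exception` is an assumption, made because the errors E2M may raise are not documented here.

```python
from typing import Any, Callable, Dict, Iterable, Tuple


def convert_many(
    parse: Callable[..., Any], file_names: Iterable[str]
) -> Tuple[Dict[str, dict], Dict[str, str]]:
    """Parse each file and collect results and failures separately.

    Hypothetical helper: `parse` is expected to behave like `ep.parse`
    from the example above, returning an object with `to_dict()`.
    """
    results: Dict[str, dict] = {}
    errors: Dict[str, str] = {}
    for name in file_names:
        try:
            results[name] = parse(file_name=name).to_dict()
        except Exception as exc:  # assumption: error types are not documented
            # Record the failure and continue with the remaining files.
            errors[name] = str(exc)
    return results, errors
```

With a parser created as in the example above, this could be called as `results, errors = convert_many(ep.parse, ["/path/to/a.pdf", "/path/to/b.docx"])`, so one failing file does not stop the rest of the batch.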

Config Template

parsers:
  doc_parser:
    engine: "unstructured"
    langs: ["en", "zh"]
  docx_parser:
    engine: "unstructured"
    langs: ["en", "zh"]
  epub_parser:
    engine: "unstructured"
    langs: ["en", "zh"]
  html_parser:
    engine: "unstructured"
    langs: ["en", "zh"]
  url_parser:
    engine: "jina"
    langs: ["en", "zh"]
  pdf_parser:
    engine: "marker"
    langs: ["en", "zh"]
  pptx_parser:
    engine: "unstructured"
    langs: ["en", "zh"]
  voice_parser:
    # option 1: use openai whisper api
    # engine: "openai_whisper_api"
    # api_base: "https://api.openai.com/v1"
    # api_key: "your_api_key"
    # model: "whisper"

    # option 2: use local whisper model
    engine: "openai_whisper_local"
    model: "large" # available models: https://github.com/openai/whisper#available-models-and-languages

converter:
  text_converter:
    engine: "litellm"
    model: "deepseek/deepseek-chat"
    api_key: "your_api_key"
    # base_url: ""
  image_converter:
    engine: "litellm"
    model: "gpt-4o-mini"
    api_key: "your_api_key"
    # base_url: ""
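
The template contains placeholder values such as "your_api_key" that need to be replaced before use. The sketch below walks a config that has already been loaded into a Python dictionary (for example with a YAML library) and reports where placeholders remain. The helper name `find_placeholders` is hypothetical and not part of E2M.

```python
from typing import Any, List

PLACEHOLDER = "your_api_key"  # placeholder string used in the template above


def find_placeholders(config: Any, path: str = "") -> List[str]:
    """Return dotted paths of values still set to the template placeholder."""
    found: List[str] = []
    if isinstance(config, dict):
        for key, value in config.items():
            child = f"{path}.{key}" if path else str(key)
            found.extend(find_placeholders(value, child))
    elif isinstance(config, list):
        for index, value in enumerate(config):
            found.extend(find_placeholders(value, f"{path}[{index}]"))
    elif config == PLACEHOLDER:
        found.append(path)
    return found


# A fragment mirroring the converter section of the template,
# with one key replaced and one left as the placeholder.
example = {
    "converter": {
        "text_converter": {"engine": "litellm", "api_key": "your_api_key"},
        "image_converter": {"engine": "litellm", "api_key": "replaced-value"},
    }
}
print(find_placeholders(example))
```

Running a check like this before calling `E2MParser.from_config` can surface an unfilled key early, rather than as an authentication error from the model provider.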

License

This project is licensed under the MIT License. See the LICENSE file for details.

Contact

For any questions or inquiries, please open an issue on GitHub or contact us at team@wisup.ai.
