Skip to main content

Document conversion library — Markdown, HTML, CSV, and JSON transformations with zero heavyweight dependencies.

Project description

peasy-document

PyPI Python License: MIT

Pure Python document conversion library for Markdown, HTML, CSV, and JSON transformations. Convert between 6 document formats with 10 conversion functions, a CLI, and frozen dataclass results -- all with only one lightweight dependency (markdown). Built for developers who need fast, reliable document format conversions without heavyweight office-suite libraries.

Part of the Peasy Tools developer tools ecosystem.

Table of Contents

Install

# Core library (only markdown dependency)
pip install peasy-document

# With CLI support
pip install "peasy-document[cli]"

# Everything
pip install "peasy-document[all]"

Quick Start

from peasy_document import markdown_to_html, csv_to_json, html_to_text

# Convert Markdown to HTML with tables, code highlighting, and TOC support
result = markdown_to_html("# Hello World\n\nThis is **bold** text.")
print(result.content)
# <h1>Hello World</h1>
# <p>This is <strong>bold</strong> text.</p>

# Convert CSV data to JSON array of objects
result = csv_to_json("name,age\nAlice,30\nBob,25")
print(result.content)
# [{"name": "Alice", "age": "30"}, {"name": "Bob", "age": "25"}]

# Strip HTML to plain text
result = html_to_text("<h1>Title</h1><p>Hello &amp; welcome.</p>")
print(result.content)
# Title
# Hello & welcome.

All functions return frozen dataclasses with conversion metadata:

result = markdown_to_html("# Hello")
print(result.source_format)  # "markdown"
print(result.target_format)  # "html"
print(result.source_size)    # 7 (bytes)
print(result.target_size)    # 18 (bytes)

What You Can Do

Markdown Conversion

Convert Markdown to HTML using the battle-tested markdown library with sensible defaults. Supports tables, fenced code blocks, syntax highlighting, and table of contents generation out of the box.

from peasy_document import markdown_to_html

# Default extensions: tables, fenced_code, codehilite, toc
result = markdown_to_html("""
# API Documentation

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET    | /users   | List users  |
| POST   | /users   | Create user |

```python
import requests
response = requests.get("/users")
\```
""")

# Custom extensions
result = markdown_to_html("content", extensions=["tables", "toc"])

Accepts strings, bytes, or file paths:

from pathlib import Path

# Read from file
result = markdown_to_html(Path("README.md"))

# Process raw bytes
result = markdown_to_html(b"# Binary input works too")

HTML Processing

Extract plain text from HTML documents, stripping all tags and decoding HTML entities. Uses Python's stdlib html.parser -- no external dependencies needed.

from peasy_document import html_to_text, html_to_markdown

# Strip HTML to plain text
result = html_to_text("""
<html>
<head><title>Page</title></head>
<body>
  <h1>Welcome</h1>
  <p>This is a <strong>formatted</strong> document with &amp; entities.</p>
  <script>alert('ignored')</script>
</body>
</html>
""")
print(result.content)
# Welcome
# This is a formatted document with & entities.

# Convert HTML to Markdown (handles p, h1-h6, a, strong, em, lists, code, pre, img)
result = html_to_markdown("""
<h1>Document Title</h1>
<p>Visit <a href="https://example.com">our site</a> for <strong>more info</strong>.</p>
<ul>
  <li>First item</li>
  <li>Second item</li>
</ul>
""")
print(result.content)
# # Document Title
# Visit [our site](https://example.com) for **more info**.
# - First item
# - Second item

Convert plain text to HTML paragraphs:

from peasy_document import text_to_html

# Wraps paragraphs in <p> tags, single newlines become <br>
result = text_to_html("First paragraph.\n\nSecond paragraph.\nWith a line break.")
print(result.content)
# <p>First paragraph.</p>
# <p>Second paragraph.<br>With a line break.</p>

CSV and JSON Conversion

Transform between CSV and JSON formats using Python's stdlib csv and json modules. Supports custom delimiters, roundtrip conversion, and handles inconsistent keys gracefully.

from peasy_document import csv_to_json, json_to_csv

# CSV to JSON array of objects
result = csv_to_json("name,role,team\nAlice,Engineer,Backend\nBob,Designer,Frontend")
print(result.content)
# [
#   {"name": "Alice", "role": "Engineer", "team": "Backend"},
#   {"name": "Bob", "role": "Designer", "team": "Frontend"}
# ]

# JSON back to CSV
result = json_to_csv(result.content)
print(result.content)
# name,role,team
# Alice,Engineer,Backend
# Bob,Designer,Frontend

# Tab-separated values
result = csv_to_json("name\tage\nAlice\t30", delimiter="\t")

Convert JSON to YAML-like format without any PyYAML dependency:

from peasy_document import json_to_yaml

result = json_to_yaml('{"server": {"host": "localhost", "port": 8080}, "debug": true}')
print(result.content)
# server:
#   host: localhost
#   port: 8080
# debug: true

Table Formatting

Parse CSV into structured table data, or render it directly as Markdown or HTML tables.

from peasy_document import csv_to_table, csv_to_markdown, csv_to_html

# Parse into structured TableData
table = csv_to_table("Name,Age,City\nAlice,30,NYC\nBob,25,LA")
print(table.headers)       # ['Name', 'Age', 'City']
print(table.row_count)     # 2
print(table.column_count)  # 3
print(table.rows[0])       # ['Alice', '30', 'NYC']

# Render as Markdown table with proper alignment
result = csv_to_markdown("Name,Age,City\nAlice,30,NYC\nBob,25,LA")
print(result.content)
# | Name  | Age | City |
# | ----- | --- | ---- |
# | Alice | 30  | NYC  |
# | Bob   | 25  | LA   |

# Render as HTML table with proper thead/tbody structure
result = csv_to_html("Name,Age\nAlice,30")
print(result.content)
# <table>
#   <thead>
#     <tr>
#       <th>Name</th>
#       <th>Age</th>
#     </tr>
#   </thead>
#   <tbody>
#     <tr>
#       <td>Alice</td>
#       <td>30</td>
#     </tr>
#   </tbody>
# </table>

Command-Line Interface

Install with CLI support: pip install "peasy-document[cli]"

# Convert Markdown to HTML
peasy-document md-to-html README.md -o output.html

# Strip HTML to plain text
peasy-document html-to-text page.html

# Convert CSV to JSON
peasy-document csv-to-json data.csv -o data.json

# Convert JSON array to CSV
peasy-document json-to-csv records.json -o records.csv

# CSV to Markdown table
peasy-document csv-to-markdown data.csv

# HTML to Markdown
peasy-document html-to-markdown page.html -o page.md

All commands write to stdout by default. Use -o / --output to write to a file.

API Reference

Conversion Functions

Function Input Output Dependencies
markdown_to_html(source, *, extensions=None) Markdown HTML markdown library
html_to_text(source) HTML Plain text stdlib only
html_to_markdown(source) HTML Markdown stdlib only
text_to_html(source) Plain text HTML stdlib only
csv_to_json(source, *, delimiter=",") CSV JSON stdlib only
json_to_csv(source) JSON CSV stdlib only
csv_to_table(source, *, delimiter=",") CSV TableData stdlib only
csv_to_markdown(source, *, delimiter=",") CSV Markdown table stdlib only
csv_to_html(source, *, delimiter=",") CSV HTML table stdlib only
json_to_yaml(source) JSON YAML stdlib only

All functions accept TextInput (str, bytes, or Path) and return ConversionResult or TableData.

Types

Type Fields
ConversionResult content, source_format, target_format, source_size, target_size
TableData headers, rows, row_count, column_count

Peasy Developer Tools

Package PyPI Description
peasy-document PyPI Document conversion -- Markdown, HTML, CSV, JSON
peasy-pdf PyPI PDF manipulation and conversion
peasy-image PyPI Image format conversion and optimization
peasytext PyPI Text analysis and transformation
peasy-css PyPI CSS minification and processing
peasy-compress PyPI File compression utilities
peasy-convert PyPI Unified CLI for all Peasy tools

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

peasy_document-0.1.0.tar.gz (35.0 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

peasy_document-0.1.0-py3-none-any.whl (10.9 kB view details)

Uploaded Python 3

File details

Details for the file peasy_document-0.1.0.tar.gz.

File metadata

  • Download URL: peasy_document-0.1.0.tar.gz
  • Upload date:
  • Size: 35.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.9 {"installer":{"name":"uv","version":"0.10.9","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for peasy_document-0.1.0.tar.gz
Algorithm Hash digest
SHA256 dbf320e621d9e53d75a643a6149948f29c00804b3251be4d4517a6a3f37b4985
MD5 06d54e187a9e5722ce9a143572355ffe
BLAKE2b-256 f8dc4bfe50a9e4454b5529f298ae21c99a0e7b9831e6e927ba6d9d15c4719f58

See more details on using hashes here.

File details

Details for the file peasy_document-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: peasy_document-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 10.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.9 {"installer":{"name":"uv","version":"0.10.9","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for peasy_document-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 5a56223d5f45fdd274dbe4ccc6c024277e47de8959838a45f63d2792c5125d54
MD5 24ce0e8203a26375ee005d9bf218d0ec
BLAKE2b-256 43fc3b90301641482205af6d8336ac8ed7285fd2c16860c272f5a4184cd76fff

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page