Document conversion library — Markdown, HTML, CSV, and JSON transformations with zero heavyweight dependencies.
peasy-document
Pure Python document conversion library for Markdown, HTML, CSV, and JSON transformations. Convert between 6 document formats with 10 conversion functions, a CLI, and frozen dataclass results -- all with only one lightweight dependency (markdown). Built for developers who need fast, reliable document format conversions without heavyweight office-suite libraries.
Part of the Peasy Tools developer tools ecosystem.
Table of Contents
- Install
- Quick Start
- What You Can Do
- Command-Line Interface
- API Reference
- Peasy Developer Tools
- License
Install
# Core library (only markdown dependency)
pip install peasy-document
# With CLI support
pip install "peasy-document[cli]"
# Everything
pip install "peasy-document[all]"
Quick Start
from peasy_document import markdown_to_html, csv_to_json, html_to_text
# Convert Markdown to HTML with tables, code highlighting, and TOC support
result = markdown_to_html("# Hello World\n\nThis is **bold** text.")
print(result.content)
# <h1>Hello World</h1>
# <p>This is <strong>bold</strong> text.</p>
# Convert CSV data to JSON array of objects
result = csv_to_json("name,age\nAlice,30\nBob,25")
print(result.content)
# [{"name": "Alice", "age": "30"}, {"name": "Bob", "age": "25"}]
# Strip HTML to plain text
result = html_to_text("<h1>Title</h1><p>Hello &amp; welcome.</p>")
print(result.content)
# Title
# Hello & welcome.
All functions return frozen dataclasses with conversion metadata:
result = markdown_to_html("# Hello")
print(result.source_format) # "markdown"
print(result.target_format) # "html"
print(result.source_size) # 7 (bytes)
print(result.target_size) # 18 (bytes)
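The frozen-dataclass pattern behind these results can be sketched with the standard library alone. The class below is a hypothetical stand-in (`SketchResult` and `make_result` are not part of peasy-document), and it assumes sizes are counted as UTF-8 encoded bytes, which this README does not state explicitly.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class SketchResult:
    """Hypothetical stand-in for a conversion result; not the library's class."""
    content: str
    source_format: str
    target_format: str
    source_size: int  # input size in bytes (UTF-8 assumed)
    target_size: int  # output size in bytes (UTF-8 assumed)


def make_result(source: str, content: str, source_format: str, target_format: str) -> SketchResult:
    # Sizes are measured on the encoded bytes, not on the character count.
    return SketchResult(
        content=content,
        source_format=source_format,
        target_format=target_format,
        source_size=len(source.encode("utf-8")),
        target_size=len(content.encode("utf-8")),
    )


result = make_result("# Héllo", "<h1>Héllo</h1>", "markdown", "html")
print(result.source_size)  # 8: seven characters, but "é" takes two bytes

try:
    result.content = "changed"  # frozen dataclasses reject attribute assignment
except FrozenInstanceError:
    print("result is immutable")
```

Because the instance is frozen, a result can be passed around or cached without any risk of a caller mutating it.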
What You Can Do
Markdown Conversion
Convert Markdown to HTML using the battle-tested markdown library with sensible defaults. Supports tables, fenced code blocks, syntax highlighting, and table of contents generation out of the box.
from peasy_document import markdown_to_html
# Default extensions: tables, fenced_code, codehilite, toc
result = markdown_to_html("""
# API Documentation
| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | /users | List users |
| POST | /users | Create user |
```python
import requests
response = requests.get("/users")
```
""")
# Custom extensions
result = markdown_to_html("content", extensions=["tables", "toc"])
Accepts strings, bytes, or file paths:
from pathlib import Path
# Read from file
result = markdown_to_html(Path("README.md"))
# Process raw bytes
result = markdown_to_html(b"# Binary input works too")
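Accepting three input types usually comes down to one normalization step at the top of each function. The helper below is a minimal sketch of that idea; `read_text_input` is a hypothetical name, and the UTF-8 default is an assumption rather than documented library behavior.

```python
import tempfile
from pathlib import Path
from typing import Union

# Mirrors the accepted inputs described above: str, bytes, or Path.
TextInput = Union[str, bytes, Path]


def read_text_input(source: TextInput, encoding: str = "utf-8") -> str:
    """Hypothetical helper: normalize str, bytes, or Path to a str."""
    if isinstance(source, Path):
        return source.read_text(encoding=encoding)
    if isinstance(source, bytes):
        return source.decode(encoding)
    return source


from_str = read_text_input("# Title")
from_bytes = read_text_input(b"# Binary input")

# Write a throwaway file so the Path branch can be exercised too.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "note.md"
    path.write_text("# From a file", encoding="utf-8")
    from_file = read_text_input(path)

print(from_str, from_bytes, from_file, sep="\n")
```

Note that a plain `str` is always treated as content, never as a file name, so paths have to be wrapped in `Path` explicitly.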
HTML Processing
Extract plain text from HTML documents, stripping all tags and decoding HTML entities. Uses Python's stdlib html.parser -- no external dependencies needed.
from peasy_document import html_to_text, html_to_markdown
# Strip HTML to plain text
result = html_to_text("""
<html>
<head><title>Page</title></head>
<body>
<h1>Welcome</h1>
<p>This is a <strong>formatted</strong> document with &amp; entities.</p>
<script>alert('ignored')</script>
</body>
</html>
""")
print(result.content)
# Welcome
# This is a formatted document with & entities.
# Convert HTML to Markdown (handles p, h1-h6, a, strong, em, lists, code, pre, img)
result = html_to_markdown("""
<h1>Document Title</h1>
<p>Visit <a href="https://example.com">our site</a> for <strong>more info</strong>.</p>
<ul>
<li>First item</li>
<li>Second item</li>
</ul>
""")
print(result.content)
# # Document Title
# Visit [our site](https://example.com) for **more info**.
# - First item
# - Second item
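For readers curious how tag stripping works with `html.parser`, here is a small sketch built on `HTMLParser`. It is not the library's implementation: `TextExtractor` and `extract_text` are hypothetical names, the set of block-level tags is an assumption, and only `script` and `style` bodies are skipped.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical extractor: keeps text, drops tags and script/style bodies."""

    SKIPPED = {"script", "style"}
    BLOCK = {"p", "div", "li", "br", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        # convert_charrefs=True means entities such as &amp; arrive decoded.
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag in self.BLOCK:
            self.parts.append("\n")  # block elements start on a new line

    def handle_endtag(self, tag):
        if tag in self.SKIPPED:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def extract_text(html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    lines = (" ".join(line.split()) for line in "".join(parser.parts).splitlines())
    return "\n".join(line for line in lines if line)


print(extract_text("<h1>Title</h1><p>Fish &amp; <b>chips</b></p><script>alert(1)</script>"))
# Title
# Fish & chips
```

Inline tags such as `<b>` do not break the line, so a sentence with inline formatting stays on one line in the output.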
Convert plain text to HTML paragraphs:
from peasy_document import text_to_html
# Wraps paragraphs in <p> tags, single newlines become <br>
result = text_to_html("First paragraph.\n\nSecond paragraph.\nWith a line break.")
print(result.content)
# <p>First paragraph.</p>
# <p>Second paragraph.<br>With a line break.</p>
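The paragraph rule described in the comment above (blank lines separate paragraphs, single newlines become `<br>`) can be reproduced in a few lines. This is a sketch, not the library's code: `paragraphs_to_html` is a hypothetical name, and escaping `<`, `>`, and `&` is an assumption about what a safe converter should do.

```python
import html
import re


def paragraphs_to_html(text: str) -> str:
    """Hypothetical sketch: blank lines split paragraphs, newlines become <br>."""
    blocks = re.split(r"\n\s*\n", text.strip())
    rendered = []
    for block in blocks:
        # Escape <, >, and & so they display as text instead of markup.
        escaped = html.escape(block, quote=False)
        rendered.append("<p>" + escaped.replace("\n", "<br>") + "</p>")
    return "\n".join(rendered)


print(paragraphs_to_html("First paragraph.\n\nSecond paragraph.\nWith a line break."))
# <p>First paragraph.</p>
# <p>Second paragraph.<br>With a line break.</p>
```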
CSV and JSON Conversion
Transform between CSV and JSON formats using Python's stdlib csv and json modules. Supports custom delimiters, roundtrip conversion, and handles inconsistent keys gracefully.
from peasy_document import csv_to_json, json_to_csv
# CSV to JSON array of objects
result = csv_to_json("name,role,team\nAlice,Engineer,Backend\nBob,Designer,Frontend")
print(result.content)
# [
# {"name": "Alice", "role": "Engineer", "team": "Backend"},
# {"name": "Bob", "role": "Designer", "team": "Frontend"}
# ]
# JSON back to CSV
result = json_to_csv(result.content)
print(result.content)
# name,role,team
# Alice,Engineer,Backend
# Bob,Designer,Frontend
# Tab-separated values
result = csv_to_json("name\tage\nAlice\t30", delimiter="\t")
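One way to handle records with inconsistent keys when going from JSON back to CSV is to build the header from the union of all keys and leave missing cells empty. The sketch below shows that approach with `csv` and `json` only; the function names are hypothetical, and the library's actual policy for missing keys may differ.

```python
import csv
import io
import json


def csv_text_to_json(text: str, delimiter: str = ",") -> str:
    """Hypothetical sketch: each CSV row becomes an object keyed by the header row."""
    rows = list(csv.DictReader(io.StringIO(text), delimiter=delimiter))
    return json.dumps(rows, indent=2)


def json_text_to_csv(text: str) -> str:
    """Hypothetical sketch: the header is the union of keys, in first-seen order."""
    records = json.loads(text)
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    # restval fills cells for keys a record does not have.
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


print(json_text_to_csv('[{"name": "Alice", "role": "Engineer"}, {"name": "Bob", "team": "Frontend"}]'))
# name,role,team
# Alice,Engineer,
# Bob,,Frontend
```

All CSV values come back as strings, which is why `"30"` stays quoted in the JSON output shown earlier.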
Convert JSON to YAML-like format without any PyYAML dependency:
from peasy_document import json_to_yaml
result = json_to_yaml('{"server": {"host": "localhost", "port": 8080}, "debug": true}')
print(result.content)
# server:
#   host: localhost
#   port: 8080
# debug: true
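A YAML-like emitter for parsed JSON needs only a short recursive function. The sketch below is an illustration under simplifying assumptions, not the library's code: `to_yaml_like` and `format_scalar` are hypothetical names, indentation is fixed at two spaces, and strings are written without quoting, so values containing characters such as `:` or `#` would not round-trip through a real YAML parser.

```python
import json


def format_scalar(value) -> str:
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool before numbers: True is an int in Python
        return "true" if value else "false"
    if isinstance(value, dict):  # only reached for empty containers
        return "{}"
    if isinstance(value, list):
        return "[]"
    return str(value)


def to_yaml_like(value, indent: int = 0) -> str:
    """Hypothetical sketch covering dicts, lists, and scalars."""
    pad = "  " * indent
    lines = []
    if isinstance(value, dict):
        for key, item in value.items():
            if isinstance(item, (dict, list)) and item:
                lines.append(f"{pad}{key}:")
                lines.append(to_yaml_like(item, indent + 1))
            else:
                lines.append(f"{pad}{key}: {format_scalar(item)}")
    elif isinstance(value, list):
        for item in value:
            if isinstance(item, (dict, list)) and item:
                lines.append(f"{pad}-")
                lines.append(to_yaml_like(item, indent + 1))
            else:
                lines.append(f"{pad}- {format_scalar(item)}")
    else:
        lines.append(f"{pad}{format_scalar(value)}")
    return "\n".join(lines)


data = json.loads('{"server": {"host": "localhost", "port": 8080}, "debug": true}')
print(to_yaml_like(data))
# server:
#   host: localhost
#   port: 8080
# debug: true
```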
Table Formatting
Parse CSV into structured table data, or render it directly as Markdown or HTML tables.
from peasy_document import csv_to_table, csv_to_markdown, csv_to_html
# Parse into structured TableData
table = csv_to_table("Name,Age,City\nAlice,30,NYC\nBob,25,LA")
print(table.headers) # ['Name', 'Age', 'City']
print(table.row_count) # 2
print(table.column_count) # 3
print(table.rows[0]) # ['Alice', '30', 'NYC']
# Render as Markdown table with proper alignment
result = csv_to_markdown("Name,Age,City\nAlice,30,NYC\nBob,25,LA")
print(result.content)
# | Name  | Age | City |
# | ----- | --- | ---- |
# | Alice | 30  | NYC  |
# | Bob   | 25  | LA   |
# Render as HTML table with proper thead/tbody structure
result = csv_to_html("Name,Age\nAlice,30")
print(result.content)
# <table>
# <thead>
# <tr>
# <th>Name</th>
# <th>Age</th>
# </tr>
# </thead>
# <tbody>
# <tr>
# <td>Alice</td>
# <td>30</td>
# </tr>
# </tbody>
# </table>
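Column alignment in a Markdown table is a matter of padding each cell to the widest value in its column. The sketch below shows one way to do that with the `csv` module; `csv_to_markdown_table` is a hypothetical name, and the minimum width of three dashes and the handling of short rows are assumptions, not documented library behavior.

```python
import csv
import io


def csv_to_markdown_table(text: str, delimiter: str = ",") -> str:
    """Hypothetical sketch: pad every cell to its column's widest value."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return ""
    headers, body = rows[0], rows[1:]
    # Truncate long rows and pad short ones so each row has one cell per header.
    body = [row[: len(headers)] + [""] * (len(headers) - len(row)) for row in body]
    widths = [
        max(3, len(headers[i]), *(len(row[i]) for row in body))
        for i in range(len(headers))
    ]

    def render(cells):
        return "| " + " | ".join(c.ljust(w) for c, w in zip(cells, widths)) + " |"

    lines = [render(headers), render(["-" * w for w in widths])]
    lines.extend(render(row) for row in body)
    return "\n".join(lines)


print(csv_to_markdown_table("Name,Age,City\nAlice,30,NYC\nBob,25,LA"))
# | Name  | Age | City |
# | ----- | --- | ---- |
# | Alice | 30  | NYC  |
# | Bob   | 25  | LA   |
```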
Command-Line Interface
Install with CLI support: pip install "peasy-document[cli]"
# Convert Markdown to HTML
peasy-document md-to-html README.md -o output.html
# Strip HTML to plain text
peasy-document html-to-text page.html
# Convert CSV to JSON
peasy-document csv-to-json data.csv -o data.json
# Convert JSON array to CSV
peasy-document json-to-csv records.json -o records.csv
# CSV to Markdown table
peasy-document csv-to-markdown data.csv
# HTML to Markdown
peasy-document html-to-markdown page.html -o page.md
All commands write to stdout by default. Use -o / --output to write to a file.
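The stdout-by-default convention with an optional `-o` / `--output` flag is straightforward to reproduce in your own scripts. The sketch below uses `argparse` purely for illustration; it makes no claim about which CLI framework peasy-document uses, and `build_parser`, `emit`, and the `convert-sketch` program name are hypothetical.

```python
import argparse
import sys
from pathlib import Path
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="convert-sketch")
    parser.add_argument("input", type=Path, help="file to read")
    parser.add_argument("-o", "--output", type=Path, default=None,
                        help="write to this file instead of stdout")
    return parser


def emit(content: str, output: Optional[Path]) -> None:
    # No -o flag: stream to stdout so the result can be piped to other tools.
    if output is None:
        sys.stdout.write(content)
    else:
        output.write_text(content, encoding="utf-8")


args = build_parser().parse_args(["page.html", "-o", "page.txt"])
print(args.input, args.output)
```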
API Reference
Conversion Functions
| Function | Input | Output | Dependencies |
|---|---|---|---|
| `markdown_to_html(source, *, extensions=None)` | Markdown | HTML | markdown library |
| `html_to_text(source)` | HTML | Plain text | stdlib only |
| `html_to_markdown(source)` | HTML | Markdown | stdlib only |
| `text_to_html(source)` | Plain text | HTML | stdlib only |
| `csv_to_json(source, *, delimiter=",")` | CSV | JSON | stdlib only |
| `json_to_csv(source)` | JSON | CSV | stdlib only |
| `csv_to_table(source, *, delimiter=",")` | CSV | `TableData` | stdlib only |
| `csv_to_markdown(source, *, delimiter=",")` | CSV | Markdown table | stdlib only |
| `csv_to_html(source, *, delimiter=",")` | CSV | HTML table | stdlib only |
| `json_to_yaml(source)` | JSON | YAML | stdlib only |
All functions accept TextInput (str, bytes, or Path). csv_to_table returns TableData; every other function returns ConversionResult.
Types
| Type | Fields |
|---|---|
| `ConversionResult` | content, source_format, target_format, source_size, target_size |
| `TableData` | headers, rows, row_count, column_count |
Peasy Developer Tools
| Package | PyPI | Description |
|---|---|---|
| peasy-document | PyPI | Document conversion -- Markdown, HTML, CSV, JSON |
| peasy-pdf | PyPI | PDF manipulation and conversion |
| peasy-image | PyPI | Image format conversion and optimization |
| peasytext | PyPI | Text analysis and transformation |
| peasy-css | PyPI | CSS minification and processing |
| peasy-compress | PyPI | File compression utilities |
| peasy-convert | PyPI | Unified CLI for all Peasy tools |
License
MIT