A Python library for intelligently converting text into Markdown.
text2markdown 📝
text2markdown is a Python library for intelligently converting plain text into Markdown.
text2markdown is powered by the Isaacus enrichment API, which converts unstructured documents into rich, highly structured knowledge graphs that can easily be transformed into Markdown.
In all, text2markdown is capable of:
- Identifying and formatting headings.
- Segmenting text into nested sections.
- Hyperlinking cross-references within texts to other sections.
- Italicizing cited documents.
- Detecting and formatting block quotations.
- Striking through junk text.
Setup 📦
text2markdown can be installed with pip (or uv):
pip install text2markdown
An Isaacus API key is also required to use this library.
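Before making any calls, it can help to confirm that a key is actually available. The sketch below assumes the key is supplied through an environment variable named `ISAACUS_API_KEY`; that name follows the usual convention of the Isaacus SDK and is an assumption here, not something this README states. `require_api_key` is a hypothetical helper and is not part of text2markdown.

```python
import os
from typing import Mapping, Optional


def require_api_key(
    env: Optional[Mapping[str, str]] = None,
    var_name: str = "ISAACUS_API_KEY",  # assumed variable name, not confirmed by this README
) -> str:
    """Return the API key found in `env`, or raise a clear error if it is missing."""
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {var_name} environment variable before using text2markdown."
        )
    return key
```

Calling `require_api_key()` at program start fails fast with a readable message instead of surfacing an authentication error midway through a conversion.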
Usage 👩‍💻
The code snippet below demonstrates how you might use text2markdown() to intelligently convert a short document into Markdown.
from text2markdown import text2markdown
text = """\
The Smallest Document In The World
This is a generic document.
Section 1 - Background
Once upon a time, there was a mayor who said:
We love Markdown so much that everyone should and must use it for everything.
Section 2 - Problem
The mayor's directive, as stated in Section 1, was sadly too difficult to enforce."""
output = text2markdown(text)
print(output)
The output should look something like this:
# The Smallest Document In The World
This is a generic document.
## <a id="seg-1"></a>Section 1 - Background
Once upon a time, there was a mayor who said:
> We love Markdown so much that everyone should and must use it for everything.
## Section 2 - Problem
The mayor's directive, as stated in [Section 1](#seg-1), was sadly too difficult to enforce.
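Because cross-references are rendered as ordinary Markdown links pointing at `<a id="...">` anchors, the output can be sanity-checked with a few lines of standard-library Python. The sketch below assumes the anchor and link shapes seen in the sample output above; `dangling_links` is a hypothetical helper, not part of text2markdown.

```python
import re
from typing import Set

# Patterns mirror the sample output: <a id="seg-1"></a> and [Section 1](#seg-1).
ANCHOR_RE = re.compile(r'<a id="([^"]+)"></a>')
LINK_RE = re.compile(r"\]\(#([^)]+)\)")


def dangling_links(markdown: str) -> Set[str]:
    """Return in-document link targets that have no matching anchor."""
    anchors = set(ANCHOR_RE.findall(markdown))
    targets = set(LINK_RE.findall(markdown))
    return targets - anchors


sample = """\
## <a id="seg-1"></a>Section 1 - Background
## Section 2 - Problem
The mayor's directive, as stated in [Section 1](#seg-1), was sadly too difficult to enforce."""

print(dangling_links(sample))  # prints set()
```

An empty set means every internal link resolves to an anchor in the same document.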
An asynchronous version of text2markdown() is also available, supporting all of the same features and arguments as its synchronous equivalent. It can be used like so:
import asyncio
from text2markdown import text2markdown_async
async def main() -> None:
    output = await text2markdown_async(text)
    print(output)
asyncio.run(main())
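The asynchronous variant is most useful when several documents need converting at once. The following is a minimal sketch of one way to do that with a concurrency cap; `convert_many`, `max_concurrency`, and `fake_convert` are hypothetical names introduced for illustration, and the converter is passed in as a parameter so the example runs without an API key.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence


async def convert_many(
    texts: Sequence[str],
    convert: Callable[[str], Awaitable[str]],
    max_concurrency: int = 4,
) -> List[str]:
    """Convert texts concurrently with at most `max_concurrency` calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def convert_one(text: str) -> str:
        async with semaphore:
            return await convert(text)

    # gather() returns results in the same order as the inputs.
    return list(await asyncio.gather(*(convert_one(t) for t in texts)))


async def fake_convert(text: str) -> str:
    # Stand-in for text2markdown_async so the sketch is runnable offline.
    await asyncio.sleep(0)
    return f"# {text}"


if __name__ == "__main__":
    print(asyncio.run(convert_many(["A", "B"], fake_convert)))
```

In real use, `text2markdown_async` would be passed in place of `fake_convert`, for example `await convert_many(documents, text2markdown_async)`.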
Each capability of text2markdown can be toggled on or off with an optional Boolean parameter, and the enrichment model and Isaacus client can be supplied explicitly, as shown below:
from text2markdown import text2markdown
from isaacus import Isaacus
output = text2markdown(
    text,
    link_xrefs=True,
    strike_junk=True,
    block_quotes=True,
    italicize_refs=True,
    escape_lists=True,
    enrichment_model="kanon-2-enricher",
    isaacus_client=Isaacus(),
)
print(output)
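When the same combination of toggles is reused in several places, it may be convenient to bundle them into one object. The sketch below is one possible approach: `ConversionOptions` is a hypothetical helper, not part of text2markdown. Its field names mirror the keyword arguments shown above, and its `True` values simply copy that example rather than reflecting confirmed library defaults.

```python
from dataclasses import asdict, dataclass, replace
from typing import Any, Dict


@dataclass(frozen=True)
class ConversionOptions:
    # Field names mirror the keyword arguments in the example above.
    # The True values copy that example; they are not confirmed defaults.
    link_xrefs: bool = True
    strike_junk: bool = True
    block_quotes: bool = True
    italicize_refs: bool = True
    escape_lists: bool = True

    def as_kwargs(self) -> Dict[str, Any]:
        """Return the options as a dict suitable for ** unpacking."""
        return asdict(self)


# Derive a variant that leaves cross-references and junk text untouched.
plain = replace(ConversionOptions(), link_xrefs=False, strike_junk=False)
print(plain.as_kwargs())
```

The options could then be passed along as `text2markdown(text, **plain.as_kwargs())`.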
License 📜
This library is licensed under the MIT License.
File details
Details for the file text2markdown-0.1.1.tar.gz.
File metadata
- Download URL: text2markdown-0.1.1.tar.gz
- Upload date:
- Size: 8.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 01422c89602b8cb26fb311aee43262b1ce49d52d7e4fa1a79dc5e2a0a6d9a8bf |
| MD5 | 540556e443ea46fefe56b6b9f46c2a0f |
| BLAKE2b-256 | b0dec1dcf11c69684490fda79c97f1c2d59c4883ea8a497c7b543f4faff625e8 |
Provenance
The following attestation bundles were made for text2markdown-0.1.1.tar.gz:
Publisher: python-publish.yml on isaacus-dev/text2markdown
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: text2markdown-0.1.1.tar.gz
- Subject digest: 01422c89602b8cb26fb311aee43262b1ce49d52d7e4fa1a79dc5e2a0a6d9a8bf
- Sigstore transparency entry: 1197276309
- Sigstore integration time:
- Permalink: isaacus-dev/text2markdown@56aa9cb37d5c6c98bc81c9b9cca709418598d0d2
- Branch / Tag: refs/tags/v0.1.1
- Owner: https://github.com/isaacus-dev
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@56aa9cb37d5c6c98bc81c9b9cca709418598d0d2
- Trigger Event: release
File details
Details for the file text2markdown-0.1.1-py3-none-any.whl.
File metadata
- Download URL: text2markdown-0.1.1-py3-none-any.whl
- Upload date:
- Size: 9.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 78b578ad1af4888f31df98ea6a7347f49561718da94d1530b33b076fad3a6f6d |
| MD5 | 54f3f1b4db03933342f06dedd4c7485c |
| BLAKE2b-256 | 91b2d7add86a6686a753fa800004f5f0855c3d317e811dd845f6e22ae9520252 |
Provenance
The following attestation bundles were made for text2markdown-0.1.1-py3-none-any.whl:
Publisher: python-publish.yml on isaacus-dev/text2markdown
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: text2markdown-0.1.1-py3-none-any.whl
- Subject digest: 78b578ad1af4888f31df98ea6a7347f49561718da94d1530b33b076fad3a6f6d
- Sigstore transparency entry: 1197276341
- Sigstore integration time:
- Permalink: isaacus-dev/text2markdown@56aa9cb37d5c6c98bc81c9b9cca709418598d0d2
- Branch / Tag: refs/tags/v0.1.1
- Owner: https://github.com/isaacus-dev
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@56aa9cb37d5c6c98bc81c9b9cca709418598d0d2
- Trigger Event: release