A Python library for intelligently converting text into Markdown.

text2markdown 📝

text2markdown is a Python library for intelligently converting plain text into Markdown.

text2markdown is powered by the Isaacus enrichment API, which converts unstructured documents into rich, highly structured knowledge graphs that can easily be transformed into Markdown.

In all, text2markdown is capable of:

  • Identifying and formatting headings.
  • Segmenting text into nested sections.
  • Hyperlinking cross-references within texts to other sections.
  • Italicizing cited documents.
  • Italicizing defined terms.
  • Detecting and formatting block quotations.
  • Striking through junk text.

Setup 📦

text2markdown can be installed with pip (or uv):

pip install text2markdown

An Isaacus API key is also required to use this library.
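A minimal sketch of checking for the key before calling the library. It assumes the key is supplied through an environment variable named ISAACUS_API_KEY, which follows the Isaacus Python SDK's usual convention but is not stated in this README; require_api_key is a hypothetical helper, not part of text2markdown.

```python
import os


def require_api_key(var_name: str = "ISAACUS_API_KEY") -> str:
    """Return the API key from the environment or fail with a clear message.

    The variable name is an assumption, not something this README specifies.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {var_name} environment variable to your Isaacus API key."
        )
    return key
```

Calling require_api_key() at startup surfaces a missing key immediately instead of at the first conversion request.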

Usage 👩‍💻

The code snippet below demonstrates how you might use text2markdown() to intelligently convert a short document into Markdown.

from text2markdown import text2markdown

text = """\
The Smallest Document In The World
This is a generic document.

Section 1 - Background
Once upon a time, there was a mayor who said:
We love Markdown so much that everyone should and must use it for everything.

Section 2 - Problem
The mayor's directive, as stated in Section 1, was sadly too difficult to enforce."""

output = text2markdown(text)
print(output)

The output should look something like this:

# The Smallest Document In The World

This is a generic document.

## <a id="seg-1"></a>Section 1 - Background

Once upon a time, there was a mayor who said:

> We love Markdown so much that everyone should and must use it for everything. 

## Section 2 - Problem 

The mayor's directive, as stated in [Section 1](#seg-1), was sadly too difficult to enforce.
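Cross-references in the sample output are ordinary fragment links that point at inline anchors. As a rough sketch, the check below looks for links whose target anchor is missing. It assumes the anchor and link syntax seen in the sample output above; dangling_links is a hypothetical helper, not part of the library.

```python
import re

# Patterns are based only on the syntax visible in the sample output.
ANCHOR_RE = re.compile(r'<a id="([^"]+)"></a>')
LINK_RE = re.compile(r'\]\(#([^)\s]+)\)')


def dangling_links(markdown: str) -> set[str]:
    """Return fragment targets that are linked to but never defined."""
    anchors = set(ANCHOR_RE.findall(markdown))
    targets = set(LINK_RE.findall(markdown))
    return targets - anchors


sample = (
    '## <a id="seg-1"></a>Section 1 - Background\n\n'
    "As stated in [Section 1](#seg-1), see also [Section 9](#seg-9).\n"
)
print(dangling_links(sample))  # {'seg-9'}
```

An empty set means every in-document link resolves to an anchor.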

An asynchronous version of text2markdown() is also available, supporting all of the same features and arguments as its synchronous equivalent. It can be used like so:

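import asyncio
from text2markdown import text2markdown_async

async def main() -> None:
    # `await` is only valid inside a coroutine, so wrap the call.
    print(await text2markdown_async(text))

asyncio.run(main())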
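The asynchronous version is most useful when converting several documents at once. The sketch below shows one way to do that with a bounded number of in-flight requests. convert_all and its max_concurrency default of 4 are illustrative assumptions; this README does not document rate limits or a batch API.

```python
import asyncio
from typing import Awaitable, Callable


async def convert_all(
    texts: list[str],
    convert: Callable[[str], Awaitable[str]],
    max_concurrency: int = 4,  # arbitrary illustrative limit
) -> list[str]:
    """Convert many texts concurrently, preserving input order."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def convert_one(text: str) -> str:
        # The semaphore caps how many conversions run at the same time.
        async with semaphore:
            return await convert(text)

    return await asyncio.gather(*(convert_one(t) for t in texts))
```

Passing the converter as an argument keeps the helper independent of the library. With text2markdown installed, it could be called as asyncio.run(convert_all(documents, text2markdown_async)), where documents is your own list of strings.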

Each capability can be toggled on or off with an optional Boolean parameter. The enrichment model and the Isaacus client can also be passed explicitly, as shown below:

from text2markdown import text2markdown

from isaacus import Isaacus

output = text2markdown(
    text,
    link_xrefs=True,
    strike_junk=True,
    block_quotes=True,
    escape_lists=True,
    italicize_refs=True,
    italicize_terms=True,
    enrichment_model="kanon-2-enricher",
    isaacus_client=Isaacus(),
)
print(output)
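When only a few features need to be switched off, it can be convenient to build the keyword arguments in one place. The sketch below uses the six Boolean parameter names from the call above. feature_flags is a hypothetical helper, and the assumption that every flag defaults to True is taken from the example, not from documented defaults.

```python
# Boolean parameter names as they appear in the example call.
FEATURE_FLAGS = (
    "link_xrefs", "strike_junk", "block_quotes",
    "escape_lists", "italicize_refs", "italicize_terms",
)


def feature_flags(**overrides: bool) -> dict[str, bool]:
    """Build keyword arguments for text2markdown(), rejecting unknown names."""
    unknown = set(overrides) - set(FEATURE_FLAGS)
    if unknown:
        raise TypeError(f"Unknown feature flag(s): {sorted(unknown)}")
    flags = {name: True for name in FEATURE_FLAGS}  # assumed defaults
    flags.update(overrides)
    return flags


options = feature_flags(strike_junk=False, italicize_terms=False)
print(options["strike_junk"], options["link_xrefs"])  # False True
```

The resulting dictionary can then be unpacked into the call, as in text2markdown(text, **options).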

License 📜

This library is licensed under the MIT License.
