A library to parse Markdown and extract code snippets

PyParseit

PyParseit is a Python library that parses Markdown files and strings and extracts the code snippets they contain, optionally filtered by programming language. It provides a simple interface for developers who want to pull code blocks out of Markdown content and use them elsewhere.

Introduction

Markdown is widely used for documentation, blogging, and technical writing. PyParseit simplifies the task of extracting code snippets from Markdown files and strings, making it ideal for static site generators, content management systems, and more.

Features

  • Parse Markdown files or strings to extract code blocks.
  • Filter code snippets by programming language (e.g., Python, JavaScript, JSON).
  • Command-line interface for easy integration into scripts and workflows.
  • Customizable and extensible API for developers.
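
To make the first two features concrete, here is a minimal standard-library sketch of fenced-block extraction with a language filter. It is not PyParseit's implementation: the `Block` type, the `extract_blocks` function, and the regular expression are all assumptions made for illustration, and the pattern only handles simple triple-backtick fences.

```python
import re
from typing import List, NamedTuple, Optional

FENCE = "`" * 3  # built indirectly so the sample text needs no literal fence


class Block(NamedTuple):
    """Hypothetical stand-in for an extracted snippet."""
    language: str
    content: str


# Opening fence with an optional language tag, the body, then a closing fence.
_PATTERN = re.compile(
    rf"^{FENCE}([\w+-]*)[ \t]*\n(.*?)^{FENCE}[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def extract_blocks(markdown: str, language: Optional[str] = None) -> List[Block]:
    """Return fenced code blocks, optionally keeping only one language."""
    blocks = [
        Block(lang.lower(), body.rstrip("\n"))
        for lang, body in _PATTERN.findall(markdown)
    ]
    if language is None:
        return blocks
    return [b for b in blocks if b.language == language.lower()]


sample = (
    f"Intro\n\n{FENCE}python\nprint('hi')\n{FENCE}\n\n"
    f"{FENCE}json\n{{\"a\": 1}}\n{FENCE}\n"
)
for block in extract_blocks(sample, language="json"):
    print(block.language, block.content)
```

A regular expression keeps the sketch short, but it ignores cases such as indented fences, tilde fences, and fences nested inside other blocks, which a real Markdown parser has to handle.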

Installation

You can install PyParseit via pip:

pip install pyparseit

Alternatively, you can clone the repository and install it manually:

git clone https://github.com/uladkaminski/pyparseit.git
cd pyparseit
pip install .

Usage

Parsing from a File

Here's a basic example of how to use PyParseit to extract Python code snippets from a Markdown file:

from pyparseit import parse_markdown_file

# Specify the file path and language
file_path = 'example_file.md'
language = 'python'

# Parse the Markdown file to extract Python code snippets
python_snippets = parse_markdown_file(file_path, language=language)

# Display extracted Python snippets from the file
print("Extracted Python Snippets from File:")
for snippet in python_snippets:
    print(f"Language: {snippet.language}\nContent:\n{snippet.content}\n")

Parsing from a String

PyParseit can also parse Markdown content directly from a string:

from pyparseit import parse_markdown_string

# Define a Markdown string with multiple code blocks
markdown_string = """
# Sample Markdown

Here is some Python code:

```python
def hello_world():
    print("Hello, world!")
```

Here is some JavaScript code:

```javascript
function helloWorld() {
    console.log("Hello, world!");
}
```

And here is some JSON:

```json
{
    "name": "John",
    "age": 30
}
```
"""

# Parse the Markdown string to extract JSON code snippets
json_snippets = parse_markdown_string(markdown_string, language='json')

# Display extracted JSON snippets from the string
print("Extracted JSON Snippets from String:")
for snippet in json_snippets:
    print(f"Language: {snippet.language}\nContent:\n{snippet.content}\n")
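
Once extracted, a snippet's content is an ordinary string, so it can be handed to other tools. The sketch below decodes a JSON block like the one in the example above with the standard `json` module. To stay runnable without PyParseit installed, it uses a stand-in `Snippet` dataclass that mirrors the documented `language` and `content` attributes; that class and the `load_json_snippets` helper are hypothetical and not part of the library.

```python
import json
from dataclasses import dataclass
from typing import Any, Iterable, List


@dataclass
class Snippet:
    """Stand-in with the two documented attributes (not the library's class)."""
    language: str
    content: str


def load_json_snippets(snippets: Iterable[Snippet]) -> List[Any]:
    """Decode every snippet tagged as JSON; other languages are skipped."""
    return [json.loads(s.content) for s in snippets if s.language == "json"]


snippets = [
    Snippet("python", 'def hello_world():\n    print("Hello, world!")'),
    Snippet("json", '{\n    "name": "John",\n    "age": 30\n}'),
]
for record in load_json_snippets(snippets):
    print(record["name"], record["age"])
```

The helper only reads `language` and `content`, so it should also accept the list returned by `parse_markdown_string`, assuming the returned objects expose those attributes as documented and report the language tag in lowercase.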

Command-Line Interface

PyParseit also provides a CLI for easy usage from the terminal:

pyparseit path/to/your/file.md -l python -o output.txt

This command parses the specified Markdown file, extracts Python code snippets, and saves them to output.txt.
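
For scripted workflows, the same command can be launched from Python. This is a sketch under assumptions: the `build_command` helper is hypothetical, it uses only the `-l` and `-o` flags shown above, and it does not rely on any particular exit code or output format from the CLI.

```python
import shutil
import subprocess
from typing import List


def build_command(markdown_path: str, language: str, output_path: str) -> List[str]:
    """Assemble the argument list using only the flags shown above (-l, -o)."""
    return ["pyparseit", markdown_path, "-l", language, "-o", output_path]


command = build_command("path/to/your/file.md", "python", "output.txt")
print(" ".join(command))

# Run the command only when the executable is actually on PATH.
if shutil.which(command[0]) is not None:
    result = subprocess.run(command, check=False)
    print("exit code:", result.returncode)
```

Passing the arguments as a list avoids shell quoting problems when file paths contain spaces.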

Examples

Check out the examples directory for more use cases and demonstrations of how to integrate PyParseit into your projects.

API Reference

parse_markdown_file

  • Description: Parses Markdown content from a file to extract code snippets.
  • Parameters:
    • file_path (str): The path to the Markdown file.
    • language (Optional[str]): The programming language to filter snippets by.
  • Returns: List[CodeSnippet]: A list of extracted code snippets.

parse_markdown_string

  • Description: Parses Markdown content from a string to extract code snippets.
  • Parameters:
    • markdown_string (str): The Markdown content as a string.
    • language (Optional[str]): The programming language to filter snippets by.
  • Returns: List[CodeSnippet]: A list of extracted code snippets.

CodeSnippet

  • Description: Represents a code snippet with language and content.
  • Attributes:
    • language (str): The programming language of the code snippet.
    • content (str): The code snippet's content.
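
Because each snippet carries only these two attributes, grouping results by language takes a few lines. The sketch below uses a stand-in `SnippetLike` dataclass so that it runs without PyParseit installed; both that class and the `group_by_language` helper are hypothetical names introduced for illustration, not part of the library's API.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class SnippetLike:
    """Stand-in exposing the documented attributes (not the library's class)."""
    language: str
    content: str


def group_by_language(snippets: Iterable[SnippetLike]) -> Dict[str, List[str]]:
    """Map each language to the contents of its snippets, in input order."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for snippet in snippets:
        grouped[snippet.language].append(snippet.content)
    return dict(grouped)


snippets = [
    SnippetLike("python", "print('a')"),
    SnippetLike("json", '{"k": 1}'),
    SnippetLike("python", "print('b')"),
]
for language, contents in group_by_language(snippets).items():
    print(language, len(contents))
```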

Exceptions

  • PyParsecError: Base exception for parser errors.

Contributing

Contributions are welcome! If you'd like to contribute to PyParseit, please follow these steps:

  1. Fork the repository.
  2. Create a new branch for your feature or bugfix.
  3. Implement your changes and add tests if applicable.
  4. Commit your changes and push to your fork.
  5. Submit a pull request with a description of your changes.

Please ensure your code adheres to the project's coding standards and passes all tests.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Download files

Source Distribution

pyparseit-0.1.2.tar.gz (5.8 kB)

Uploaded Source

Built Distribution

pyparseit-0.1.2-py3-none-any.whl (7.3 kB)

Uploaded Python 3

File details

Details for the file pyparseit-0.1.2.tar.gz.

File metadata

  • Download URL: pyparseit-0.1.2.tar.gz
  • Size: 5.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.10.14

File hashes

Hashes for pyparseit-0.1.2.tar.gz
Algorithm Hash digest
SHA256 ac3bb0a55fdfd7b84c5a808ec9fb02befde176ca12af0e015812eed837e8ee11
MD5 b53f7a3f6370297ec79259476fde53ac
BLAKE2b-256 ec3bd9ffd59f53b175a871e74a82c4c9f3be160af599a3e09c24fdf5f7e03eb2

File details

Details for the file pyparseit-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: pyparseit-0.1.2-py3-none-any.whl
  • Size: 7.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.10.14

File hashes

Hashes for pyparseit-0.1.2-py3-none-any.whl
Algorithm Hash digest
SHA256 be0855c7d52492c4180ed2f2663845b95d7e3f89a954206b86c7eef8b2b2a8b8
MD5 e8d9ed921095ff8310f4305987cf9ee7
BLAKE2b-256 b2ff60519a82b5c62af0d1ed7470baeba91b991004d1c60f0c04d767398b1ceb
