A tool to fetch the main content of a webpage and convert it to Markdown or plain text.

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

webclipper

webclipper is a simple Python tool to fetch the main content of a webpage and convert it into clean, readable Markdown or plain text. It removes clutter like ads, headers, and navigation bars, letting you focus on the article's text.

It can be used as a command-line application for quick conversions in your terminal or as a library in your own Python projects.

Features

Content Extraction: Uses readability to identify and extract the primary article or content from a URL.
Dual Output: Converts the cleaned HTML to either Markdown or plain text.
Flexible Usage: Works as both a standalone command-line tool and an importable Python library.
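
To make the "strip clutter, then convert to text" idea concrete, here is a minimal sketch of the conversion stage using only Python's standard-library html.parser. It is not webclipper's implementation and it does not reproduce a readability-style scoring algorithm; the names TextExtractor and html_to_text, and the lists of tags treated as clutter or as block boundaries, are assumptions made for illustration.

```python
from html.parser import HTMLParser

# Tags whose contents are discarded in this sketch (an assumption; a real
# readability-style extractor scores candidate nodes instead of using a fixed list).
SKIP_TAGS = {"script", "style", "nav", "header", "footer", "aside"}
# Tags that start or end a line of output.
BLOCK_TAGS = {"p", "div", "h1", "h2", "h3", "h4", "h5", "h6", "li", "br"}


class TextExtractor(HTMLParser):
    """Collects visible text, ignoring anything nested inside SKIP_TAGS."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1
        elif tag in BLOCK_TAGS and self._skip_depth == 0:
            self._chunks.append("\n")

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag in BLOCK_TAGS and self._skip_depth == 0:
            self._chunks.append("\n")

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Trim each line and drop the empty ones left behind by block boundaries.
        lines = (line.strip() for line in "".join(self._chunks).splitlines())
        return "\n".join(line for line in lines if line)


def html_to_text(html):
    """Return the visible text of an HTML string, one block element per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

A fixed skip list like this is far cruder than real content extraction, which is why webclipper delegates that step to a dedicated library.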

Installation

To install webclipper, you can clone the repository and install it using pip.


# Clone the repository (if you haven't already)
git clone https://github.com/your-username/webclipper.git
cd webclipper

# Install the package in editable mode
# (your changes to the source code will be reflected immediately)
pip install -e .

This will install the package and its dependencies, and also make the webclipper command available in your terminal.
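
If you want to confirm that both the command and the importable module are visible after installing, a small check along these lines may help. This is a sketch using only the standard library; check_install is a hypothetical helper name, not part of webclipper.

```python
import importlib.util
import shutil


def check_install(command="webclipper", module="webclipper"):
    """Report whether the console command and the importable module are visible."""
    return {
        # True if an executable with this name is found on PATH.
        "command_on_path": shutil.which(command) is not None,
        # True if Python can locate the module without importing it.
        "module_importable": importlib.util.find_spec(module) is not None,
    }
```

Calling check_install() inside the environment where you ran pip should report True for both keys; a False for "command_on_path" usually means the environment's script directory is not on your PATH.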

How to Use

As a Command-Line App

Once installed, you can use the webclipper command directly from your terminal. The output is sent to standard output, so you can easily redirect it to a file.

Basic Usage (get plain text):


webclipper "https://en.wikipedia.org/wiki/Python_(programming_language)"

Get Markdown Output:

Use the -m or --markdown flag.


webclipper "https://www.some-article-url.com" --markdown

Include the Source URL:

Use the -i or --include-url flag to append the source URL at the end of the output.


webclipper "https://www.some-article-url.com" -m -i

Redirect to a File:

You can save the output using standard shell redirection.


webclipper "https://www.some-article-url.com" > my_article.txt
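
The same invocations can be scripted from Python when you need to clip several pages. The sketch below only assembles the flags documented above (--markdown, --include-url) and redirects standard output to a file; build_command and clip_to_file are hypothetical helper names, and the assumption that the command exits with a non-zero status on failure has not been verified.

```python
import subprocess


def build_command(url, markdown=False, include_url=False):
    """Build the argv list for the webclipper CLI using the flags documented above."""
    argv = ["webclipper", url]
    if markdown:
        argv.append("--markdown")
    if include_url:
        argv.append("--include-url")
    return argv


def clip_to_file(url, path, **options):
    """Run the CLI and send its standard output to `path`, like `> path` in a shell."""
    with open(path, "wb") as out:
        # check=True raises CalledProcessError if the command exits non-zero
        # (assumes webclipper signals failure through its exit status).
        subprocess.run(build_command(url, **options), stdout=out, check=True)
```

For example, clip_to_file("https://www.some-article-url.com", "my_article.md", markdown=True) would correspond to running the command with --markdown and redirecting to my_article.md. Passing the arguments as a list avoids shell quoting problems with URLs that contain parentheses or ampersands.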

As a Library

You can also import webclipper into your own Python scripts to integrate its functionality. The get_url_content function is all you need.

from webclipper import get_url_content

# The URL of the article you want to clip
article_url = "https://en.wikipedia.org/wiki/Web_scraping"

try:
    # Get the content as Markdown
    markdown_content = get_url_content(article_url, output_format='markdown')
    print("--- MARKDOWN ---")
    print(markdown_content)

    # Get the content as plain text
    text_content = get_url_content(article_url, output_format='text')
    print("\n--- PLAIN TEXT ---")
    print(text_content)

except Exception as e:
    print(f"An error occurred: {e}")
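
Building on the example above, a common next step is saving the result to disk instead of printing it. The following is a minimal sketch: save_clip is a hypothetical wrapper, not part of webclipper, and it assumes get_url_content returns a string, as the print calls above suggest. The fetch parameter exists only so the helper can be exercised with a stand-in function and no network access.

```python
from pathlib import Path


def save_clip(url, path, output_format="markdown", fetch=None):
    """Fetch `url` and write the result to `path`; returns the number of characters saved."""
    if fetch is None:
        # Imported lazily so the helper can be tested without webclipper or a network.
        from webclipper import get_url_content as fetch
    content = fetch(url, output_format=output_format)
    Path(path).write_text(content, encoding="utf-8")
    return len(content)
```

With webclipper installed, save_clip("https://en.wikipedia.org/wiki/Web_scraping", "web_scraping.md") would write the Markdown version of the article to web_scraping.md.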

License

This project is licensed under the MIT License. See the LICENSE file for details.

Release history

- 0.2.0 (Jun 20, 2025)
- 0.1.2 (Jun 20, 2025)
- 0.1.1 (Jun 20, 2025), this version
- 0.1.0 (Jun 20, 2025)

Download files

Source Distribution

webclipper-0.1.1.tar.gz (3.6 kB)

Uploaded Jun 20, 2025 Source

Built Distribution

webclipper-0.1.1-py3-none-any.whl (4.0 kB)

Uploaded Jun 20, 2025 Python 3

File details

Details for the file webclipper-0.1.1.tar.gz.

File metadata

Download URL: webclipper-0.1.1.tar.gz
Upload date: Jun 20, 2025
Size: 3.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.2

File hashes

Hashes for webclipper-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`a8e1b8e814155c4424c81552faa5afb0c44de48a054cf2b7a17b96e34aef8fcb`
MD5	`104e9d7affab0da3f91a9bdad2d5792b`
BLAKE2b-256	`1ccc5f5014f7c902580771b9462dfee7860e7f92dda6043b47d72b859cb907df`

File details

Details for the file webclipper-0.1.1-py3-none-any.whl.

File metadata

Download URL: webclipper-0.1.1-py3-none-any.whl
Upload date: Jun 20, 2025
Size: 4.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.2

File hashes

Hashes for webclipper-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`501c42a4da1126905da22d91926a63091a3e17c4a0bc3a37270df2fe571b97ef`
MD5	`84a2c5aaf71831c5b78327d968f5d957`
BLAKE2b-256	`3757deeb57d75b948dae64ba8cd9f89e78da7cdef42944afc9c11c7385df88cb`
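
The digests listed above can be used to check that a downloaded file is intact. A minimal sketch with the standard-library hashlib module follows; sha256_of and matches are hypothetical helper names, and the file is read in chunks so that large downloads do not need to fit in memory.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare a file's SHA256 digest with a published value, ignoring case and padding."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For the source distribution, matches("webclipper-0.1.1.tar.gz", "a8e1b8e814155c4424c81552faa5afb0c44de48a054cf2b7a17b96e34aef8fcb") should return True if the download is unmodified.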
