Clipit

Clipit (previously named Grabit) is a command-line tool that allows you to download web pages, extract their readable content, convert it to Markdown, and save it locally.

It's ideal for archiving articles, blog posts, or any web content you may want to save forever and ever. It works well for feeding web content into LLMs too.

I'm using it to save bookmarks in Obsidian, so you'll see a lot of focus in this area (the YAML front matter, the domain subdirectory, etc.). But it's flexible enough to be used in other contexts as well.

It gets you from raw HTML to Markdown.

Features

  • Download and convert web pages to Markdown: Fetches the content from a URL and converts it into clean Markdown format
  • Supports multiple output formats: Save content as Markdown, readable or raw HTML, or just send it to stdout so you can pipe it into another app
  • Customizable output: Include YAML front matter, page titles, source links, and control the output directory structure. This is especially useful for integrating with knowledge management systems such as Obsidian
  • Uses Readability.js: Extracts the main content from web pages for cleaner outputs (requires Node.js to be installed)
  • Supports Reddit posts: Handles both text and link posts, including their comments
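
To make the "output directory structure" idea concrete, here is a minimal Python sketch of how a per-domain subdirectory could be derived from a URL. It is an illustration only: the function name `domain_subdir` is hypothetical, and treating the URL's hostname as the "domain" is an assumption, not a description of Clipit's actual naming logic.

```python
from pathlib import Path
from urllib.parse import urlparse


def domain_subdir(url: str, base_dir: Path) -> Path:
    """Return base_dir/<hostname> for the given URL (illustrative only)."""
    # Assumption: the "domain" is the URL's hostname, e.g. "example.com".
    host = urlparse(url).hostname
    if not host:
        raise ValueError(f"URL has no hostname: {url!r}")
    return base_dir / host


# Hypothetical vault directory used purely for demonstration.
print(domain_subdir("https://example.com/article", Path("vault")))
```

Grouping by domain like this keeps bookmarks from many sites from piling up in a single folder, which is the motivation given for the Obsidian use case.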

Installation

  1. Ensure uv is installed
  2. Ensure Node.js is installed (optional; only required for Readability.js, see below for options)
  3. Install clipit as a global uv tool (recommended, see below for an alternative)
uv tool install clipit
  4. Run clipit like any other CLI app
clipit -f stdout.md https://vladiliescu.net

Alternatively, if you don't want to install it, you can just run it with uvx: uvx clipit -f stdout.md https://vladiliescu.net. Keep in mind that this will cause uv to check for dependency updates every time you run it, causing you to lose 1-2 precious seconds every time you save something 🥶.

Usage

clipit [OPTIONS] URL

Options

  • --yaml-frontmatter / --no-yaml-frontmatter: Include YAML front matter with metadata, useful for saving & viewing content in Obsidian (default: enabled).
  • --include-title / --no-include-title: Include the page title as an H1 heading. A bit redundant when rendering the YAML front matter, but I like it anyway (default: enabled).
  • --include-source / --no-include-source: Include the page source URL at the top of the document. Also a bit redundant when rendering the YAML front matter, but this one I don't like so much (default: disabled).
  • --user-agent TEXT: Set a custom User-Agent to be used for retrieving web pages (default: Clipit/<version>).
  • --fallback-title TEXT: Fallback title if no title is found. Use {date} for the current date (default: Untitled {date}).
  • --use-readability-js / --no-use-readability-js: Use Readability.js for processing pages. Disabling it will result in some processing courtesy of ReadabiliPy, but it doesn't look so great to be honest (requires Node.js, default: enabled).
  • --create-domain-subdir / --no-create-domain-subdir: Save the resulting files in a subdirectory named after the domain. Useful when saving a lot of bookmarks in the same Obsidian vault (default: enabled).
  • --overwrite / --no-overwrite: Overwrite existing files (default: disabled).
  • -f, --format [md|stdout.md|html|raw.html]: Output format(s) to save the content in. The most useful are md, which saves the content to a Markdown file, and stdout.md, which writes the Markdown to stdout so you can pipe it to something else, like the clipboard or Simon Willison's llm CLI. Can be specified multiple times (default: md).
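
If you drive Clipit from a script, the options above can be assembled programmatically. The following Python sketch builds an argument list from keyword arguments; the helper name `build_clipit_command` and its parameters are hypothetical, and only the flags and defaults listed above are used. It does not run Clipit itself.

```python
from typing import Sequence

# Formats taken from the -f/--format option described above.
VALID_FORMATS = {"md", "stdout.md", "html", "raw.html"}


def build_clipit_command(
    url: str,
    formats: Sequence[str] = ("md",),
    yaml_frontmatter: bool = True,
    include_title: bool = True,
    include_source: bool = False,
    create_domain_subdir: bool = True,
    overwrite: bool = False,
) -> list[str]:
    """Build an argv list for `clipit [OPTIONS] URL` (hypothetical helper)."""
    unknown = [fmt for fmt in formats if fmt not in VALID_FORMATS]
    if unknown:
        raise ValueError(f"Unsupported format(s): {unknown}")

    def flag(name: str, enabled: bool) -> str:
        # Each boolean option has a --name / --no-name pair.
        return f"--{name}" if enabled else f"--no-{name}"

    cmd = ["clipit"]
    for fmt in formats:
        cmd += ["-f", fmt]  # -f can be specified multiple times
    cmd += [
        flag("yaml-frontmatter", yaml_frontmatter),
        flag("include-title", include_title),
        flag("include-source", include_source),
        flag("create-domain-subdir", create_domain_subdir),
        flag("overwrite", overwrite),
    ]
    cmd.append(url)
    return cmd


print(" ".join(build_clipit_command("https://example.com/article", formats=("md", "html"))))
```

Spelling out every boolean flag is a deliberate choice in this sketch: the resulting command stays explicit even if a default changes in a later release.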

Examples

  • Save a web page as Markdown with the default options:
clipit https://example.com/article
  • Save as both Markdown and readable HTML:
clipit -f md -f html https://example.com/article
  • Set a custom User-Agent:
clipit --user-agent "MyCustomAgent/1.0" https://example.com/article
  • Output Markdown content to stdout:
clipit -f stdout.md https://example.com/article
  • Output Markdown content to the clipboard (macOS):
clipit -f stdout.md https://example.com/article | pbcopy
  • Disable YAML front matter and include source URL:
clipit --no-yaml-frontmatter --include-source https://example.com/article
  • Save files in the working directory, without creating a domain subdirectory:
clipit --no-create-domain-subdir https://example.com/article
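
The stdout.md format also makes it possible to capture the Markdown from another program instead of a shell pipe. Below is a small Python sketch under that assumption; `fetch_markdown` is a hypothetical wrapper, and the `run` parameter exists only so the subprocess call can be swapped out (for example in tests). It assumes clipit is installed and on your PATH.

```python
import subprocess
from typing import Callable


def fetch_markdown(
    url: str,
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> str:
    """Return the Markdown that `clipit -f stdout.md URL` writes to stdout."""
    result = run(
        ["clipit", "-f", "stdout.md", url],
        capture_output=True,  # collect stdout instead of printing it
        text=True,            # decode output as str rather than bytes
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return result.stdout


# Example usage (requires clipit to be installed):
# markdown = fetch_markdown("https://example.com/article")
# print(markdown[:200])
```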

Requirements

  • uv (for running the script)
  • Node.js (if using Readability.js)
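
A quick way to confirm both requirements before installing is to look for the executables on your PATH. This Python sketch does that with the standard library; the function name `check_prerequisites` is hypothetical, and it assumes the Node.js executable is called `node`, which is the usual name but may differ on some systems.

```python
import shutil


def check_prerequisites() -> dict[str, bool]:
    """Report whether the uv and node executables can be found on PATH."""
    # Assumption: Node.js is installed under the executable name "node".
    return {tool: shutil.which(tool) is not None for tool in ("uv", "node")}


if __name__ == "__main__":
    for tool, found in check_prerequisites().items():
        print(f"{tool}: {'found' if found else 'missing'}")
```

A missing `node` is not fatal: Clipit can still be run with --no-use-readability-js.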

License

Clipit, a tool for archiving web content, copyright (C) 2025 Vlad Iliescu

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. See the LICENSE for details.
