PageSaver

Archive your web page.

Requirements

  • Python >= 3.8

Install

pip
pip install pagesaver
✨🍰✨

Or run pip install git+https://github.com/ZhaoQi99/PageSaver.git to install the latest version.

docker
docker run -d --name pagesaver -p 8001:8001 zhaoqi99/pagesaver

Quick Start

HTTP API

  1. Init PageSaver: pagesaver init
  2. Start HTTP Server: pagesaver server

    nohup pagesaver server >> server.log 2>&1 &

  3. Examples:
~$ curl 'http://127.0.0.1:8001/api/record/https://www.baidu.com/?format=MHTML&format=PDF' -H 'Authorization: <API_TOKEN>'
~$ curl 'http://127.0.0.1:8001/api/record/notion/https://www.baidu.com/?format=MHTML&format=PDF&api_token=api_token&database_id=1&token_v2=token_v2&title=test' -H 'Authorization: <API_TOKEN>'

CLI

pagesaver export https://www.baidu.com -o . -f MHTML,PDF

HTTP Usage

Authorization

Pass the API token in the Authorization header, in the form: Authorization: <API_TOKEN>

Record API

  • GET api/record/{url}?format=MHTML&format=PDF

Query Params

Parameter  Type    Required  Description
format     string  No        Storage format, can be MHTML or PDF, defaults to all.
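
The Record API call above can be sketched in Python as follows. This is a minimal illustration, not part of PageSaver: the helper names build_record_url and record_page are hypothetical, and the target URL is appended to the path unescaped because that is what the curl examples in this document do. Whether the server expects the target URL to be percent-encoded is not stated here.

```python
import urllib.request
from urllib.parse import urlencode


def build_record_url(base_url, page_url, formats=()):
    """Compose GET api/record/{url}, repeating `format` once per requested format.

    Hypothetical helper: mirrors the curl example, with page_url left unescaped.
    """
    query = urlencode([("format", fmt) for fmt in formats])
    url = f"{base_url.rstrip('/')}/api/record/{page_url}"
    return f"{url}?{query}" if query else url


def record_page(base_url, page_url, api_token, formats=()):
    """Send the request with the token in the Authorization header.

    Returns the raw response body; its structure is not documented here.
    """
    request = urllib.request.Request(
        build_record_url(base_url, page_url, formats),
        headers={"Authorization": api_token},
    )
    with urllib.request.urlopen(request) as response:
        return response.read()
```

With a server running locally, record_page("http://127.0.0.1:8001", "https://www.baidu.com/", "<API_TOKEN>", ["MHTML", "PDF"]) would issue the same request as the first curl example.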

Notion Push API

  • GET api/record/notion/{url}?format=MHTML&format=PDF&api_token=<NOTION_API_TOKEN>&database_id=<NOTION_DATABASE_ID>&token_v2=<NOTION_TOKEN_V2>&title=test
  • Notion API Token
  • Notion Token V2: F12 -> Application -> Cookies -> token_v2
  • Database ID: https://www.notion.so/{USERNAME}/{DATABASE_ID}
  • Connection with: Notion -> Top right corner -> More -> Connections -> Connect to -> Your Integration

Automations

iOS Shortcut

Query Params

Parameter     Type    Required  Description
format        string  No        Storage format, can be MHTML or PDF, defaults to all.
api_token*    string  Yes       Notion API Token
database_id*  string  Yes       Notion Database ID
title         string  No        Title stored in Notion.
token_v2      string  No        Obtained from Browser -> Cookies -> token_v2. Required only when storing files in Notion.
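
As a rough sketch of how these parameters combine, the following Python helper assembles a Notion Push API URL. The function name build_notion_push_url is hypothetical, and the target URL is left unescaped to match the curl example in Quick Start. The request itself would still need the Authorization header described above.

```python
from urllib.parse import urlencode


def build_notion_push_url(base_url, page_url, api_token, database_id,
                          formats=(), title=None, token_v2=None):
    """Compose GET api/record/notion/{url} with its query parameters.

    Hypothetical helper. api_token and database_id are the two required
    parameters; title and token_v2 are only added when provided.
    """
    if not api_token or not database_id:
        raise ValueError("api_token and database_id are required")

    params = [("format", fmt) for fmt in formats]
    params += [("api_token", api_token), ("database_id", database_id)]
    if token_v2:
        # Needed only when files should be stored in Notion.
        params.append(("token_v2", token_v2))
    if title:
        params.append(("title", title))

    return f"{base_url.rstrip('/')}/api/record/notion/{page_url}?{urlencode(params)}"
```

Rejecting a missing api_token or database_id before any request is sent keeps the failure local instead of relying on the server's error response, whose format is not described in this document.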

CLI Usage

Export

~$ pagesaver export -h
Usage: pagesaver export [OPTIONS] URL

  Export page to the output file

Options:
  -f, --format [MHTML,PDF]  Format which you want to export  [required]
  -o, --output DIRECTORY    Output directory of the file  [required]
  -n, --name TEXT           Name of the exported file  [default: exported]
  -h, --help                Show this message and exit.
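
To call the export command from a script, one option is a thin subprocess wrapper such as the sketch below. The helper names are hypothetical. The flags are the ones listed in the help output above, and the comma-separated value for -f follows the Quick Start example (-f MHTML,PDF).

```python
import subprocess


def build_export_command(url, output_dir, formats, name=None):
    """Build the argument list for `pagesaver export` (hypothetical helper)."""
    cmd = ["pagesaver", "export", url, "-o", output_dir, "-f", ",".join(formats)]
    if name is not None:
        # -n sets the name of the exported file; the CLI default is "exported".
        cmd += ["-n", name]
    return cmd


def export_page(url, output_dir, formats, name=None):
    """Run the export and raise CalledProcessError on a non-zero exit code."""
    return subprocess.run(
        build_export_command(url, output_dir, formats, name),
        check=True,
    )
```

Passing the arguments as a list avoids shell quoting problems with URLs that contain characters such as & or ?.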

Server

~$ pagesaver init
~$ pagesaver server -h
Usage: pagesaver server [OPTIONS]

  Run PageSaver HTTP server

Options:
  -h, --help       Show this message and exit.
  -b, --bind TEXT  The TCP host/address to bind to.  [default: 0.0.0.0:8001]

Configuration

PageSaver will read the configuration from config.py automatically.

STORAGE

  • type: storage type. Currently the only supported value is "local".
  • path: storage path. Only used when type is set to "local".

SERVER_BIND

The TCP host/address to bind to.

Default: 0.0.0.0:8001

TITLE_PROPERTY

The property name in Notion to use for the title of a page.

Default: title

LINK_PROPERTY

The property name in Notion to use for the link of a page.

Default: link

MHTML_PROPERTY

The property name in Notion to use for the MHTML file of a page.

Default: mhtml
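
Put together, a config.py using the settings above might look like the sketch below. It assumes STORAGE is a dictionary with type and path keys, which this document implies but does not state, and the storage path shown is a hypothetical placeholder. The other values are the documented defaults.

```python
# config.py: illustrative sketch, not the shipped default file.

# Assumed shape: a dict with the two documented keys.
STORAGE = {
    "type": "local",      # the only storage type listed as supported
    "path": "./storage",  # hypothetical path; used when type is "local"
}

# The TCP host/address to bind to (documented default).
SERVER_BIND = "0.0.0.0:8001"

# Notion property names (documented defaults).
TITLE_PROPERTY = "title"
LINK_PROPERTY = "link"
MHTML_PROPERTY = "mhtml"
```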

License

GNU General Public License v3.0

Author
