Archive your web page.

PageSaver

Requirements

  • Python >= 3.8

Install

pip
pip install pagesaver
✨🍰✨

Alternatively, run pip install git+https://github.com/ZhaoQi99/PageSaver.git to install the latest version.

docker
docker run -d --name pagesaver -p 8001:8001 zhaoqi99/pagesaver

Quick Start

HTTP API

  1. Init PageSaver: pagesaver init
  2. Start HTTP Server: pagesaver server

    nohup pagesaver server >> server.log 2>&1 &

  3. Examples:
~$ curl 'http://127.0.0.1:8001/api/record/https://www.baidu.com/?format=MHTML&format=PDF' -H 'Authorization: <API_TOKEN>'
~$ curl 'http://127.0.0.1:8001/api/record/notion/https://www.baidu.com/?format=MHTML&format=PDF&api_token=api_token&database_id=1&token_v2=token_v2&title=test' -H 'Authorization: <API_TOKEN>'

CLI

pagesaver export https://www.baidu.com -o . -f MHTML,PDF

HTTP Usage

Authorization

Using the Authorization header, format is: Authorization: <API_TOKEN>

Record API

  • GET api/record/{url}?format=MHTML&format=PDF

Query Params

Parameter  Type    Required  Description
format     string  No        Storage format: MHTML or PDF. Defaults to all formats.
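
As a rough illustration of the Record API route and the Authorization header described above, the following Python sketch builds the request using only the standard library. The helper name build_record_request and its arguments are hypothetical and not part of PageSaver. It assumes the target URL is appended to the path unescaped, as in the curl examples.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_record_request(base, page_url, api_token, formats=("MHTML", "PDF")):
    """Build (but do not send) a GET request for the Record API.

    Hypothetical helper: only the route, the `format` parameter and the
    Authorization header come from the documentation above.
    """
    # Repeat the `format` key once per requested format.
    query = urlencode([("format", fmt) for fmt in formats])
    # Assumption: the page URL is appended to the path as-is.
    endpoint = f"{base.rstrip('/')}/api/record/{page_url}"
    if query:
        endpoint = f"{endpoint}?{query}"
    return Request(endpoint, headers={"Authorization": api_token})


req = build_record_request(
    "http://127.0.0.1:8001", "https://www.baidu.com/", "<API_TOKEN>"
)
print(req.full_url)
```

Passing the returned object to urllib.request.urlopen would send the request to a running server. The shape of the response is not documented here, so the sketch stops at building the request.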

Notion Push API

  • GET api/record/notion/{url}?format=MHTML&format=PDF&api_token=<NOTION_API_TOKEN>&database_id=<NOTION_DATABASE_ID>&token_v2=<NOTION_TOKEN_V2>&title=test
  • Notion API Token
  • Notion Token V2: F12 -> Application -> Cookies -> token_v2
  • Database ID: https://www.notion.so/{USERNAME}/{DATABASE_ID}
  • Connection with: Notion -> Top right corner -> More -> Connections -> Connect to -> Your Integration

Automations

iOS Shortcut

Query Params

Parameter     Type    Required  Description
format        string  No        Storage format: MHTML or PDF. Defaults to all formats.
api_token*    string  Yes       Notion API Token
database_id*  string  Yes       Notion Database ID
title         string  No        Title stored in Notion.
token_v2      string  No        Obtained from Browser -> Cookies -> token_v2. Required to store files in Notion.
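
The Notion Push API parameters listed above can be assembled into a URL along these lines. This is a minimal sketch: the helper name build_notion_push_url is hypothetical, and the values are the placeholders from the curl example. Note that api_token here is the Notion API token, which is separate from the PageSaver token sent in the Authorization header.

```python
from urllib.parse import urlencode


def build_notion_push_url(base, page_url, api_token, database_id,
                          title=None, token_v2=None,
                          formats=("MHTML", "PDF")):
    """Assemble the Notion Push API URL (hypothetical helper)."""
    # `format` may be repeated; api_token and database_id are required.
    params = [("format", fmt) for fmt in formats]
    params += [("api_token", api_token), ("database_id", database_id)]
    # Optional parameters are only included when provided.
    if token_v2 is not None:
        params.append(("token_v2", token_v2))
    if title is not None:
        params.append(("title", title))
    # Assumption: the page URL is appended to the path as-is.
    return f"{base.rstrip('/')}/api/record/notion/{page_url}?{urlencode(params)}"


url = build_notion_push_url(
    "http://127.0.0.1:8001",
    "https://www.baidu.com/",
    api_token="api_token",
    database_id="1",
    token_v2="token_v2",
    title="test",
)
print(url)
```

Because urlencode escapes the parameter values, titles containing spaces or non-ASCII characters should survive the query string.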

CLI Usage

Export

~$ pagesaver export -h
Usage: pagesaver export [OPTIONS] URL

  Export page to the output file

Options:
  -f, --format [MHTML,PDF]  Format which you want to export  [required]
  -o, --output DIRECTORY    Output directory of the file  [required]
  -n, --name TEXT           Name of the exported file  [default: exported]
  -h, --help                Show this message and exit.
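
For scripted use, the export command can be driven from Python through subprocess. The sketch below only uses the options listed in the help output above. The helper names build_export_command and export_page are hypothetical, and the comma-separated format value follows the CLI example in Quick Start.

```python
import subprocess


def build_export_command(url, output_dir, formats=("MHTML", "PDF"), name=None):
    """Return the argv list for `pagesaver export` (hypothetical helper)."""
    cmd = ["pagesaver", "export", url,
           "-o", output_dir,
           "-f", ",".join(formats)]
    # -n is optional; the CLI falls back to its own default name.
    if name is not None:
        cmd += ["-n", name]
    return cmd


def export_page(url, output_dir, **kwargs):
    """Run the export; requires pagesaver to be installed and on PATH."""
    # check=True raises CalledProcessError on a non-zero exit status.
    return subprocess.run(build_export_command(url, output_dir, **kwargs),
                          check=True)


cmd = build_export_command("https://www.baidu.com", ".")
print(" ".join(cmd))
```

Passing the arguments as a list avoids shell quoting problems with URLs that contain characters such as & or ?.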

Server

~$ pagesaver init
~$ pagesaver server -h
Usage: pagesaver server [OPTIONS]

  Run PageSaver HTTP server

Options:
  -h, --help       Show this message and exit.
  -b, --bind TEXT  The TCP host/address to bind to.  [default: 0.0.0.0:8001]

Configuration

PageSaver will read the configuration from config.py automatically.

STORAGE

  • type: storage type. The only currently supported value is "local".
  • path: storage path. This is only used when type is set to "local".

SERVER_BIND

The TCP host/address to bind to.

Default: 0.0.0.0:8001

TITLE_PROPERTY

The property name in Notion to use for the title of a page.

Default: title

LINK_PROPERTY

The property name in Notion to use for the link of a page.

Default: link

MHTML_PROPERTY

The property name in Notion to use for the MHTML file of a page.

Default: mhtml
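
Putting the settings above together, a config.py might look roughly like the following. This is a sketch under assumptions: the source names the type and path keys of STORAGE but does not show its exact structure, so the dictionary form and the "./storage" path are illustrative guesses. The other values are the documented defaults.

```python
# config.py (illustrative sketch, not the project's shipped configuration)

# Assumption: STORAGE is a dict holding the documented `type` and `path` keys.
STORAGE = {
    "type": "local",      # the only storage type currently supported
    "path": "./storage",  # hypothetical directory, used when type is "local"
}

# The TCP host/address to bind to (documented default).
SERVER_BIND = "0.0.0.0:8001"

# Notion property names (documented defaults).
TITLE_PROPERTY = "title"
LINK_PROPERTY = "link"
MHTML_PROPERTY = "mhtml"
```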

License

GNU General Public License v3.0
