
Instapaper Scraper


A powerful and reliable Python tool to automate the export of all your saved Instapaper bookmarks into various formats, giving you full ownership of your data.

✨ Features

  • Scrapes all bookmarks from your Instapaper account.
  • Supports scraping from specific folders.
  • Exports data to CSV, JSON, or a SQLite database.
  • Securely stores your session for future runs.
  • Modern, modular, and tested architecture.

🚀 Getting Started

📋 1. Requirements

  • Python 3.9+

📦 2. Installation

This package is available on PyPI and can be installed with pip:

pip install instapaper-scraper

💻 3. Usage

Run the tool from the command line, specifying your desired output format:

# Scrape and export to the default CSV format
instapaper-scraper

# Scrape and export to JSON
instapaper-scraper --format json

# Scrape and export to a SQLite database with a custom name
instapaper-scraper --format sqlite --output my_articles.db

⚙️ Configuration

🔐 Authentication

The script authenticates using one of the following methods, in order of priority:

  1. Command-line Arguments: Provide your username and password directly when running the script:

    instapaper-scraper --username your_username --password your_password
    
  2. Session Files (.session_key, .instapaper_session): The script attempts to load these files in the following order:
     a. The path specified by the --session-file or --key-file arguments.
     b. Files in the current working directory (e.g., ./.session_key).
     c. Files in the user's configuration directory (~/.config/instapaper-scraper/).
     After the first successful login, the script creates an encrypted .instapaper_session file and a .session_key file to reuse your session securely.

  3. Interactive Prompt: If no other method is available, the script will prompt you for your username and password.

Note on Security: Your session file (.instapaper_session) and the encryption key (.session_key) are stored with secure permissions (read/write for the owner only) to protect your credentials.
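The owner-only permission scheme described above can be sketched in Python. This is a hypothetical helper (`write_secret` is not part of the tool's actual code); it shows the general technique of creating a file with 0o600 permissions from the start rather than restricting them afterwards:

```python
import os

def write_secret(path: str, data: bytes) -> None:
    """Write data to path so only the owner can read or write it (0o600)."""
    # Create (or truncate) the file with restrictive permissions up front,
    # rather than chmod-ing after the fact, to avoid a brief exposure window.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
```

A session key or encrypted session blob written this way is unreadable by other users on the same machine (on POSIX systems; Windows uses a different permission model).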

📁 Folder Configuration

You can define and quickly access your Instapaper folders using a config.toml file. The scraper will look for this file in the following locations (in order of precedence):

  1. The path specified by the --config-path argument.
  2. config.toml in the current working directory.
  3. ~/.config/instapaper-scraper/config.toml
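The precedence order above can be expressed as a first-match search. This is a minimal sketch (the function name `find_config` is an assumption, not the tool's actual API):

```python
from pathlib import Path
from typing import Optional

def find_config(cli_path: Optional[str] = None) -> Optional[Path]:
    """Return the first config.toml that exists, following the precedence above."""
    candidates = []
    if cli_path:  # 1. explicit --config-path argument
        candidates.append(Path(cli_path))
    # 2. current working directory
    candidates.append(Path.cwd() / "config.toml")
    # 3. user configuration directory
    candidates.append(Path.home() / ".config" / "instapaper-scraper" / "config.toml")
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```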

Here is an example of config.toml:

# Default output filename for non-folder mode
output_filename = "home-articles.csv"

[[folders]]
key = "ml"
id = "1234567"
slug = "machine-learning"
output_filename = "ml-articles.json"

[[folders]]
key = "python"
id = "7654321"
slug = "python-programming"
output_filename = "python-articles.db"
  • output_filename (top-level): The default output filename to use when not in folder mode.
  • key: A short alias for the folder.
  • id: The folder ID from the Instapaper URL.
  • slug: The human-readable part of the folder URL.
  • output_filename (folder-specific): A preset output filename for scraped articles from this specific folder.

When a config.toml file is present and no --folder argument is provided, the scraper will prompt you to select a folder. You can also specify a folder directly using the --folder argument with its key, ID, or slug. Use --folder=none to explicitly disable folder mode and scrape all articles.
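Matching a --folder value against key, ID, or slug could look like the following sketch (`resolve_folder` is a hypothetical name; the input is the parsed [[folders]] list from config.toml):

```python
from typing import Optional

def resolve_folder(folders: list, value: str) -> Optional[dict]:
    """Return the first folder whose key, id, or slug matches value, else None."""
    for folder in folders:
        if value in (folder.get("key"), folder.get("id"), folder.get("slug")):
            return folder
    return None
```

With the example config above, `resolve_folder(folders, "ml")`, `resolve_folder(folders, "1234567")`, and `resolve_folder(folders, "machine-learning")` would all select the same folder entry.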

💻 Command-line Arguments

  • --config-path <path>: Path to the configuration file. By default, searches config.toml in the current directory and ~/.config/instapaper-scraper/config.toml.
  • --folder <value>: Specify a folder by key, ID, or slug from your config.toml. Requires a configuration file to be loaded. Use none to explicitly disable folder mode. If this option is used (and not set to none) but no configuration file can be found or loaded, the program will exit.
  • --format <format>: Output format (csv, json, sqlite). Default: csv.
  • --output <filename>: Specify a custom output filename. The file extension is automatically corrected to match the selected format.
  • --username <user>: Your Instapaper account username.
  • --password <pass>: Your Instapaper account password.
  • --add-instapaper-url: Adds an instapaper_url column to the output, containing a full, clickable URL for each article.

📄 Output Formats

You can control the output format using the --format argument. The supported formats are:

  • csv (default): Exports data to output/bookmarks.csv.
  • json: Exports data to output/bookmarks.json.
  • sqlite: Exports data to an articles table in output/bookmarks.db.

If the --format flag is omitted, the script will default to csv.

When using --output <filename>, the file extension is automatically corrected to match the chosen format. For example, instapaper-scraper --format json --output my_articles.txt will create my_articles.json.
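The extension-correction behavior can be sketched with pathlib's `with_suffix` (the function name `correct_extension` and the format-to-extension mapping are assumptions for illustration):

```python
from pathlib import Path

# Assumed mapping from --format value to file extension.
FORMAT_EXTENSIONS = {"csv": ".csv", "json": ".json", "sqlite": ".db"}

def correct_extension(filename: str, fmt: str) -> str:
    """Replace (or add) the file extension so it matches the chosen format."""
    return str(Path(filename).with_suffix(FORMAT_EXTENSIONS[fmt]))
```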

📖 Opening Articles in Instapaper

The output data includes a unique id for each article. You can use this ID to construct a URL to the article's reader view: https://www.instapaper.com/read/<article_id>.
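Constructing the reader URL from an id is a simple string interpolation, for example:

```python
def reader_url(article_id: str) -> str:
    """Build the Instapaper reader-view URL for an article id."""
    return f"https://www.instapaper.com/read/{article_id}"
```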

For convenience, you can use the --add-instapaper-url flag to have the script include a full, clickable URL in the output.

instapaper-scraper --add-instapaper-url

This adds an instapaper_url field to each article in the JSON output and an instapaper_url column to the CSV and SQLite outputs. The original id field is preserved.

🛠️ How It Works

The tool is designed with a modular architecture for reliability and maintainability.

  1. Authentication: The InstapaperAuthenticator handles secure login and session management.
  2. Scraping: The InstapaperClient iterates through all pages of your bookmarks, fetching the metadata for each article with robust error handling and retries. Shared constants, like the Instapaper base URL, are managed through src/instapaper_scraper/constants.py.
  3. Data Collection: All fetched articles are aggregated into a single list.
  4. Export: Finally, the collected data is written to a file in your chosen format (.csv, .json, or .db).
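Steps 2 and 3 above amount to paginated fetching with retries followed by aggregation. The tool's internal code is not reproduced here; this is a generic sketch of the pattern, where `fetch_page` stands in for whatever function retrieves one page of bookmark metadata:

```python
import time
from typing import Callable

def fetch_all_pages(fetch_page: Callable[[int], list], max_retries: int = 3) -> list:
    """Aggregate articles from consecutive pages until an empty page is returned.

    fetch_page(page_number) is a caller-supplied function that returns a list
    of article dicts, or raises on a transient error.
    """
    articles = []
    page = 1
    while True:
        for attempt in range(max_retries):
            try:
                batch = fetch_page(page)
                break
            except Exception:
                if attempt == max_retries - 1:
                    raise  # give up after the final retry
                time.sleep(2 ** attempt)  # simple exponential backoff
        if not batch:
            return articles  # an empty page signals the end of the list
        articles.extend(batch)
        page += 1
```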

📊 Example Output

📄 CSV (output/bookmarks.csv) (with --add-instapaper-url)

"id","instapaper_url","title","url"
"999901234","https://www.instapaper.com/read/999901234","Article 1","https://www.example.com/page-1/"
"999002345","https://www.instapaper.com/read/999002345","Article 2","https://www.example.com/page-2/"

📄 JSON (output/bookmarks.json) (with --add-instapaper-url)

[
    {
        "id": "999901234",
        "title": "Article 1",
        "url": "https://www.example.com/page-1/",
        "instapaper_url": "https://www.instapaper.com/read/999901234"
    },
    {
        "id": "999002345",
        "title": "Article 2",
        "url": "https://www.example.com/page-2/",
        "instapaper_url": "https://www.instapaper.com/read/999002345"
    }
]

🗄️ SQLite (output/bookmarks.db)

A SQLite database file is created with an articles table. The table includes id, title, and url columns. If the --add-instapaper-url flag is used, an instapaper_url column is also included. This feature is fully backward-compatible and will automatically adapt to the user's installed SQLite version, using an efficient generated column on modern versions (3.31.0+) and a fallback for older versions.

🤗 Support and Community

  • 🐛 Bug Reports: For any bugs or unexpected behavior, please open an issue on GitHub.
  • 💬 Questions & General Discussion: For questions, feature requests, or general discussion, please use our GitHub Discussions.

🙏 Support the Project

Instapaper Scraper is a free and open-source project that requires significant time and effort to maintain and improve. If you find this tool useful, please consider supporting its development. Your contribution helps ensure the project stays healthy, active, and continuously updated.

  • Sponsor on GitHub: The best way to support the project with recurring monthly donations. Tiers with special rewards like priority support are available!
  • Buy Me a Coffee: Perfect for a one-time thank you.

🤝 Contributing

Contributions are welcome! Whether it's a bug fix, a new feature, or documentation improvements, please feel free to open a pull request.

Please read the Contribution Guidelines before you start.

🧑‍💻 Development & Testing

This project uses pytest for testing, ruff for code formatting and linting, and mypy for static type checking.

🔧 Setup

To install the development dependencies:

pip install -e ".[dev]"

To set up the pre-commit hooks:

pre-commit install

▶️ Running the Scraper

To run the scraper directly without installing the package:

python -m src.instapaper_scraper.cli

✅ Testing

To run the tests, execute the following command from the project root:

pytest

To check test coverage:

pytest --cov=src/instapaper_scraper --cov-report=term-missing

✨ Code Quality

To format the code with ruff:

ruff format .

To check for linting errors with ruff:

ruff check .

To automatically fix linting errors:

ruff check . --fix

To run static type checking with mypy:

mypy src

To run license checks:

licensecheck --show-only-failing

📜 Disclaimer

This script requires valid Instapaper credentials. Use it responsibly and in accordance with Instapaper’s Terms of Service.

📄 License

This project is licensed under the terms of the GNU General Public License v3.0. See the LICENSE file for the full license text.

Contributors

Made with contrib.rocks.
