Makes a complete archive of imageboard threads including images, HTML, and JSON.
BASC Archiver
The BASC Archiver is a Python library (packaged with the thread-archiver script) used to archive imageboard threads. It uses the 4chan API with the py4chan wrapper. Developers are free to use the BASC-Archiver library in third-party applications, as it is licensed under the LGPLv3.
It comes with a command-line interface for archiving threads, the thread-archiver; a GUI is under development.
The thread-archiver is designed to archive all content from a 4chan thread:
- Download all images and/or thumbnails in given threads.
- (NEW!) Download all child threads (4chan threads referred to in a post).
- Download a JSON dump of thread comments using the 4chan API.
- Download the HTML page.
- Convert links in HTML to use the downloaded images.
- Download CSS/JS and convert HTML to use them.
- Keep downloading until 404 (with a user-set delay).
- Can be restarted at any time.
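To make the image and thumbnail step concrete, here is a minimal sketch of how file URLs could be derived from a thread's JSON dump. It is not the archiver's own code: the helper name `media_urls` is hypothetical, and the `tim`/`ext` post fields and the `i.4cdn.org` URL layout are assumptions based on the public 4chan API documentation.

```python
def media_urls(board, thread, images=True, thumbs=True):
    """Return image and/or thumbnail URLs for every post that carries a file.

    Assumption: posts with an attachment have "tim" (server-side file name)
    and "ext" (file extension) fields, as in the public 4chan API docs.
    """
    urls = []
    for post in thread.get("posts", []):
        if "tim" not in post:
            # Text-only post: nothing to download.
            continue
        if images:
            urls.append("https://i.4cdn.org/{}/{}{}".format(board, post["tim"], post["ext"]))
        if thumbs:
            # Assumption: thumbnails are always JPEGs named "<tim>s.jpg".
            urls.append("https://i.4cdn.org/{}/{}s.jpg".format(board, post["tim"]))
    return urls


# Made-up sample data shaped like a thread JSON dump.
sample = {"posts": [
    {"no": 1, "tim": 1400000000000, "ext": ".png"},
    {"no": 2},
]}
print(media_urls("etc", sample))
```

Passing `images=False` or `thumbs=False` mirrors the idea behind the --thumbsonly and --nothumbs options.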
The thread-archiver replaces the typical “Right-click Save As, Web Page Complete” action, which does not save full-sized images or JSON. It works as a guerrilla, static HTML alternative to Fuuka.
Usage
Usage:
  thread-archiver <url>... [options]
  thread-archiver -h | --help
  thread-archiver -v | --version

Options:
  --path=<string>            Path to folder where archives will be saved [default: ./archive]
  --runonce                  Downloads the thread as it is presently, then exits
  --delay=<float>            Delay between file downloads [default: 0]
  --poll-delay=<float>       Delay between thread checks [default: 20]
  --nothumbs                 Don't download thumbnails
  --thumbsonly               Download thumbnails, no images
  --ssl                      Download using HTTPS
  --follow-children          Follow threads linked in downloaded threads
  --follow-to-other-boards   Follow linked threads, even if from other boards
  --silent                   Suppresses mundane printouts, prints what's important
  --verbose                  Printout more information than normal
  -h --help                  Show help
  -v --version               Show version
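The interaction between --poll-delay, --runonce, and the "keep downloading until 404" behaviour can be illustrated with a short Python 3 sketch. This is an illustration under stated assumptions, not the archiver's implementation: `fetch_thread` and `poll_thread` are hypothetical names, and a real archiver would save the page and download new files where the comment indicates.

```python
import time
import urllib.error
import urllib.request


def fetch_thread(url):
    """Return the response body, or None once the thread has 404'd."""
    try:
        with urllib.request.urlopen(url) as response:
            return response.read()
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None
        raise


def poll_thread(url, poll_delay=20.0, run_once=False,
                fetch=fetch_thread, sleep=time.sleep):
    """Check a thread repeatedly until it 404s.

    Returns the number of successful checks. `fetch` and `sleep` are
    injectable so the loop can be exercised without network access.
    """
    checks = 0
    while True:
        body = fetch(url)
        if body is None:
            # Thread is gone: stop polling.
            break
        checks += 1
        # A real archiver would save `body` and download new files here.
        if run_once:
            break
        sleep(poll_delay)
    return checks
```

With `run_once=True` the loop performs a single check and returns, which is the behaviour the --runonce option describes.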
Example
thread-archiver http://boards.4chan.org/b/res/423861837 --delay 5 --thumbsonly
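A thread URL like the one above carries the board name and the thread number. The sketch below shows one way to extract them in Python; it is an assumption-laden illustration rather than the library's parser. The names `parse_thread_url` and `api_url` are hypothetical, and the `a.4cdn.org` JSON endpoint layout is taken from the public 4chan API documentation.

```python
import re

# Accepts both the /res/ form and the /thread/ form used in this document.
THREAD_URL = re.compile(
    r"^https?://boards\.4chan\.org/(?P<board>\w+)/(?:res|thread)/(?P<thread_id>\d+)"
)


def parse_thread_url(url):
    """Split a thread URL into (board, thread_id)."""
    match = THREAD_URL.match(url)
    if match is None:
        raise ValueError("not a recognised thread URL: {}".format(url))
    return match.group("board"), int(match.group("thread_id"))


def api_url(board, thread_id):
    """Build the JSON endpoint for a thread (assumed layout, see above)."""
    return "https://a.4cdn.org/{}/thread/{}.json".format(board, thread_id)


board, thread_id = parse_thread_url("http://boards.4chan.org/b/res/423861837")
print(board, thread_id, api_url(board, thread_id))
```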
Installation
The BASC-Archiver works on both Python 2.x and 3.x, and can be installed on Windows, Linux, or Mac OS X.
New stable releases can be found on our Releases page, or installed with the PyPI package BASC-Archiver.
Linux and OS X
Make sure you have Python installed.
Run pip install basc-archiver (or easy_install basc-archiver on older setups)
Run thread-archiver http://boards.4chan.org/etc/thread/12345
Threads will be saved in ./archive, but you can change that by supplying a directory with the --path= argument.
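If the thread-archiver is driven from another Python script, the command line can be assembled programmatically. The helper below is a hypothetical sketch (`build_command` is not part of BASC-Archiver); it only uses options listed in the usage text and assumes the thread-archiver script is on the PATH.

```python
def build_command(urls, path=None, delay=None, thumbs_only=False, run_once=False):
    """Assemble an argument list for the thread-archiver script."""
    command = ["thread-archiver"] + list(urls)
    if path is not None:
        command.append("--path={}".format(path))
    if delay is not None:
        command.append("--delay={}".format(delay))
    if thumbs_only:
        command.append("--thumbsonly")
    if run_once:
        command.append("--runonce")
    return command


command = build_command(
    ["http://boards.4chan.org/etc/thread/12345"],
    path="/tmp/archive",
    run_once=True,
)
print(" ".join(command))
# To actually run it: subprocess.call(command)
```

Building a list rather than a single string avoids shell quoting problems when the archive path contains spaces.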
Windows
Download the latest release from our Releases page.
Open a Command Prompt window (cmd.exe) and change to the directory containing thread-archiver.exe.
Run thread-archiver.exe http://boards.4chan.org/etc/thread/12345
Using the Windows version will become simpler once we finish writing the GUI.
License
BASC-Archiver is licensed under the GNU Lesser General Public License v3.
Source Distribution
File details
Details for the file BASC-Archiver-0.8.4.tar.gz.
File metadata
- Download URL: BASC-Archiver-0.8.4.tar.gz
- Upload date:
- Size: 8.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | 55ee41e0da2391d5e15ca425669d7fb22091c7f4d5de5c0a6475c8b4bacf9e0c
MD5 | cb30ed29932814adc6c594d6efaf3041
BLAKE2b-256 | e6c2140395b4076e8a4414285ac11308c16d383192be759a5b45adc6a6dbecb1