Library for working with works from Projekt Runeberg (Runeberg.org).
runeberg 

A library and command line application for downloading and parsing works from Projekt Runeberg.
Installation
You can install runeberg from PyPI:
pip install runeberg
It is supported on Python 3.6 and above.
Usage as a command line application
After installing runeberg, simply call the program to get a paged list of
works to download, then follow the prompts to download (and unpack) the files.
$ runeberg
1. "Det Ringer!" Skämt i en akt (1902) by Helena Nyblom [sv]
2. "Då sa' kungen..." : Kungliga anekdoter under hundra år (1946) by ? [sv]
3. "Pastoralier" (1899) by August Olsson [sv]
4. "The Ripper" (uppskäraren) (1892) by Adolf Paul [sv]
5. 100 Præstehistorier eller Præstestandens lyse og mørke Sider (1893) by Nils Poulsen [no]
6. 14 Descriptive Pieces for the Young for Piano (1895) by Sveinbjörn Sveinbjörnsson [en]
7. 14 sovjetryska berättare : valda och översatta från ryskan (1929) by ? [sv]
8. 16 år med Roald Amundsen. Fra Pol til Pol (1930) by Oscar Wisting [no]
9. 1720, 1772, 1809 (1836) by Magnus Crusenstolpe [sv]
…
What do you want to do? [1–25] to download, [N]ext 25, [Q]uit: █
Use the -a flag to start from a list of authors instead. Selecting an author
presents a list of works filtered to that author:
$ runeberg -a
1. Ülev Aaloe (1944) [ee]
2. Simon Aberstén (1865–1937) [se]
3. Selma Abrahamsson (1872–1911) [fi]
4. Arthur Dyke Acland (1847–1926) [uk]
5. Adam Bremensis (1044–1080) [de]
6. Gertrud Adelborg (1853–1942) [se]
7. Ottilia Adelborg (1855–1936) [se]
8. Gudmund Jöran Adlerbeth (1751–1818) [se]
9. Gustav Magnus Adlercreutz (1775–1845) [se]
…
What do you want to do? [1–25] to display their works, [N]ext 25, [Q]uit: 6
Displaying works by Gertrud Adelborg [uid=adelbger]…
1. Några drag af de till Danmark utvandrade allmogeflickornas ställning och arbetsförhållanden (1890) by Gertrud Adelborg [sv]
2. Några upplysningar angående de svenska allmogeflickornas utvandring till Danmark (1893) by Gertrud Adelborg [sv]
What do you want to do? [1–2] to download, [Q]uit: █
Use the -h flag to see a full list of options and filters.
Usage as a library
First determine the identifier of the work you wish to download. For example, for
http://runeberg.org/aldrigilif/ the <uid> would be aldrigilif.
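If you start from a URL rather than an identifier, the lookup can be sketched as below. This is a minimal illustration and not part of the runeberg library: the helper name uid_from_url is hypothetical, and it assumes the <uid> is always the first path segment of the work's URL, as in the example above.

```python
from urllib.parse import urlparse


def uid_from_url(url: str) -> str:
    """Return the first path segment of a runeberg.org work URL.

    Assumption: the work identifier is the first non-empty path segment.
    """
    segments = [part for part in urlparse(url).path.split("/") if part]
    if not segments:
        raise ValueError(f"No work identifier found in URL: {url!r}")
    return segments[0]


if __name__ == "__main__":
    print(uid_from_url("http://runeberg.org/aldrigilif/"))
```

The returned string can then be passed wherever <uid> appears in the examples that follow.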
# Download and unpack a work from runeberg.org:
# this will by default download the work to /downloaded_data/<uid>/
import runeberg.download as downloader
downloader.get_work('<uid>')
# A warning is raised if additional colour images are found; these are not unpacked.
# Parse the downloaded work:
# from the parsed work you can access individual pages, articles/chapters along
# with any metadata
import runeberg
parsed_work = runeberg.Work.from_files('<uid>')
# Create a DjVu file of the work
print(parsed_work.to_djvu()) # outputs the path to the created file
Caveats
Some of the metadata files are encoded in Windows-1252 rather than the
default latin-1. The framework does not currently detect this. If you
encounter such a file, some characters may be misinterpreted and you must
manually re-encode the file before parsing the work.
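One way to do the manual re-encoding is sketched below. It is an illustration under stated assumptions, not something the library provides: the function names and the fallback table are hypothetical, and the path of the affected metadata file is left for you to supply. Characters that exist in Windows-1252 but not in latin-1 (such as dashes and curly quotes) are mapped to close substitutes, and anything else that cannot be represented becomes "?".

```python
from pathlib import Path

# Illustrative, non-exhaustive map of Windows-1252 characters that have no
# latin-1 equivalent, paired with a close latin-1 substitute.
CP1252_FALLBACKS = str.maketrans({
    "\u2013": "-",    # en dash
    "\u2014": "-",    # em dash
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2026": "...",  # horizontal ellipsis
})


def cp1252_to_latin1(data: bytes) -> bytes:
    """Decode Windows-1252 bytes and re-encode them as latin-1."""
    text = data.decode("cp1252")
    # Unmapped characters outside latin-1 are replaced with "?".
    return text.translate(CP1252_FALLBACKS).encode("latin-1", errors="replace")


def reencode_file(path) -> None:
    """Rewrite a file in place; keep a backup copy before running this."""
    path = Path(path)
    path.write_bytes(cp1252_to_latin1(path.read_bytes()))
```

Run reencode_file on the affected metadata file only. Applying it to a file that is already latin-1 is not guaranteed to be harmless, because bytes in the 0x80 to 0x9F range would be reinterpreted.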
If the originally scanned images were .jpg then the downloaded "colour
images" will just be a second identical copy of these.
Requirements
For DjVu conversion DjVuLibre must be installed.
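Before attempting a conversion you may want to check that DjVuLibre is on your PATH. The sketch below is an assumption-laden example: the tool names listed are common DjVuLibre command line programs, but this README does not state which of them runeberg actually invokes, and the helper name missing_tools is hypothetical.

```python
import shutil

# Assumption: common DjVuLibre executables. Which ones runeberg calls is not
# documented here, so adjust this list to your needs.
DJVULIBRE_TOOLS = ("djvm", "cjb2", "c44", "djvused")


def missing_tools(tools=DJVULIBRE_TOOLS):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All listed DjVuLibre tools were found.")
```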
Change log
0.0.2
- [Breaking] Rename ocr property of Page as text.
- Introduce text property to Work and Article.
- Re-use djvu file generated by earlier run. Add force argument to avoid reuse.
- Parse the IMAGE_SOURCE metadata.
- Expand testing to py37, py38
0.0.1
- Initial PyPI release.