
A Django book reading application that utilizes Djatoka and a DSpace repository to present browseable views of scanned books.


Introduction

This is a book-reading UI application for items deposited in a DSpace repository that has the OAI-PMH repository views enabled.

Using the bookreader

Installation

Package

Install the bookreader Python package by doing one of the following:

  • Add it to your list of eggs in buildout.cfg and run bin/buildout
  • Run easy_install bookreader to install it globally
  • Download the source code, unpack it, and run python setup.py install

Django

In your project’s settings.py, add bookreader to the list of INSTALLED_APPS and run manage.py syncdb to add the tables to the database. Then include bookreader.urls in your URL patterns. Finally, set DJATOKA_BASE_URL in your settings file to the base URL of your Djatoka server.
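As a rough illustration of the settings described above, a fragment might look like the following. The Djatoka URL is a placeholder and the list of other installed apps is an assumption (the admin and its usual dependencies), not something this package prescribes.

```python
# settings.py (fragment): a minimal sketch, not a complete settings file.

INSTALLED_APPS = (
    # Assumed: the admin and the contrib apps it normally depends on,
    # since the bookreader is administered through the Django admin.
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.admin',
    'bookreader',
)

# Base URL of your Djatoka server (placeholder value, replace with your own).
DJATOKA_BASE_URL = 'http://djatoka.example.org/adore-djatoka/'

# urls.py would additionally include 'bookreader.urls' in its URL patterns;
# the exact import for include() depends on the Django version in use.
```

After editing the settings, manage.py syncdb creates the bookreader tables as described above.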

Administration

The bookreader uses the Django admin interface to manage its data. First, you must add a Repository. The only input required is the OAI-PMH server URL of your DSpace repository. Using that URL, Django will connect to the server and query for its name. Collections can then be added by DSpace handle. Once a handle is entered, Django will query the server to confirm that the handle is valid and to retrieve the name of the collection. At this point, it will also query for all books (items) in the collection and harvest the metadata and pages for those books. This automatic behavior can be disabled by setting BOOKREADER_SIGNALS_ENABLED to False in your settings.
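To show what "query for its name" amounts to at the protocol level, here is a minimal sketch of an OAI-PMH Identify request and of reading the repository name from the response. It uses only the Python standard library and a made-up sample response. The function names are hypothetical and this is not the bookreader's own harvesting code.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace used by OAI-PMH 2.0 response documents.
OAI_NS = {'oai': 'http://www.openarchives.org/OAI/2.0/'}


def identify_url(oai_url):
    """Build the OAI-PMH Identify request URL for a repository."""
    return oai_url + '?' + urlencode({'verb': 'Identify'})


def repository_name(identify_xml):
    """Return the text of <repositoryName> from an Identify response."""
    root = ET.fromstring(identify_xml)
    return root.findtext('oai:Identify/oai:repositoryName', namespaces=OAI_NS)


# Made-up, heavily trimmed response used only for illustration.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <Identify>
    <repositoryName>Example DSpace Repository</repositoryName>
    <baseURL>http://repository.example.org/oai/request</baseURL>
  </Identify>
</OAI-PMH>
"""

print(identify_url('http://repository.example.org/oai/request'))
print(repository_name(SAMPLE_RESPONSE))
```

In a real deployment the response would be fetched from the URL built by identify_url rather than read from a string.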

DSpace Requirements

Item and Bitstream Layout

For the use case of this Django UI, books are DSpace items and pages are bitstreams attached to those book items. The page bitstreams must be JPEG2000 files in the ‘ORIGINAL’ or main content bundle. Thumbnails can be provided for each page by adding a JPEG file to a ‘THUMBNAILS’ bundle with the same base filename as the original page bitstream. For example, if the page bitstream filename is tamu_pl_0001.jpf, then the thumbnail filename must end with tamu_pl_0001.jpg. A thumbnail named thumbs/tamu_pl_0001.jpg would therefore also be acceptable.
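The thumbnail naming rule can be expressed as a small check. This is a sketch of the rule as stated here (the thumbnail name must end with the page's base filename plus .jpg), with a hypothetical function name. It is not taken from the bookreader source.

```python
import posixpath


def thumbnail_matches(page_filename, thumbnail_filename):
    """Return True if the thumbnail name ends with the page's base name + '.jpg'."""
    # Strip any directory part and the extension from the page filename.
    base, _ext = posixpath.splitext(posixpath.basename(page_filename))
    return thumbnail_filename.endswith(base + '.jpg')


print(thumbnail_matches('tamu_pl_0001.jpf', 'tamu_pl_0001.jpg'))
print(thumbnail_matches('tamu_pl_0001.jpf', 'thumbs/tamu_pl_0001.jpg'))
print(thumbnail_matches('tamu_pl_0001.jpf', 'tamu_pl_0002.jpg'))
```

The first two calls match and the third does not, mirroring the example filenames given above.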

OAI-PMH Repository

For the data to be harvested, the DSpace OAI-PMH server must have the ORE and DIM metadata prefixes enabled and compatible crosswalks installed for the books.
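One way to confirm this requirement is to inspect the server's response to the OAI-PMH ListMetadataFormats verb. The sketch below parses a made-up response and reports which required prefixes are absent. The lowercase prefix names ore and dim are an assumption about how the DSpace server advertises them, so check them against your own server's output.

```python
import xml.etree.ElementTree as ET

OAI_NS = {'oai': 'http://www.openarchives.org/OAI/2.0/'}

# Assumed spelling of the prefixes as advertised by the server.
REQUIRED_PREFIXES = {'ore', 'dim'}


def missing_prefixes(list_formats_xml):
    """Return the required metadata prefixes that the response does not list."""
    root = ET.fromstring(list_formats_xml)
    path = 'oai:ListMetadataFormats/oai:metadataFormat/oai:metadataPrefix'
    found = {(el.text or '').strip() for el in root.iterfind(path, OAI_NS)}
    return REQUIRED_PREFIXES - found


# Made-up response in which the DIM prefix has not been enabled.
SAMPLE_FORMATS = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListMetadataFormats>
    <metadataFormat><metadataPrefix>oai_dc</metadataPrefix></metadataFormat>
    <metadataFormat><metadataPrefix>ore</metadataPrefix></metadataFormat>
  </ListMetadataFormats>
</OAI-PMH>
"""

print(missing_prefixes(SAMPLE_FORMATS))
```

An empty result means both prefixes are advertised. It does not by itself prove that the crosswalks produce usable records for the books.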

Metadata Fields

DIM data

The DIM metadata prefix is used to gather the book (item) metadata. The fields are harvested using NestedMetadataReader, an extension of the pyoai MetadataReader provided by the Python dspace library. XPath evaluators are used to map XML elements to fields. For the current mapping, see bookreader.harvesting.metadata.dim_reader.
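The idea of mapping XML elements to fields with XPath can be sketched with the standard library alone. The field names and paths below are hypothetical and chosen for illustration. They are not the mapping in bookreader.harvesting.metadata.dim_reader, and the code does not use pyoai or NestedMetadataReader.

```python
import xml.etree.ElementTree as ET

# Namespace of DSpace Intermediate Metadata (DIM) documents.
DIM_NS = {'dim': 'http://www.dspace.org/xmlns/dspace/dim'}

# Hypothetical mapping of field names to XPath-style expressions.
FIELD_PATHS = {
    'title': "dim:field[@mdschema='dc'][@element='title']",
    'creator': "dim:field[@mdschema='dc'][@element='contributor'][@qualifier='author']",
    'issued': "dim:field[@mdschema='dc'][@element='date'][@qualifier='issued']",
}


def read_dim(dim_xml):
    """Return a dict mapping each field name to the list of matching values."""
    root = ET.fromstring(dim_xml)
    return {
        name: [el.text for el in root.findall(path, DIM_NS)]
        for name, path in FIELD_PATHS.items()
    }


# Made-up DIM record used only for illustration.
SAMPLE_DIM = """
<dim:dim xmlns:dim="http://www.dspace.org/xmlns/dspace/dim">
  <dim:field mdschema="dc" element="title">An Example Book</dim:field>
  <dim:field mdschema="dc" element="contributor" qualifier="author">Doe, Jane</dim:field>
  <dim:field mdschema="dc" element="date" qualifier="issued">1850</dim:field>
</dim:dim>
"""

print(read_dim(SAMPLE_DIM))
```

Each field maps to a list because a DIM record may repeat an element, for example when a book has several authors.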

ORE data

The ORE metadata prefix is used to gather the page and link (bitstream) metadata. The bitstream URL, title, and bundle are all gathered from the ORE XML document.
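The following sketch shows how a bitstream's URL, title, and bundle might be read from an ORE document. It assumes an Atom serialization in which aggregated bitstreams appear as atom:link elements and the bundle name appears as a dcterms:description inside an rdf:Description for the same URL. That layout is an assumption, so verify it against what your DSpace crosswalk actually emits. The sample document and the function name are made up.

```python
import xml.etree.ElementTree as ET

NS = {
    'atom': 'http://www.w3.org/2005/Atom',
    'oreatom': 'http://www.openarchives.org/ore/atom/',
    'rdf': 'http://www.w3.org/1999/02/22-rdf-syntax-ns#',
    'dcterms': 'http://purl.org/dc/terms/',
}
AGGREGATES = 'http://www.openarchives.org/ore/terms/aggregates'
RDF_ABOUT = '{http://www.w3.org/1999/02/22-rdf-syntax-ns#}about'


def read_bitstreams(ore_xml):
    """Return a list of dicts with the url, title and bundle of each bitstream."""
    root = ET.fromstring(ore_xml)
    # Assumed layout: bundle names are keyed by bitstream URL in the triples.
    bundles = {}
    for desc in root.iterfind('oreatom:triples/rdf:Description', NS):
        bundle = desc.findtext('dcterms:description', namespaces=NS)
        if bundle:
            bundles[desc.get(RDF_ABOUT)] = bundle
    bitstreams = []
    for link in root.iterfind('atom:link', NS):
        if link.get('rel') == AGGREGATES:
            url = link.get('href')
            bitstreams.append(
                {'url': url, 'title': link.get('title'), 'bundle': bundles.get(url)}
            )
    return bitstreams


# Made-up, heavily trimmed ORE document used only for illustration.
SAMPLE_ORE = """
<atom:entry xmlns:atom="http://www.w3.org/2005/Atom"
            xmlns:oreatom="http://www.openarchives.org/ore/atom/"
            xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
            xmlns:dcterms="http://purl.org/dc/terms/">
  <atom:link rel="http://www.openarchives.org/ore/terms/aggregates"
             href="http://repository.example.org/bitstream/1/2/tamu_pl_0001.jpf"
             title="tamu_pl_0001.jpf" type="image/jp2"/>
  <oreatom:triples>
    <rdf:Description rdf:about="http://repository.example.org/bitstream/1/2/tamu_pl_0001.jpf">
      <dcterms:description>ORIGINAL</dcterms:description>
    </rdf:Description>
  </oreatom:triples>
</atom:entry>
"""

print(read_bitstreams(SAMPLE_ORE))
```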

Canonical Items

Starting with version 0.3, a new canonical items requirement was added. The list of bitstreams is checked for an additional bitstream in the ‘METADATA’ bundle named bitstream_metadata.xml. This file is then parsed for the repository URL of a canonical version of the book, as well as additional metadata about the page bitstreams, so that missing pages can be marked for future reference or use. See the schema at docs/bitstream_metadata.xsd for details of the format.
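The lookup step described above can be sketched as a simple filter over the harvested bitstreams. The dictionary shape (title and bundle keys) and the function name are hypothetical. Parsing the file itself is not shown, because its format is defined by docs/bitstream_metadata.xsd.

```python
def find_bitstream_metadata(bitstreams):
    """Return the bitstream_metadata.xml entry from the METADATA bundle, or None."""
    for bitstream in bitstreams:
        if (bitstream.get('bundle') == 'METADATA'
                and bitstream.get('title') == 'bitstream_metadata.xml'):
            return bitstream
    return None


# Made-up bitstream list used only for illustration.
SAMPLE_BITSTREAMS = [
    {'title': 'tamu_pl_0001.jpf', 'bundle': 'ORIGINAL'},
    {'title': 'tamu_pl_0001.jpg', 'bundle': 'THUMBNAILS'},
    {'title': 'bitstream_metadata.xml', 'bundle': 'METADATA'},
]

print(find_bitstream_metadata(SAMPLE_BITSTREAMS))
```

When no such bitstream exists, the function returns None. The 0.7 release notes describe that case: all harvested pages are then treated as internal.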

History

1.0 (2011-07-07)

  • First official version
  • Compiled PDFs for frankenbooks and other copies
  • Bug and error fixes

0.9 (2011-03-29)

  • Switching to a ‘type’ field to support multiple book forms (canonical, franken, extant and work). South data and schema migrations provided.
  • Detailed pages view now supports a jp2 url in the request to select the current page
  • Added SWORD client configuration parameters to the repository
  • Switched URL to base_url and oai_path for OAI client to match the changes in the DSpace client
  • Added published flag to books
  • Added view to publish books over SWORD
  • Added ability to add books (except for extant books)

0.8 (2010-12-02)

  • Added South integration with the initial migration based on version 0.7
  • Added is_canonical attribute to accommodate feature requests
  • Updated CanonicalSelectionForm to use new is_canonical attribute
  • Tests for canonical flag in bitstream metadata files
  • Added a PageURLForm for editing only page urls
  • Added a URL for accessing the page editing view with the new PageURLForm
  • Added admin list_filter for collection when viewing books

0.7.1 (2010-11-15)

  • Bugfix: harvesting missing pages was creating too many pages

0.7 (2010-11-12)

  • Added a redirecting print view
  • Fixed print url to the new printable view
  • Reading view now enforces a more ‘reading’ like interaction: front cover, pages, then back cover
  • Cleaned up extra imports
  • Added sequence and internal fields to page form
  • Changed Exterior Page form to be page conversion form
  • Added external_views property to book objects
  • Changed edit-external url to be conversion url
  • Clarified harvesting code on missing books a bit
  • Added an exterior editor view, page adding view only sets page sequence if it isn’t in the incoming form
  • Harvesting pages without a bitstream_metadata.xml forces all pages to be internal
  • Modified the canonical selection form so that canonical link can be removed

0.6.1 (2010-11-02)

  • Bug fix for book pages view when no sequence/page set

0.6 (2010-11-02)

  • Added Exterior page form, Canonical selection form, and annotation form
  • Made canonical attribute of books be editable
  • Fixed holdover rtf->rft attribute bug in template tag
  • Added views/urls: book annotations, edit canonical, copy annotations, edit exterior page, add/edit/delete annotations
  • Added CSRF wrappers to posted views

0.5 (2010-10-27)

  • Added various editor views (edit pages, add page, edit page, order pages, delete page)
  • Added a page annotations view, fixed a bug with the page view
  • Added bitstream_metadata view to export bitstream_metadata files
  • Added a page form
  • Added necessary url configurations for new views
  • Added an Annotation model
  • Added an Internal flag to pages
  • Updates to harvesting and tests for newer bitstream_metadata format

0.4 (2010-09-29)

  • Add a setting for the url arguments for books to compare, BOOKREADER_COMPARISON_GET_ARGUMENT
  • Add a setting for the session key for storing books to compare, BOOKREADER_COMPARISON_SESSION_KEY
  • Add a setting for the template variable for the books to compare, BOOKREADER_COMPARISON_TEMPLATE_VARIABLE
  • Add a context processor for turning book IDs into books; it prefers GET arguments over the session variable for bookmarkability
  • Add template tags for adding/removing/retrieving get arguments for comparisons
  • Add views for adding to/removing from/clearing the comparison list
  • Add separate view for a comparison portlet
  • Updated the bitstream_metadata.xsd and the detailed page harvesting to match

0.3 (2010-09-14)

  • Added canonical field to Book model
  • Made jp2 url field on Page model optional (support for ‘missing’ pages)
  • Made title field on Page model optional (support for ‘missing’ pages)
  • Added python logging support with a default null handler
  • Added a parser for the bitstream metadata file
  • Added a custom lxml etree parser since the default is now to disable network
  • Switched loading pages and loading links signals to be on creation of books only

0.2 (2010-08-23)

  • Switching data model to one where books are items and the pages are just represented by bitstreams.
  • Harvesting of books in a collection
  • Harvesting of pages in a book
  • Signal for loading repository names from the repositories
  • Signal for loading collection name from the repository
  • Signal for loading books in a collection from the repository
  • Signal for loading book metadata from the repository
  • Signal for loading pages from the repository

0.1 (Unreleased)

  • Books and pages generated from Manakin views of a DSpace repository where pages are items and the bitstreams are the JPEG2000 files for the pages.
