A Django book reading application that utilizes Djatoka and a DSpace repository to present browseable views of scanned books.
This is a book reading UI application for items that have been deposited in a DSpace repository that has the OAI-PMH repository views enabled.
Using the bookreader
Install the bookreader Python package by doing one of the following:
Add it to your list of eggs in buildout.cfg and run bin/buildout
Run easy_install bookreader to install it globally
Download the source code, unpack it, and run python setup.py install
In your project’s settings.py, add bookreader to the list of INSTALLED_APPS and run manage.py syncdb to add the tables to the database. Then include bookreader.urls in your URL patterns. Finally, add the base URL of your Djatoka server to your settings file as DJATOKA_BASE_URL.
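As a rough sketch of the settings described above, a settings.py excerpt might look like the following. The list of other installed apps and the Djatoka address are placeholder assumptions, not values taken from this project; substitute your own.

```python
# settings.py (excerpt): a minimal sketch, adjust to your own project.

INSTALLED_APPS = (
    'django.contrib.admin',         # bookreader data is managed through the admin
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'bookreader',
)

# Base URL of your Djatoka server. Hypothetical placeholder value.
DJATOKA_BASE_URL = 'http://djatoka.example.org/'
```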
The bookreader uses the Django admin interface to manage its data. First, you must add a Repository. The only input required is the OAI-PMH server URL of your DSpace repository. Using that URL, Django will connect to the server and query for its name. Collections can then be added by DSpace handle. Once a handle is entered, Django will query the server to validate the handle and to look up the name of the collection. At this point, it will also query for all books (items) in the collection and harvest the metadata and pages for those books. This automatic behavior can be disabled by setting BOOKREADER_SIGNALS_ENABLED to False in your settings.
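To illustrate the repository name lookup, the sketch below reads the repositoryName element from an OAI-PMH Identify response. It is not the bookreader's own harvesting code: the function name and the sample response are made up for illustration, and the sample is parsed from a string rather than fetched from a live server.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 protocol.
OAI_NS = {'oai': 'http://www.openarchives.org/OAI/2.0/'}

# A trimmed-down Identify response (hypothetical repository).
SAMPLE_IDENTIFY = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <request verb="Identify">http://repository.example.org/oai/request</request>
  <Identify>
    <repositoryName>Example Repository</repositoryName>
    <baseURL>http://repository.example.org/oai/request</baseURL>
    <protocolVersion>2.0</protocolVersion>
  </Identify>
</OAI-PMH>
"""


def repository_name(identify_xml):
    """Return the repository name from an Identify response, or None."""
    root = ET.fromstring(identify_xml)
    name = root.find('oai:Identify/oai:repositoryName', OAI_NS)
    if name is None or not name.text:
        return None
    return name.text.strip()


print(repository_name(SAMPLE_IDENTIFY))
```

Against a real server, the same XML would come from a request to the OAI-PMH URL with verb=Identify.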
Item and Bitstream Layout
For the use case of this Django UI, books are DSpace items and pages are bitstreams attached to those book items. The page bitstreams must be JPEG2000 files in the ‘ORIGINAL’ or main content bundle. Thumbnails can be provided for each page by adding a JPEG file in a ‘THUMBNAILS’ bundle that has the same base filename as the original page bitstream. For example, if the page bitstream filename is tamu_pl_0001.jpf, then the thumbnail filename must end with tamu_pl_0001.jpg. In this case thumbs/tamu_pl_0001.jpg would also be acceptable.
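The thumbnail naming rule can be sketched as a small helper. This is an illustration of the rule as stated above, not the bookreader's actual matching code; find_thumbnail is a hypothetical name.

```python
import posixpath


def find_thumbnail(page_filename, thumbnail_names):
    """Return the first thumbnail name ending in '<page base name>.jpg'.

    Follows the rule literally: only the end of the thumbnail name is
    checked, so a prefix such as 'thumbs/' is allowed.
    """
    base, _ext = posixpath.splitext(posixpath.basename(page_filename))
    wanted = base + '.jpg'
    for name in thumbnail_names:
        if name.endswith(wanted):
            return name
    return None


thumbs = ['thumbs/tamu_pl_0002.jpg', 'thumbs/tamu_pl_0001.jpg']
print(find_thumbnail('tamu_pl_0001.jpf', thumbs))
```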
In order for the data to be harvested, the DSpace OAI-PMH server must have the ORE and DIM metadata prefixes enabled and compatible crosswalks installed for the books.
The DIM metadata prefix is used to gather the book (item) metadata. The fields are harvested using NestedMetadataReader, an extension of the pyoai MetadataReader provided by the Python dspace library. XPath evaluators are used to map XML elements to fields. For the current mapping, see bookreader.harvesting.metadata.dim_reader.
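The idea of mapping XML elements to fields with XPath can be sketched with the standard library alone. The mapping and sample record below are hypothetical and do not reproduce the mapping in bookreader.harvesting.metadata.dim_reader; the real reader is built on pyoai rather than on xml.etree.

```python
import xml.etree.ElementTree as ET

# Namespace used by DSpace for DIM records.
DIM_NS = {'dim': 'http://www.dspace.org/xmlns/dspace/dim'}

# Hypothetical field name -> XPath mapping, for illustration only.
FIELD_XPATHS = {
    'title': ".//dim:field[@element='title']",
    'creator': ".//dim:field[@element='creator']",
}

# A made-up DIM record.
SAMPLE_DIM = """
<dim:dim xmlns:dim="http://www.dspace.org/xmlns/dspace/dim">
  <dim:field mdschema="dc" element="title">An Example Book</dim:field>
  <dim:field mdschema="dc" element="creator">Doe, Jane</dim:field>
  <dim:field mdschema="dc" element="creator">Roe, Richard</dim:field>
</dim:dim>
"""


def read_fields(dim_xml):
    """Evaluate each XPath and collect the matching element texts per field."""
    root = ET.fromstring(dim_xml)
    return {
        field: [el.text for el in root.findall(xpath, DIM_NS)]
        for field, xpath in FIELD_XPATHS.items()
    }


print(read_fields(SAMPLE_DIM))
```

Each field maps to a list because a DIM record may repeat an element, as with the two creators in the sample.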
The ORE metadata prefix is used to gather the page and link (bitstream) metadata. The bitstream URL, title, and bundle are all gathered from the ORE XML document.
Starting with version 0.3, a new canonical items requirement was added. The list of bitstreams is checked for an additional bitstream in the ‘METADATA’ bundle named bitstream_metadata.xml. This file is then parsed for the repository URL of a canonical version of the book, as well as additional metadata about the page bitstreams so that missing pages can be marked for future reference or use. See the schema at docs/bitstream_metadata.xsd for details of the format.
First official version
Compiled PDFs for frankenbooks and other copies
Bug and error fixes
Switching to a ‘type’ field to support multiple book forms (canonical, franken, extant and work). South data and schema migrations provided.
Detailed pages view now supports a jp2 url in the request to select the current page
Added SWORD client configuration parameters to the repository
Switched URL to base_url and oai_path for OAI client to match the changes in the DSpace client
Added published flag to books
Added view to publish books over SWORD
Added ability to add books (except for extant books)
Added South integration with the initial migration based on version 0.7
Added is_canonical attribute to accommodate feature requests
Updated CanonicalSelectionForm to use new is_canonical attribute
Tests for canonical flag in bitstream metadata files
Added a PageURLForm for editing only page urls
Added a URL for accessing the page editing view with the new PageURLForm
Added admin list_filter for collection when viewing books
Bugfix: harvesting missing pages was creating too many pages
Added a redirecting print view
Fixed print url to the new printable view
Reading view now enforces a more ‘reading’ like interaction: front cover, pages, then back cover
Cleaned up extra imports
Added sequence and internal fields to page form
Changed Exterior Page form to be page conversion form
Added external_views property to book objects
Changed edit-external url to be conversion url
Clarified harvesting code on missing books a bit
Added an exterior editor view, page adding view only sets page sequence if it isn’t in the incoming form
Harvesting pages without a bitstream_metadata.xml forces all pages to be internal
Modified the canonical selection form so that canonical link can be removed
Bug fix for book pages view when no sequence/page set
Added Exterior page form, Canonical selection form, and annotation form
Made canonical attribute of books be editable
Fixed holdover rtf->rft attribute bug in template tag
Added views/urls: book annotations, edit canonical, copy annotations, edit exterior page, add/edit/delete annotations
Added CSRF wrappers to posted views
Added various editor views (edit pages, add page, edit page, order pages, delete page)
Added a page annotations view, fixed a bug with the page view
Added bitstream_metadata view to export bitstream_metadata files
Added a page form
Added necessary url configurations for new views
Added an Annotation model
Added an Internal flag to pages
Updates to harvesting and tests for newer bitstream_metadata format
Add a setting for the url arguments for books to compare, BOOKREADER_COMPARISON_GET_ARGUMENT
Add a setting for the session key for storing books to compare, BOOKREADER_COMPARISON_SESSION_KEY
Add a setting for the template variable for the books to compare, BOOKREADER_COMPARISON_TEMPLATE_VARIABLE
Add a context processor for turning book IDs into books; it prefers GET arguments over the session variable for bookmarkability
Add template tags for adding/removing/retrieving get arguments for comparisons
Add views for adding to/removing from/clearing the comparison list
Add separate view for a comparison portlet
Updated the bitstream_metadata.xsd and the detailed page harvesting to match
Added canonical field to Book model
Made jp2 url field on Page model optional (support for ‘missing’ pages)
Made title field on Page model optional (support for ‘missing’ pages)
Added python logging support with a default null handler
Added a parser for the bitstream metadata file
Added a custom lxml etree parser since the default is now to disable network
Switched loading pages and loading links signals to be on creation of books only
Switching data model to one where books are items and the pages are just represented by bitstreams.
Harvesting of books in a collection
Harvesting of pages in a book
Signal for loading repository names from the repositories
Signal for loading collection name from the repository
Signal for loading books in a collection from the repository
Signal for loading book metadata from the repository
Signal for loading pages from the repository
Books and pages generated from Manakin views of a DSpace repository where pages are items and the bitstreams are the jpeg2000 files for the pages.