This package allows the pdftextsplitter engine to communicate with a Django database.

Project description

djangotextsplitter

This package is meant as a django-extension for the pdftextsplitter package. As such, the pdftextsplitter package should be installed before this package.

This django-extension provides an out-of-the-box django-app with database models for the python-classes in pdftextsplitter. As such, it becomes possible to store the results of the pdftextsplitter package in the django database.

The django-application in this package does not contain any views, urls, templates, static files or other functionality; it provides only database models (including admin-registration) and load/write functions. These models and load/write functions can then be used in other django applications, together with the pdftextsplitter engine.

Install the package with: pip install djangotextsplitter

List of database models

The database models in this application are:

  • textpart (corresponds to the textpart-class from the pdftextsplitter-package)
  • fontregion (corresponds to the fontregion-class from the pdftextsplitter-package)
  • lineregion (corresponds to the lineregion-class from the pdftextsplitter-package)
  • readingline (needed to store certain information, but does not have an equivalent in the pdftextsplitter-package)
  • readinghistogram (needed to store certain information, but does not have an equivalent in the pdftextsplitter-package)
  • title (corresponds to the title-class from the pdftextsplitter-package)
  • body (corresponds to the body-class from the pdftextsplitter-package)
  • footer (corresponds to the footer-class from the pdftextsplitter-package)
  • headlines (corresponds to the headlines-class from the pdftextsplitter-package)
  • headlines_hierarchy (needed to store certain information, but does not have an equivalent in the pdftextsplitter-package)
  • enumeration (corresponds to the enumeration-class from the pdftextsplitter-package)
  • enumeration_hierarchy (needed to store certain information, but does not have an equivalent in the pdftextsplitter-package)
  • textsplitter (corresponds to the textsplitter-class from the pdftextsplitter-package)
  • native_toc_element (corresponds to the native_toc_element-class from the pdftextsplitter-package)
  • breakdown_decision (needed to store certain information, but does not have an equivalent in the pdftextsplitter-package)
  • textalinea (corresponds to the textalinea-class from the pdftextsplitter-package)

Getting started

Within a django-environment (provided that djangotextsplitter is installed in the virtual environment and registered in the django project), one can simply access a model by calling
from djangotextsplitter.models import textsplitter as db_textsplitter
We recommend using the 'as db_' to distinguish django database models from base classes in the pdftextsplitter-package.
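
The registration mentioned above is normally done in the settings module of the django project. A minimal sketch, assuming the app is registered under its import name `djangotextsplitter` (the text above does not state the app label, so this is an assumption to check against the package documentation):

```python
# settings.py (fragment): an illustrative sketch, not taken from the package docs.
INSTALLED_APPS = [
    "django.contrib.admin",
    "django.contrib.auth",
    "django.contrib.contenttypes",
    "django.contrib.sessions",
    "django.contrib.messages",
    "django.contrib.staticfiles",
    # Assumption: the app label equals the import name used in the examples.
    "djangotextsplitter",
]
```

If the package ships migrations (not confirmed here), the usual django migrate step would then create the tables for the models listed above.
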
Loading/writing operations can be accessed with:
from djangotextsplitter.loads import load_textsplitter
Each model that has an associated class in pdftextsplitter has a load-function, a newwrite-function, an overwrite-function and a delete-function.
They can be called as:
from pdftextsplitter import textsplitter
from djangotextsplitter.models import textsplitter as db_textsplitter
from djangotextsplitter.loads import load_textsplitter
from djangotextsplitter.newwrites import newwrite_textsplitter
from djangotextsplitter.overwrites import overwrite_textsplitter
from djangotextsplitter.deletes import delete_textsplitter
mysplitter = load_textsplitter(31)  # Load from the database; 31 is the primary key (pk in django)
db_splitter = newwrite_textsplitter(mysplitter)  # Write as a new record; no key needed, as it is appended to the list
db_splitter = overwrite_textsplitter(31, mysplitter)  # Overwrite the existing record with primary key 31
delete_textsplitter(31)  # Delete the record with primary key 31
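
The choice between newwrite and overwrite can be wrapped in a small helper. The following is only a sketch: `store_textsplitter` is a hypothetical name that is not part of djangotextsplitter, and the two write functions are passed in as arguments (with the call shapes shown above) so that the helper itself does not depend on a configured django project.

```python
def store_textsplitter(splitter, pk=None, *, newwrite, overwrite):
    """Hypothetical helper: create a new record, or overwrite an existing one.

    newwrite  -- callable taking the splitter, e.g. newwrite_textsplitter
    overwrite -- callable taking (pk, splitter), e.g. overwrite_textsplitter
    """
    if pk is None:
        # No primary key known yet: append a new record to the database.
        return newwrite(splitter)
    # A primary key was given: replace the stored record with that key.
    return overwrite(pk, splitter)
```

Inside a django project this could be called as store_textsplitter(mysplitter, newwrite=newwrite_textsplitter, overwrite=overwrite_textsplitter) for a new record, with pk=31 added to overwrite record 31.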

For further details, we refer the user to the documentation of pdftextsplitter, or to the more detailed documentation in the docs-folder of this package.
djangotextsplitter is not very complicated. It just provides the database models and load/newwrite/overwrite/delete functions for the pdftextsplitter package, so that the pdftextsplitter package can be used efficiently from within a django web application.

Permissions

The admin registration of the models is done in such a way that only superusers have access to the models in the admin, even if other users have admin-access and the permissions to view/add/change/delete them. This forces users to change the models only through the load/newwrite/overwrite/delete functions. Manually changing the structure of the models somewhere in the hierarchy could cause major disruptions.
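
One common way to get this behaviour in django is to override the permission hooks of a ModelAdmin class. The sketch below is not the implementation used by this package; `SuperuserOnlyAdminMixin` is a hypothetical name, and the class is written as a plain Python mixin so that it can be tried without a django project. The overridden method names are the standard ModelAdmin permission hooks.

```python
from types import SimpleNamespace


class SuperuserOnlyAdminMixin:
    """Hypothetical mixin: grant every admin permission to superusers only."""

    @staticmethod
    def _is_superuser(request):
        # request.user is set by django's authentication middleware.
        return bool(getattr(request.user, "is_superuser", False))

    def has_module_permission(self, request):
        return self._is_superuser(request)

    def has_view_permission(self, request, obj=None):
        return self._is_superuser(request)

    def has_add_permission(self, request):
        return self._is_superuser(request)

    def has_change_permission(self, request, obj=None):
        return self._is_superuser(request)

    def has_delete_permission(self, request, obj=None):
        return self._is_superuser(request)


# Stand-in request objects for illustration; django passes real HttpRequest objects.
superuser_request = SimpleNamespace(user=SimpleNamespace(is_superuser=True))
staff_request = SimpleNamespace(user=SimpleNamespace(is_superuser=False))

guard = SuperuserOnlyAdminMixin()
print(guard.has_change_permission(superuser_request))  # True
print(guard.has_change_permission(staff_request))      # False
```

In a real admin module, such a mixin would be listed before admin.ModelAdmin in the base classes of the admin class that is registered for a model.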

Download files

Source Distribution

djangotextsplitter-1.2.1.tar.gz (560.8 kB)

Uploaded Source

Built Distribution

djangotextsplitter-1.2.1-py3-none-any.whl (61.8 kB)

Uploaded Python 3

File details

Details for the file djangotextsplitter-1.2.1.tar.gz.

File metadata

  • Download URL: djangotextsplitter-1.2.1.tar.gz
  • Size: 560.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.13

File hashes

Hashes for djangotextsplitter-1.2.1.tar.gz
Algorithm Hash digest
SHA256 0f8b87a76b10676381256d25117ef1545775fdee735cfa64af8fc5bc0a47ce25
MD5 9fa606e96f2f104e4b92fe82667c465c
BLAKE2b-256 5fb75dcd1c08fb8f9aad431fba267e9aa91755d6b0860114f6c34713b9509cf6

File details

Details for the file djangotextsplitter-1.2.1-py3-none-any.whl.

File hashes

Hashes for djangotextsplitter-1.2.1-py3-none-any.whl
Algorithm Hash digest
SHA256 0ab89b6762a814e01395ce0745268fdf0f4e1821052ab13d638d628187068c6e
MD5 14fca6251461a3e3d11938b02c79d22e
BLAKE2b-256 447cb434dfcf1f4db95d9da2537ad96a2587dc984a979753dff7ed9b0a56ff34
