Transform unstructured document collections to structured Linked Data
Project description
Ferenda is a Python library and framework for transforming unstructured document collections into structured Linked Data. It helps with downloading documents, parsing them to add explicit semantic structure and RDF-based metadata, finding relationships between documents, and publishing the results, including through a REST-based HTTP API.
Quick start
This example uses ferenda’s project framework to download the 50 latest RFCs and W3C standards, parse them into structured, RDF-enabled XHTML documents, load all RDF metadata into a triplestore, and generate a web site of static HTML5 files that is usable offline:
    pip install ferenda
    ferenda-setup myproject
    cd myproject
    ./ferenda-build.py ferenda.sources.tech.RFC enable
    ./ferenda-build.py ferenda.sources.tech.W3Standards enable
    ./ferenda-build.py all all --downloadmax=50 --staticsite --fulltextindex=False
    open data/index.html
The same functionality is also available through a Python API, if you want to use ferenda as part of a larger system. You can also use only the parts of ferenda that you need (e.g. only the downloading and parsing features).
More information
See http://ferenda.readthedocs.org/ for in-depth documentation.
Copyright and license
Most of the code is written by Staffan Malmgren and licensed under the 2-clause BSD license.
Some bundled code and other creative works are written by other authors, included in accordance with their respective licenses:
- rdflib-sqlite by Graham Higgins, BSD
- patch by Anatoly Techtonik, MIT
- Grit XSLT stylesheets and RDL service UI by Niklas Lindstrom, BSD
- httpheader by Deron Meranda, LGPL
- smc.mw by Marcus Brinkmann, BSD
- normalize.css, MIT
- responsive template, PD
- jquery, MIT
- modernizr, MIT
- respond.js, MIT/GPL
- Gentleface wireframe toolbar icons, CC-BY-NC
Download files
Filename | Size | File type | Python version |
---|---|---|---|
ferenda-0.3.0-py2.py3-none-any.whl | 842.6 kB | Wheel | py2.py3 |
ferenda-0.3.0.tar.gz | 835.8 kB | Source | None |
Hashes for ferenda-0.3.0-py2.py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | 889d1eb5ac17a61cb0aa946a3c566ae9998f1a1eac8ea17dd4e87c164f1c052a |
MD5 | f76c3bb402a268478293dd70b5d38bbb |
BLAKE2-256 | ac1ca608e8fb4d830ceb4247759d5ac67c92c4b5df9a7b0b9226b9b839ff887c |
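The digests listed above can be checked against a downloaded copy of the wheel. The following is a minimal sketch using only Python's standard `hashlib` module; it is not part of ferenda. The helper names `file_digests` and `verify` are hypothetical, and the sketch assumes that the BLAKE2-256 value is a BLAKE2b hash with a 32-byte digest.

```python
import hashlib

# Digests published above for ferenda-0.3.0-py2.py3-none-any.whl
EXPECTED = {
    "sha256": "889d1eb5ac17a61cb0aa946a3c566ae9998f1a1eac8ea17dd4e87c164f1c052a",
    "md5": "f76c3bb402a268478293dd70b5d38bbb",
    "blake2_256": "ac1ca608e8fb4d830ceb4247759d5ac67c92c4b5df9a7b0b9226b9b839ff887c",
}


def file_digests(path, chunk_size=65536):
    """Return hex digests of the file at `path` (hypothetical helper)."""
    hashers = {
        "sha256": hashlib.sha256(),
        "md5": hashlib.md5(),
        # Assumption: "BLAKE2-256" means BLAKE2b with a 32-byte digest.
        "blake2_256": hashlib.blake2b(digest_size=32),
    }
    with open(path, "rb") as fh:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def verify(path, expected=EXPECTED):
    """Map each algorithm name to True if the file matches the expected digest."""
    actual = file_digests(path)
    return {name: actual[name] == digest for name, digest in expected.items()}
```

Calling `verify("ferenda-0.3.0-py2.py3-none-any.whl")` on the downloaded file returns one boolean per algorithm; any `False` value suggests that the file differs from the published release.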