Transform unstructured document collections to structured Linked Data

Project Description

Ferenda is a Python library and framework for transforming unstructured document collections into structured Linked Data. It helps with downloading documents, parsing them to add explicit semantic structure and RDF-based metadata, finding relationships between documents, and publishing the results, including through a REST-based HTTP API.

Quick start

This example uses ferenda’s project framework to download the 50 latest RFCs and W3C standards, parse them into structured, RDF-enabled XHTML documents, load all RDF metadata into a triplestore, and generate a website of static HTML5 files that are usable offline:

pip install ferenda
ferenda-setup myproject
cd myproject
./ferenda-build.py ferenda.sources.tech.RFC enable
./ferenda-build.py ferenda.sources.tech.W3Standards enable
./ferenda-build.py all all --downloadmax=50 --staticsite --fulltextindex=False
open data/index.html
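
For scripted or repeatable setups, the build steps above can be wrapped in a small Python driver. The following is a minimal sketch that only shells out to the commands listed above; it does not use ferenda's own Python API. The helper names `quickstart_commands` and `run_quickstart` are hypothetical, and it assumes a POSIX system where `ferenda-setup myproject` has already been run.

```python
import subprocess


def quickstart_commands(download_max=50):
    """Return the quick-start build commands, to be run inside the project dir."""
    build = "./ferenda-build.py"
    return [
        # Enable the two document sources used in the quick start.
        [build, "ferenda.sources.tech.RFC", "enable"],
        [build, "ferenda.sources.tech.W3Standards", "enable"],
        # Run all build steps for all enabled sources.
        [build, "all", "all", f"--downloadmax={download_max}",
         "--staticsite", "--fulltextindex=False"],
    ]


def run_quickstart(project_dir="myproject", download_max=50):
    """Run each command in project_dir; raises CalledProcessError on failure."""
    for cmd in quickstart_commands(download_max):
        # cwd replaces the `cd myproject` step from the shell version.
        subprocess.run(cmd, cwd=project_dir, check=True)
```

Calling `run_quickstart("myproject", download_max=10)` would run the same three build commands with a smaller download limit. Passing each command as a list avoids shell quoting issues.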

The same functionality can also be accessed through a Python API, if you want to use ferenda as part of a larger system. It’s also possible to use only the parts of ferenda that you need (e.g. only the downloading and parsing features).

More information

See http://ferenda.readthedocs.org/ for in-depth documentation.

Release history

0.3.0 (this version)
0.2.0
0.1.7
0.1.6.1
0.1.6

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Filename (size)                                  File type  Python version  Upload date
ferenda-0.3.0-py2.py3-none-any.whl (842.6 kB)    Wheel      py2.py3         Feb 18, 2015
ferenda-0.3.0.tar.gz (835.8 kB)                  Source     None            Feb 18, 2015
