
SKIMMR library and scripts for machine-aided skim-reading (the general-purpose package version for arbitrary texts)

SKIMMR_GT - a Brief Overview

Contact for more detailed info


This document provides basic information about SKIMMR (a tool for
machine-aided skim reading), in particular about its SKIMMR_GT package version
that focuses on general texts.

The document contains three sections:

1. ABOUT - overview of the tool and its functionalities

2. INSTALLATION - basic instructions on how to install SKIMMR

3. USING SKIMMR - basic instructions on how to use it after installation


1. ABOUT

SKIMMR is a research prototype aimed at helping users to navigate through
large amounts of textual data efficiently. This is done by extending the
traditional paradigm of searching and browsing of text collections. SKIMMR
lets the user skim texts by navigating a network of concepts and relations
explicitly or implicitly present in them. The concepts and their relations
have been extracted and inferred from the textual content using novel machine
reading techniques that power the SKIMMR back-end.

The interconnected 'skimming networks' provide a high-level overview of the
domain covered by the texts, and let the user quickly discover interesting
pieces of information. This process also greatly reduces the burden of
sifting through lots of irrelevant resources, which is often the downside
of using standard search engines. When users identify interesting
information within the high-level overview, they can continue reading the
related textual resources in detail.


2. INSTALLATION

Probably the easiest way of installing SKIMMR is using easy_install:

*easy_install skimmr_gt*

Check the easy_install and setuptools documentation for more detailed info.

Should you prefer to download and install the package manually, fetch the
SKIMMR distribution archive file first. After unpacking it, switch to the
generated directory and execute the following command:

*python install*

If you want to install the package locally (for the current user only), use
the following:

*python install --user*

Check the Python packaging documentation for more detailed options.
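
After either installation route, it can be useful to confirm that Python can actually find the package. The following is a minimal sketch using only the standard library. The helper name `is_installed` is hypothetical, and the import name of the SKIMMR package is not stated in this document, so pass whatever module name your installation provides.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module of that name can be found."""
    return importlib.util.find_spec(module_name) is not None
```

For example, `is_installed("json")` is true on a standard Python installation, because `json` ships with the interpreter.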


3. USING SKIMMR

After downloading and installing the SKIMMR package, you can readily use it
in a basic manner through the scripts provided. The scripts cover the
following tasks:

- extraction of co-occurrence statements from texts

- creation of a knowledge base and its population by semantic
similarity relations

- indexing of the knowledge base for efficient querying

- preparation of the sub-folder structure and resources in the
working directory necessary in order to launch the SKIMMR server

- launching the SKIMMR server and UI

The scripts are located in the *bin* subdirectory of the installation package.
Alternatively, you can copy them from wherever your system puts Python package
binaries (check the documentation of your local operating system and Python
installation).

The typical way of using these scripts is summarised in the following sections.
Note that there are other ways to launch the scripts; you can check the
documentation in the script source code for details.

3.1 Creating the working directory

First of all, SKIMMR requires a place to store and process its data. Create a
directory for that somewhere (let us assume it is called *skimmr* in the
following). Then switch to that directory and copy all the SKIMMR scripts
there. After that, run the script that prepares the sub-folder structure and
resources in there. This will generate two sub-directories, *data* and *text*,
as well as a couple of files and directories deeper in the *data* one. You are
then all set for loading the texts you want to process into SKIMMR.
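
Before moving on, you may want to verify that the working directory has the expected layout. The sketch below only checks for the *data* and *text* sub-directories named above. The helper `missing_subdirs` is a hypothetical name and is not part of SKIMMR.

```python
from pathlib import Path


def missing_subdirs(workdir, expected=("data", "text")):
    """Return the expected sub-directories that are absent from workdir."""
    root = Path(workdir)
    return [name for name in expected if not (root / name).is_dir()]
```

An empty list from `missing_subdirs("skimmr")` suggests the preparation step has been run in that directory.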

3.2 Processing the texts

Copy the text files you want to process to the *text* folder in the *skimmr*
directory. Plain text files (in ASCII or Unicode format) are supported, with
the *.txt* extension. It is advisable to use meaningful and unique filenames
for the text files, as they will be used later on for assigning the provenance
identifiers to the original text data.
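
Copying the files can be scripted so that name clashes are caught early. The following is a minimal sketch, not part of SKIMMR: `stage_texts` is a hypothetical helper that copies only *.txt* files and refuses to overwrite an existing file name, since file names serve as provenance identifiers.

```python
import shutil
from pathlib import Path


def stage_texts(sources, text_dir):
    """Copy .txt files into text_dir, refusing to overwrite on name clashes."""
    text_dir = Path(text_dir)
    text_dir.mkdir(parents=True, exist_ok=True)
    staged = []
    for src in map(Path, sources):
        if src.suffix.lower() != ".txt":
            continue  # only plain-text files with the .txt extension
        target = text_dir / src.name
        if target.exists():
            raise FileExistsError(f"duplicate file name: {src.name}")
        shutil.copy(src, target)
        staged.append(target.name)
    return staged
```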

After you have all the texts in place, run the extraction script in the
*skimmr* directory. This will chop up the texts into paragraphs and
extract the co-occurrence statements from them. The script limits the number
of produced statements; the limit is computed dynamically from the available
memory (or set to 750,000 if the psutil package is not available on your
system). You can change the limit when using the SKIMMR library functions
directly.
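
To give a rough idea of what this step involves, here is a simplified sketch. It is not SKIMMR's actual extraction code: it assumes paragraphs are separated by blank lines, treats every lower-cased word as a term, and counts unordered term pairs that share a paragraph. The function names and the `bytes_per_statement` figure are hypothetical; only the psutil fallback of 750,000 comes from the description above.

```python
import re
from collections import Counter
from itertools import combinations


def statement_limit(bytes_per_statement=1000, fallback=750000):
    """Derive a cap from available memory, or use the fallback without psutil."""
    try:
        import psutil
    except ImportError:
        return fallback
    # bytes_per_statement is an assumed, purely illustrative cost per statement
    return psutil.virtual_memory().available // bytes_per_statement


def split_paragraphs(text):
    """Split text into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def cooccurrences(paragraphs, limit=750000):
    """Count unordered term pairs sharing a paragraph, up to `limit` distinct pairs."""
    counts = Counter()
    for para in paragraphs:
        terms = sorted(set(re.findall(r"[a-z]+", para.lower())))
        for pair in combinations(terms, 2):
            if pair not in counts and len(counts) >= limit:
                continue  # cap reached: only update pairs already recorded
            counts[pair] += 1
    return counts
```

Once the cap is reached, new pairs are ignored while known pairs keep accumulating counts. That is one possible policy chosen for the sketch; the real script may behave differently.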

3.3 Creating the knowledge base

After generating the co-occurrence statements in the previous step, you can
create the knowledge base from them using

*python create*

which will generate a couple of knowledge base persistence files in the *store*
sub-folder of the *data* directory in the *skimmr* root folder.

3.4 Computing similarities

When the knowledge base has been generated, you can augment it by computing
semantic similarity relationships between the terms that are more frequent
than average:

*python compsim*

This will update the knowledge base persistence files accordingly. Note that
this step may take up to several hours for larger knowledge bases!
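
The idea of restricting similarity computation to terms that are more frequent than average can be sketched as follows. This is an illustration under stated assumptions, not SKIMMR's implementation: frequency is taken to be a term's total co-occurrence count, similarity is taken to be the cosine between co-occurrence vectors, and all function names are hypothetical. The input `counts` maps term pairs to co-occurrence counts.

```python
import math
from collections import defaultdict


def frequent_terms(counts):
    """Return terms whose total co-occurrence count exceeds the average."""
    totals = defaultdict(int)
    for (a, b), n in counts.items():
        totals[a] += n
        totals[b] += n
    if not totals:
        return set()
    mean = sum(totals.values()) / len(totals)
    return {term for term, n in totals.items() if n > mean}


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = math.sqrt(sum(x * x for x in u.values()))
    norm *= math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def similarities(counts):
    """Compute pairwise cosine similarity between the frequent terms only."""
    vectors = defaultdict(dict)
    for (a, b), n in counts.items():
        vectors[a][b] = n
        vectors[b][a] = n
    terms = sorted(frequent_terms(counts))
    return {
        (s, t): cosine(vectors[s], vectors[t])
        for i, s in enumerate(terms)
        for t in terms[i + 1:]
    }
```

The number of term pairs grows quadratically with the number of frequent terms, which is consistent with the long running times mentioned above for larger knowledge bases.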

3.5 Indexing the knowledge base

Before you can expose the processed content via a SKIMMR web interface, you
have to index the knowledge base. This is done by running the indexing script,
which will generate a couple of index files in the knowledge base persistence
folder.
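
As an illustration of why indexing makes querying efficient, the sketch below builds a simple in-memory inverted index. It assumes statements are (subject, relation, object) triples, which is a hypothetical representation; SKIMMR's own index files and data model are not described in this document.

```python
from collections import defaultdict


def build_index(statements):
    """Map each term to the positions of the statements that mention it."""
    index = defaultdict(set)
    for pos, (subj, _rel, obj) in enumerate(statements):
        index[subj].add(pos)
        index[obj].add(pos)
    return index


def query(index, statements, term):
    """Return all statements mentioning term, in their original order."""
    return [statements[i] for i in sorted(index.get(term, ()))]
```

With such an index, a lookup touches only the statements that mention the queried term instead of scanning the whole knowledge base.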

3.6 Launching and using the server

At last, you can launch the SKIMMR server by running the server script. This
will start the server at localhost and port 8008. You can specify an
alternative address and port when running the server.

Also, you can specify an alternative store to be loaded by the server (useful
if you want to examine multiple stores you have previously generated) by
supplying the path to the store folder you want to load.

After the SKIMMR server has been started, you can point your browser to the
corresponding address and port and start using the tool as indicated in the
*About* web-page accessible from the SKIMMR interface (just follow the link
at the bottom of every page in the SKIMMR web interface).
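
If the page does not load, a quick check of whether anything is listening on the server's address and port can help. The sketch below uses only the standard library. The helper name `server_is_up` is hypothetical, and the defaults mirror the localhost and port 8008 values mentioned above.

```python
import socket


def server_is_up(host="localhost", port=8008, timeout=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A true result only shows that some process accepts connections on that port; it does not confirm that the process is SKIMMR.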
Download Files


File name: skimmr_gt-0.1-a2.tar.gz (659.0 kB)
File type: Source
Upload date: Feb 15, 2013
