Simple package for cached URL fetching
URLReader is a simple Python package for downloading documents over HTTP. Unlike using urllib2 directly, it caches each document for future use, either in the same process or, with the optional filesystem-based cache, in other processes.
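The caching idea described above can be sketched in a few lines. This is only an illustration of the technique, not URLReader's actual API: the class name CachedFetcher, its cache_dir and fetch parameters, and the SHA-256 file naming are all assumptions made for this example. It uses urllib.request, the Python 3 successor of urllib2.

```python
import hashlib
import os
from urllib.request import urlopen


class CachedFetcher(object):
    """Hypothetical two-level cache; NOT the URLReader API."""

    def __init__(self, cache_dir=None, fetch=None):
        self._memory = {}            # per-process cache: URL -> bytes
        self._cache_dir = cache_dir  # optional cache shared between processes
        self._fetch = fetch or self._download
        if cache_dir is not None:
            os.makedirs(cache_dir, exist_ok=True)

    @staticmethod
    def _download(url):
        # Plain HTTP download, as one would do without any caching.
        response = urlopen(url)
        try:
            return response.read()
        finally:
            response.close()

    def _path(self, url):
        # Assumed naming scheme: hash the URL to get a safe file name.
        name = hashlib.sha256(url.encode("utf-8")).hexdigest()
        return os.path.join(self._cache_dir, name)

    def get(self, url):
        # 1. Same-process cache.
        if url in self._memory:
            return self._memory[url]
        # 2. Filesystem cache, if enabled.
        data = None
        if self._cache_dir is not None and os.path.exists(self._path(url)):
            with open(self._path(url), "rb") as cached:
                data = cached.read()
        # 3. Fall back to the network and store the result on disk.
        if data is None:
            data = self._fetch(url)
            if self._cache_dir is not None:
                with open(self._path(url), "wb") as cached:
                    cached.write(data)
        self._memory[url] = data
        return data
```

Calling get() twice with the same URL downloads the document once. A second CachedFetcher pointed at the same cache_dir, even in another process, reads the stored copy instead of making a request. A real cache would also need an expiry policy, which this sketch leaves out.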
The following applies to all Python code in this package:
This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled “GNU Free Documentation License”.
Docstrings and comments from the package source code that are included in the documentation are dual-licensed under the GNU Affero General Public License, version 3 or later, and the GNU Free Documentation License, version 1.3 or later.
- Python 2.6, 2.7, 3.1 or later
Documentation is compiled using Sphinx.
For development, the following tools are recommended; the included Makefile calls them:
- coverage (version 3.4b1 or newer)
Installing and updating using easy_install
Numbered releases of URLReader can be installed or updated using the following command:
easy_install -U URLReader
Optional dependencies should be installed separately, since they are not required for every possible use of the package.
Obtaining newer source
URLReader uses the Mercurial version control system to store its source code. Mercurial is available in many GNU/Linux distributions as a package named mercurial.
To get the source of URLReader, use the following command:
hg clone http://hg.mtjm.eu/urlreader/
This creates a urlreader directory; a different name can be specified as an additional argument to the hg clone command.
To update it, run hg pull -u in this directory.
Installing from source
Just run the following command:
python setup.py install
setup.py has many options, which are listed by its --help option. To install into a non-default directory, pass the --prefix option to the install command.
Before installing, it is recommended to run the following command, which tests the parts of URLReader that do not make HTTP requests to external servers:
python setup.py test
All tests should pass.
Documentation is formatted by the following command:
python setup.py build_sphinx
The optional -b argument can be used to select an output format other than HTML.