Python library to work with ARC and WARC files
Project description
WARC (Web ARChive) is a file format for storing web crawls.
The warc library makes it easy to work with WARC files:
    import warc

    # Open the archive and print each record's target URI and content length.
    f = warc.open("test.warc")
    for record in f:
        print("%s %s" % (record['WARC-Target-URI'], record['Content-Length']))
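To show where header fields such as WARC-Target-URI and Content-Length come from, here is a minimal sketch that reads one record header using only the Python 3 standard library. It is not how the warc library is implemented. The function name `read_warc_header` and the sample record are hypothetical, made up for illustration. It assumes the usual WARC record layout: a version line, `Name: value` header lines, a blank line, then the content block.

```python
import io


def read_warc_header(stream):
    """Read one WARC record header from a binary stream.

    Hypothetical helper for illustration; returns (version, fields).
    """
    version = stream.readline().decode("utf-8").strip()  # e.g. "WARC/1.0"
    fields = {}
    while True:
        line = stream.readline().decode("utf-8")
        if line in ("\r\n", "\n", ""):  # a blank line (or EOF) ends the header
            break
        # Split on the first colon only, so URIs in the value stay intact.
        name, _, value = line.partition(":")
        fields[name.strip()] = value.strip()
    return version, fields


# Made-up sample record, not taken from a real crawl.
sample = (
    b"WARC/1.0\r\n"
    b"WARC-Type: response\r\n"
    b"WARC-Target-URI: http://example.com/\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
    b"hello"
    b"\r\n\r\n"
)

stream = io.BytesIO(sample)
version, fields = read_warc_header(stream)
# Content-Length gives the size of the content block that follows the header.
payload = stream.read(int(fields["Content-Length"]))
print(fields["WARC-Target-URI"], fields["Content-Length"])
```

For real archives the library's own reader is likely the better choice. The sketch only illustrates the header fields that the example above looks up by name.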
Documentation
The documentation of the warc library is available at http://warc.readthedocs.org/.
License
This software is licensed under the GPL v2. See the LICENSE file for details.
Download files
Source Distribution
warc-0.2.1.tar.gz (18.4 kB)
File details
Details for the file warc-0.2.1.tar.gz.
File metadata
- Download URL: warc-0.2.1.tar.gz
- Upload date:
- Size: 18.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 65ec3336287ae7a17c969736935ba188678df10f2ec813d8e3474cc51bb71d39 |
| MD5 | 3235a8b68e28c77d45227b2850654776 |
| BLAKE2b-256 | 9ab430d87239ec30cd0c504bd7dec9cd22b51ef0cbb00d6fbbc138b1ddcfc108 |
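A downloaded copy of warc-0.2.1.tar.gz can be checked against the SHA256 digest listed above. The following is a minimal sketch using Python 3's standard `hashlib` module. The helper names `sha256_of_file` and `matches_published_hash` are hypothetical, and the file path is whatever location you saved the archive to.

```python
import hashlib

# SHA256 digest copied from the file hashes listed above.
EXPECTED_SHA256 = "65ec3336287ae7a17c969736935ba188678df10f2ec813d8e3474cc51bb71d39"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes at EOF.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected digest."""
    return sha256_of_file(path) == expected.lower()
```

Calling `matches_published_hash("warc-0.2.1.tar.gz")` on the downloaded archive should return `True` if the file is intact. A `False` result suggests a corrupted or altered download.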