CDXJ Generator

A Python script to generate CDXJ TimeMaps for testing elsewhere.

Install

This tool is published to PyPI. To install it:

pip install cdxjGenerator

To use the development version, clone this repository, then run pip install . inside it.

Usage

These instructions assume installation via pip.

To run:

cdxjGenerator [number of lines] [URI-R]

For example:

cdxjGenerator 12

...will generate CDXJ output (to stdout by default) consisting of entries for 12 random URIs. Alternatively:

cdxjGenerator 25000 memento.us

...will generate 25,000 entries for the URI-R memento.us. This output can be written to a file like:

cdxjGenerator 25000 memento.us > sample.cdxj

The resulting file will likely need to be sorted before being used elsewhere. Do this via:

LC_ALL=C sort sample.cdxj > sample_sorted.cdxj

This can also be done in a single command, without writing the temporary, unsorted sample.cdxj:

cdxjGenerator 25000 memento.us | LC_ALL=C sort > sample_sorted.cdxj
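
Where the sort utility is not available, the same byte-wise ordering that LC_ALL=C sort applies can be approximated in Python. The following is a minimal sketch, not part of cdxjGenerator: the function name sort_cdxj_lines and the sample lines are hypothetical and only illustrate the ordering, they are not actual output of the tool.

```python
def sort_cdxj_lines(lines):
    """Return lines ordered by raw byte value, as LC_ALL=C sort would."""
    # Strip line endings first so they do not take part in the comparison.
    stripped = [line.rstrip("\r\n") for line in lines]
    # Comparing the UTF-8 bytes mirrors the C locale's byte-wise collation.
    return sorted(stripped, key=lambda line: line.encode("utf-8"))


# Hypothetical, hand-written lines used only to show the ordering.
sample = [
    "com,example)/b 20200101000000 {}",
    "com,example)/a 20210101000000 {}",
    "com,example)/a 20190101000000 {}",
]

for line in sort_cdxj_lines(sample):
    print(line)
```

Because each line is compared as a whole, entries that share a key end up ordered by the timestamp that follows it.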

Background

TimeMaps are lists that enumerate the URIs of resources that encapsulate prior states of a given resource (RFC 7089, Memento). TimeMaps are often expressed in an extension of the Web Linking (RFC 5988) format. Less common formats, such as JSON and CDXJ, can express the same information less rigidly. CDXJ is the most flexible of the three and is used by InterPlanetary Wayback (ipwb), which prompted the initial need for this software.
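
To make the format more concrete, the sketch below parses a single CDXJ line in Python. It assumes the layout commonly used for CDXJ, namely a lookup key, a 14-digit timestamp, and a JSON object separated by single spaces. This page does not specify the exact output of cdxjGenerator, so the sample line, its JSON field names, and the function name parse_cdxj_line are illustrative assumptions rather than the tool's actual behavior.

```python
import json


def parse_cdxj_line(line):
    """Split one CDXJ line into (key, timestamp, fields).

    Assumed layout: "<key> <timestamp> <JSON object>".
    """
    # Split on the first two spaces only; the JSON part may contain spaces.
    key, timestamp, json_block = line.rstrip("\r\n").split(" ", 2)
    return key, timestamp, json.loads(json_block)


# Hypothetical line; the field names are assumptions for illustration.
sample = 'us,memento)/ 20160305192247 {"mime_type": "text/html", "status_code": "200"}'

key, timestamp, fields = parse_cdxj_line(sample)
print(key, timestamp, fields["status_code"])
```

Keeping the metadata in a JSON object is what allows extra fields to be added per entry without changing how the key and timestamp are read.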
