Utilities for working with Debian repository Contents files

pydebcontents: Searching Debian Contents files

Package repositories published by Debian (and its derivatives) have lots of different index files describing the Releases, Packages, Sources, and the file Contents of the packages. The Debian wiki has a full description of the repository format.

Access to the data within the Release, Packages, and Sources files is provided by the python-debian module, available within the Debian archive and from PyPI.

This module provides access to the Contents files.

Requirements

This module requires no Python modules outside of stdlib.

Searching the Contents files is, however, dependent on the external zgrep program being on your PATH; zgrep is used to transparently search the gzip-compressed Contents.gz files.
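
Since a missing zgrep only shows up when a search is attempted, it can be worth checking for it up front. The following is a minimal sketch using only the standard library; the function name zgrep_available is hypothetical and is not part of pydebcontents.

```python
import shutil


def zgrep_available() -> bool:
    """Return True if an executable named zgrep can be found on PATH."""
    return shutil.which("zgrep") is not None


if __name__ == "__main__":
    if zgrep_available():
        print("zgrep found at", shutil.which("zgrep"))
    else:
        print("zgrep not found on PATH; searching Contents files would fail")
```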

The Contents files need to be arranged as they would be found on a Debian mirror: dists/{release}/{component}/Contents-{arch}.gz.
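
As an illustration of that layout, the sketch below builds the expected path for one Contents file and reports which ones are missing under a given base directory. The helper names contents_path and missing_contents are hypothetical and not part of pydebcontents; only the path template comes from the description above.

```python
from pathlib import Path


def contents_path(base, release, component, arch):
    """Build dists/{release}/{component}/Contents-{arch}.gz under base."""
    return Path(base) / "dists" / release / component / f"Contents-{arch}.gz"


def missing_contents(base, release, components, arch):
    """Return the expected Contents paths that do not exist on disk."""
    expected = [contents_path(base, release, c, arch) for c in components]
    return [p for p in expected if not p.is_file()]


if __name__ == "__main__":
    # The base directory here is only an example location.
    for path in missing_contents("/srv/mirror", "sid", ["main", "contrib"], "amd64"):
        print("missing:", path)
```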

Users of the apt-cacher-ng package might like to use its local file cache for access to the Contents files in the expected format.

Installation

From PyPI:

pip install pydebcontents

From git:

git clone https://salsa.debian.org/debian-irc-team/pydebcontents
cd pydebcontents
pip install .

Usage

The module comes with a simple command-line interface that feels a bit like the standard apt-file program.

For example, to find all the README files shipped in packages:

py-apt-file --base /var/cache/apt-cacher-ng/debrep/ search --mode glob 'usr/share/doc/*/README'

The only verb that py-apt-file knows at present is search.

$ py-apt-file search --help
usage: py-apt-file search [-h] [--release RELEASE] [--arch ARCH] [--component COMP] [--mode {glob,regex,fixed}]
                          [--max MAX]
                          PATTERN

positional arguments:
  PATTERN               glob, regular expression or fixed string

options:
  -h, --help            show this help message and exit
  --release RELEASE     release to search (default: sid)
  --arch ARCH, --architecture ARCH
                        architecture to search (default: amd64)
  --component COMP      archive components to search (default: all of them)
  --mode {glob,regex,fixed}
                        match mode for pattern
  --max MAX             maximum number of packages to return
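
For readers who want to see how an interface like the one in the help output could be declared, here is a rough argparse sketch. It is a reconstruction from the help text only, not the project's actual code: the function name build_parser, the help string for --base, the handling of --component as a single optional value, and the absence of a default for --mode are all assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the help text above (illustrative only)."""
    parser = argparse.ArgumentParser(prog="py-apt-file")
    # --base is taken from the usage example; its help text is a guess.
    parser.add_argument("--base", help="root of the mirror-style tree")
    verbs = parser.add_subparsers(dest="verb", required=True)

    search = verbs.add_parser("search")
    search.add_argument("--release", default="sid",
                        help="release to search (default: sid)")
    search.add_argument("--arch", "--architecture", dest="arch", default="amd64",
                        help="architecture to search (default: amd64)")
    # Assumption: None stands for "all components".
    search.add_argument("--component", metavar="COMP", default=None,
                        help="archive components to search (default: all of them)")
    search.add_argument("--mode", choices=["glob", "regex", "fixed"],
                        help="match mode for pattern")
    search.add_argument("--max", type=int, metavar="MAX",
                        help="maximum number of packages to return")
    search.add_argument("pattern", metavar="PATTERN",
                        help="glob, regular expression or fixed string")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```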

From Python, the module can be used as:

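import pydebcontents

# Arguments: base directory of the mirror-style tree, release, architecture, components
contents = pydebcontents.ContentsFile("/var/cache/apt-cacher-ng/debrep/", "sid", "amd64", ["contrib"])

# The search term is a regular expression, passed as a str
results = contents.search("usr/share/doc/.*/README")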

A ContentsDict structure is returned, which is a dict where the keys are package entries (in the {section}/{package} format used in the Contents files), and the values are lists of matching filenames.
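
To make the shape of that result concrete, the sketch below walks over a plain dict laid out as described. The sample data and the helper name summarize are made up for illustration; real results come from ContentsFile.search.

```python
# Hypothetical sample shaped like the returned structure:
# keys are {section}/{package}, values are lists of matching filenames.
sample = {
    "doc/bash-doc": ["usr/share/doc/bash-doc/README"],
    "utils/example-tool": [
        "usr/share/doc/example-tool/README",
        "usr/share/doc/example-tool/examples/README",
    ],
}


def summarize(results):
    """Return (package, section, number of matching files) per entry, sorted by key."""
    rows = []
    for entry, filenames in sorted(results.items()):
        # Split at the last slash so a section such as contrib/utils stays intact.
        section, _, package = entry.rpartition("/")
        rows.append((package, section, len(filenames)))
    return rows


if __name__ == "__main__":
    for package, section, count in summarize(sample):
        print(f"{package} (section {section}): {count} file(s)")
```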

The search term that ContentsFile.search uses is a str representation of a regular expression. There are convenience functions in pydebcontents for handling search patterns, including navigating some of the foibles of zgrep and the Contents file format:

  • glob2re converts glob syntax into a regular expression
  • fixed2re converts a fixed string into a regular expression
  • re2re cleans up an existing regular expression
  • pattern2re is for programmatic use, selecting one of the three functions above.
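
The general idea behind the glob and fixed-string conversions can be sketched with the standard library alone. This is not how glob2re or fixed2re are implemented, and it ignores the zgrep and Contents-format foibles those functions deal with: fnmatch.translate emits Python-specific regular expression syntax that grep would not understand. The names glob_to_regex and fixed_to_regex are hypothetical.

```python
import fnmatch
import re


def glob_to_regex(pattern: str) -> str:
    """Translate a shell-style glob into a Python regular expression."""
    # Note: with fnmatch, "*" also matches "/" characters.
    return fnmatch.translate(pattern)


def fixed_to_regex(text: str) -> str:
    """Escape a fixed string so that it matches only literally."""
    return re.escape(text)


if __name__ == "__main__":
    regex = glob_to_regex("usr/share/doc/*/README")
    print(bool(re.match(regex, "usr/share/doc/bash/README")))
    print(bool(re.search(fixed_to_regex("libc.so.6"), "usr/lib/libc.so.6")))
```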

To-do list / limitations

  • A previous attempt at a Python-only implementation was too slow to be usable for searching the Contents files; this could be revisited.
  • The mirrors now carry Contents files in other compression formats, such as xz; these are not found or used at present.
  • There is no utility provided to obtain the Contents files and arrange them on disk in a suitable tree.
  • There is no ability to simply point at a Contents file on disk that is not in the desired tree format.
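
For comparison with the zgrep approach, a pure-Python search over a single Contents.gz file at an arbitrary path might look like the sketch below. It is not part of pydebcontents and makes no claim about speed; the function name search_contents is hypothetical. It assumes the usual Contents line layout, where each line holds a filename followed by whitespace and a comma-separated list of {section}/{package} entries.

```python
import gzip
import re
from collections import defaultdict


def search_contents(path, pattern):
    """Search the filenames in one gzip-compressed Contents file.

    Returns a dict mapping section/package entries to lists of matching filenames.
    """
    regex = re.compile(pattern)
    results = defaultdict(list)
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as fh:
        for line in fh:
            # Filenames may contain spaces, so split off only the last field.
            try:
                filename, locations = line.rstrip("\n").rsplit(None, 1)
            except ValueError:
                continue  # blank or malformed line
            if regex.search(filename):
                for entry in locations.split(","):
                    results[entry].append(filename)
    return dict(results)
```

Calling search_contents("Contents-amd64.gz", r"usr/share/doc/.*/README$") would return a dict in the same general shape as the ContentsDict described earlier, although every line is decompressed and matched in Python.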


