
Utilities to tag files


metaindex

metaindex allows you to find files based on metadata information.

For example, if you want to find all pictures that have a certain width, you could do this:

metaindex find mimetype:image resolution:1200x

The following file formats are supported out of the box (although they might need additional Python packages, see Installation):

  • images (png, jpg, etc.; whatever is supported by Pillow)
  • audio (mp3, m4a, ogg, etc.; whatever is supported by mutagen)
  • OpenDocument (odt, ods, etc.)
  • Office Open XML (docx, pptx, xlsx)
  • pdf
  • html
  • epub
  • abc music notation
  • cbz (through ComicInfo.xml)
  • gpx
  • filetags in the style of Karl Voit's filetags

Installation

To install metaindex, either install it directly from PyPI:

pip install metaindex

Or clone the repository and install it from there with pip:

git clone https://codeberg.org/vonshednob/metaindex
cd metaindex
pip install .

Most modules are optional. If you, for example, want to use metaindex for audio files and PDFs, you will have to install it like this:

pip install metaindex[pdf,audio]

or, for the cloned repository:

pip install .[pdf,audio]

These modules exist for indexing:

  • pdf, for PDF files,
  • audio, any type of audio/music file,
  • image, any type of image file,
  • video, any type of video file (overlaps somewhat with audio),
  • ebook, ebooks and comic book formats,
  • xdg, support for XDG (if you use Linux, just add it),
  • yaml, extra metadata in YAML format,
  • ocr, find and extract text from images with tesseract (you must have tesseract installed for this to work).
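The module names above combine into a single extras specifier for pip. As a minimal sketch (the helper `requirement` is hypothetical and not part of metaindex), the specifier can be assembled like this:

```python
def requirement(modules, package="metaindex"):
    """Build a pip requirement string such as 'metaindex[pdf,audio]'.

    Hypothetical helper for illustration; it only joins the names and does
    not check whether they are valid metaindex modules.
    """
    # dict.fromkeys drops duplicates while keeping the original order
    unique = list(dict.fromkeys(modules))
    if not unique:
        return package
    return f"{package}[{','.join(unique)}]"


print("pip install", requirement(["pdf", "audio"]))
```

Some shells treat square brackets specially, so quoting the specifier on the command line may be necessary.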

In case you just want everything, this is your install command:

pip install .[all]

There is also an experimental FuseFS filesystem. To be able to use it, you will have to specify fuse as an additional module:

pip install .[all,fuse]

Server dependencies

If you just want to connect to another instance of the metaindex server, you are ready to go.
More likely, though, you will have to install Xapian and its Python 3 bindings. Please follow the usual way of your OS to install both.

For example, on Arch Linux you would run pacman -S xapian python-xapian. On Debian-like systems it would be apt install python3-xapian.

Usage

Before you can use metaindex to search for files, you have to initialize the cache by telling it where your files to index are, for example:

metaindex index --recursive --index ~/Pictures

Afterwards you can start searching for files by metadata, like this:

metaindex find
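If you want to drive these two commands from a script, the following sketch builds the argument lists shown above. The helper names `index_command` and `find_command` are hypothetical; only the subcommands and flags that appear in this README are used.

```python
import shlex
from pathlib import Path


def index_command(directory):
    """Argument list for indexing a directory recursively (flags as shown above)."""
    return ["metaindex", "index", "--recursive", "--index",
            str(Path(directory).expanduser())]


def find_command(*terms):
    """Argument list for a search; each term is passed as its own argument."""
    return ["metaindex", "find", *terms]


# Print the commands instead of running them, so the sketch works
# even where metaindex is not installed.
print(shlex.join(index_command("~/Pictures")))
print(shlex.join(find_command("mimetype:image", "resolution:1200x")))
```

To actually execute a command, pass the list to `subprocess.run(..., check=True)`. Passing a list rather than a single string avoids shell quoting problems with search terms that contain spaces.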

Searching

Search queries for use with metaindex find allow you to search

  • for files that have a metadata tag: metaindex find resolution:
  • for files that have a metadata tag with a certain value: metaindex find title:"dude, where is my car"
  • for files that have any metadata tag with a certain value: metaindex find "just anything"

Each value that you provide is treated as a case-insensitive regular expression.
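To illustrate what case-insensitive regular expression matching means for a search value, here is a small sketch using Python's `re` module. It assumes the pattern may match anywhere in the value (`re.search`); whether metaindex matches exactly this way is an assumption, and `value_matches` is a hypothetical helper.

```python
import re


def value_matches(pattern, value):
    """Return True if the regex `pattern` is found in `value`, ignoring case.

    Assumption: a match anywhere in the value counts (re.search).
    """
    return re.search(pattern, value, re.IGNORECASE) is not None


# Case is ignored
print(value_matches("dude, where", "Dude, Where Is My Car"))
# Regex anchors can narrow a match, e.g. a width of 1200 at the start
print(value_matches(r"^1200x", "1200x800"))
print(value_matches(r"^1200x", "800x1200"))
```

Because values are regular expressions, characters such as `.`, `(` or `+` have a special meaning; `re.escape` shows how a literal string would have to be escaped.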

Usage from Python

To use the metaindex infrastructure from Python, you should instantiate a Cache and run queries against it (with find).

Cache.find will return an iterable of CacheEntry instances, consisting of

  • path, the location in the file system where that file was last seen
  • metadata, a multidict of all metadata
  • last_modified, the timestamp when the file was last modified on disk (to the knowledge of the cache)

You can iterate over a CacheEntry instance to get its (tag, value) tuples.

To use the user's preferences, it's a good idea to load their configuration. Here's an example snippet that'll do both things:

    from metaindex.configuration import load
    from metaindex import Cache

    config = load()
    cache = Cache(config)

    searchquery = 'mimetype:image'

    for entry in cache.find(searchquery):
        print(entry.path)
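Because the metadata is a multidict, the same tag can appear more than once. The following sketch groups (tag, value) tuples by tag. It assumes that iterating over an entry yields such tuples, as described above; `collect_tags` is a hypothetical helper and the sample data is a stand-in, not real cache output.

```python
from collections import defaultdict


def collect_tags(pairs):
    """Group an iterable of (tag, value) tuples into {tag: [values, ...]}."""
    grouped = defaultdict(list)
    for tag, value in pairs:
        grouped[tag].append(value)
    return dict(grouped)


# Stand-in for the tuples one entry might yield; with a real cache this
# would be something like collect_tags(entry) for an entry from cache.find(...)
sample = [("mimetype", "image/png"), ("tag", "holiday"), ("tag", "beach")]
print(collect_tags(sample))
```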
