A simple database for storing metadata associated with (media) files.

MetaVault

MetaVault is a simple database for storing metadata associated with (media) files. It is a thin wrapper around sqlite3 that mimics the behavior of a Python dictionary, designed to provide an easy-to-use interface for managing metadata.
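
To illustrate what "a wrapper around sqlite3 that mimics a dictionary" can look like, here is a minimal sketch using only the standard library. It is not MetaVault's implementation: the class name `TinyMetaStore`, the `items` table, and the JSON-encoded `meta` column are all assumptions made for this example.

```python
import json
import sqlite3
from collections.abc import MutableMapping


class TinyMetaStore(MutableMapping):
    """Hypothetical dict-like store: filename -> metadata dict, kept in SQLite."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS items (key TEXT PRIMARY KEY, meta TEXT NOT NULL)"
        )
        self.conn.commit()

    def __setitem__(self, key, meta):
        # Store the metadata dict as JSON; replace any existing row for this key.
        self.conn.execute(
            "INSERT OR REPLACE INTO items (key, meta) VALUES (?, ?)",
            (key, json.dumps(meta)),
        )
        self.conn.commit()

    def __getitem__(self, key):
        row = self.conn.execute(
            "SELECT meta FROM items WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            raise KeyError(key)
        return json.loads(row[0])

    def __delitem__(self, key):
        cur = self.conn.execute("DELETE FROM items WHERE key = ?", (key,))
        self.conn.commit()
        if cur.rowcount == 0:
            raise KeyError(key)

    def __iter__(self):
        # Yield keys in rowid order (roughly insertion order).
        for (key,) in self.conn.execute("SELECT key FROM items ORDER BY rowid"):
            yield key

    def __len__(self):
        return self.conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]


store = TinyMetaStore()
store["riddim.mp3"] = {"artist": "Bounty Killer", "title": "Riddim Killa"}
print(store["riddim.mp3"]["title"])
print(list(store))
```

Subclassing `MutableMapping` means methods such as `keys()`, `items()`, `get()`, and the `in` operator come for free once the five methods above are defined. Note that this sketch commits on every write, which is the behavior the manual-commit option is meant to avoid for bulk loads.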

If you are writing a lot of data to the database in a loop, pass manual_commit=True when initializing the database and call commit() on it once after the loop. This reduces the number of commits and speeds up writes.
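
The effect of committing once instead of on every write can be sketched with plain sqlite3. This is an illustration of the general principle, not of MetaVault internals: the helper `write_rows`, the table layout, and the row count are assumptions for this example, and the measured times will vary by machine and disk.

```python
import os
import sqlite3
import tempfile
import time


def write_rows(path, rows, commit_every_row):
    """Insert rows, committing either after each row or once at the end."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS items (key TEXT PRIMARY KEY, title TEXT)")
    start = time.perf_counter()
    for key, title in rows:
        conn.execute("INSERT OR REPLACE INTO items VALUES (?, ?)", (key, title))
        if commit_every_row:
            conn.commit()  # one transaction (and disk sync) per row
    conn.commit()  # a single commit covers anything still pending
    elapsed = time.perf_counter() - start
    count = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
    conn.close()
    return count, elapsed


rows = [(f"track_{i}.mp3", f"Title {i}") for i in range(500)]
with tempfile.TemporaryDirectory() as tmp:
    n_each, t_each = write_rows(os.path.join(tmp, "each.db"), rows, commit_every_row=True)
    n_once, t_once = write_rows(os.path.join(tmp, "once.db"), rows, commit_every_row=False)

print(f"commit per row: {n_each} rows in {t_each:.3f}s")
print(f"single commit:  {n_once} rows in {t_once:.3f}s")
```

Both runs store the same rows; only the number of transactions differs. On a file-backed database the single-commit run is typically much faster, because each commit forces SQLite to flush its journal to disk.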

Installation

Install using pip:

pip install metavault

Usage

from metavault import MetaVaultDatabase

# connect to database

database = MetaVaultDatabase('test.vault')

# create dataset (or version of dataset)

database.create_dataset('test', attributes=['artist', 'title']) # preload attributes
database['test'] = {} # different way to create dataset with no attributes

# get dataset

dataset = database['test']

# add data to dataset

dataset["riddim.mp3"] = {"artist": "Bounty Killer", "title": "Riddim Killa"}
dataset["ambient.mp3"] = {"artist": "Dog The Bounty Hunter", "title": "Trashcore"}

# iterate

for item in dataset:
    print(f"- {item}") # - {'riddim.mp3': {'artist': 'Bounty Killer', 'title': 'Riddim Killa'}}

# acts like a dictionary

print(f"\n{dataset['riddim.mp3']}") # {'artist': 'Bounty Killer', 'title': 'Riddim Killa'}

# remove data

del dataset['riddim.mp3']
print(dataset.keys()) # ['ambient.mp3']

# remove attribute

dataset.remove_attribute('artist')
print(dataset['ambient.mp3']) # {'title': 'Trashcore'}

# add attribute

dataset.add_attribute('artist')
dataset['ambient.mp3']['artist'] = "Dog The Bounty Hunter"
print(dataset['ambient.mp3']) # {'title': 'Trashcore', 'artist': 'Dog The Bounty Hunter'}

database.close()

# or with context manager

with MetaVaultDatabase('test.vault') as database:
    dataset = database['test']
    # export as various formats
    dataset.export_data('test.csv')
    dataset.export_data('test.json')
    dataset.export_data('test.jsonl')
    # or import
    dataset.import_data('test.csv')
    dataset.import_data('test.json')
    dataset.import_data('test.jsonl')

# write a lot of data with manual commit to improve performance

with MetaVaultDatabase('test.vault', manual_commit=True) as database:
    database.begin_transaction() # optional, makes a 'restore point' for rollback
    try:

        dataset = database['test']
        dataset.replace_in_attribute('artist', 'Bounty Killer', 'Bounty Killer 2024') # operations on entire set
        # `datas` is assumed to be an iterable of dicts, each with a 'filename' key
        for data in datas:
            dataset[data['filename']] = data
        database.commit()

    except Exception as e:
        database.rollback() # rollback to last 'restore point' if available
        print(e)

# make subsets

with MetaVaultDatabase('test.vault') as database:
    subset = database['test'].search(artist='Bounty Killer')
    subset_2 = database['test'].get_subset_by_key(['ambient.mp3', 'riddim.mp3'])
    subset_3 = database['test'].get_subset_by_amount(25, start=5, reverse=True)
    subset_4 = database['test'].get_subset_by_random(25)
    subset_4.export_data('subset_4.csv') # export just a subset
    combined = subset + subset_2 + subset_3 + subset_4
    database['test_subset'] = combined
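
The commit/rollback pattern used in the manual-commit example above can be sketched with plain sqlite3, where the last commit plays the role of the 'restore point'. This shows the underlying SQLite behavior only; how MetaVault's begin_transaction() and rollback() map onto it is an assumption, and the table and values are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (key TEXT PRIMARY KEY, artist TEXT)")
conn.execute("INSERT INTO items VALUES ('riddim.mp3', 'Bounty Killer')")
conn.commit()  # the committed state acts as the restore point

try:
    conn.execute("UPDATE items SET artist = 'Bounty Killer 2024' WHERE key = 'riddim.mp3'")
    # A duplicate primary key raises sqlite3.IntegrityError mid-batch.
    conn.execute("INSERT INTO items VALUES ('riddim.mp3', 'Duplicate')")
    conn.commit()
except sqlite3.Error as exc:
    conn.rollback()  # discard every change made since the last commit
    print(f"rolled back: {exc}")

artist = conn.execute("SELECT artist FROM items WHERE key = 'riddim.mp3'").fetchone()[0]
print(artist)
```

Because the failing INSERT happens before the commit, the rollback also undoes the earlier UPDATE, so the stored artist is unchanged.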
