A simple database for storing metadata associated with (media) files.
MetaVault
MetaVault is a simple database for storing metadata associated with (media) files. It is a thin wrapper around sqlite3 that mimics the behavior of a Python dictionary, designed to provide an easy-to-use interface for managing metadata.
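To illustrate what "a wrapper around sqlite3 that mimics a dictionary" can look like, here is a minimal sketch built only on the standard library. It is not MetaVault's implementation: the class name `DictTable`, the table layout, and the JSON encoding of values are all assumptions made for this example.

```python
import json
import sqlite3
from collections.abc import MutableMapping


class DictTable(MutableMapping):
    """Hypothetical dict-like view over one SQLite table (not MetaVault's code)."""

    def __init__(self, connection, table="items"):
        self.conn = connection
        self.table = table  # assumed to be a trusted identifier in this sketch
        self.conn.execute(
            f'CREATE TABLE IF NOT EXISTS "{table}" '
            "(key TEXT PRIMARY KEY, value TEXT NOT NULL)"
        )

    def __getitem__(self, key):
        row = self.conn.execute(
            f'SELECT value FROM "{self.table}" WHERE key = ?', (key,)
        ).fetchone()
        if row is None:
            raise KeyError(key)
        return json.loads(row[0])

    def __setitem__(self, key, value):
        # Values are stored as JSON text; every assignment is committed at once.
        self.conn.execute(
            f'INSERT OR REPLACE INTO "{self.table}" (key, value) VALUES (?, ?)',
            (key, json.dumps(value)),
        )
        self.conn.commit()

    def __delitem__(self, key):
        cursor = self.conn.execute(
            f'DELETE FROM "{self.table}" WHERE key = ?', (key,)
        )
        if cursor.rowcount == 0:
            raise KeyError(key)
        self.conn.commit()

    def __iter__(self):
        rows = self.conn.execute(
            f'SELECT key FROM "{self.table}" ORDER BY rowid'
        ).fetchall()
        return (row[0] for row in rows)

    def __len__(self):
        return self.conn.execute(
            f'SELECT COUNT(*) FROM "{self.table}"'
        ).fetchone()[0]


store = DictTable(sqlite3.connect(":memory:"))
store["riddim.mp3"] = {"artist": "Bounty Killer", "title": "Riddim Killa"}
store["ambient.mp3"] = {"artist": "Dog The Bounty Hunter", "title": "Trashcore"}
del store["riddim.mp3"]
remaining = list(store.keys())
```

Subclassing `MutableMapping` supplies `keys()`, `items()`, `get()`, and similar methods for free once the five methods above are defined.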
If you are writing a lot of data to the database in a loop, set manual_commit=True when initializing the database and call commit() once at the end of the loop. This reduces the number of commits and speeds up writes.
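The effect of batching commits can be seen with plain sqlite3. The sketch below is an illustration of the general pattern, not MetaVault's code: the `write_rows` helper, its `manual_commit` flag, and the `items` table are hypothetical names chosen for this example. With an on-disk database, each commit typically forces SQLite to flush to disk, so one commit per row tends to be much slower than a single commit at the end.

```python
import sqlite3


def write_rows(conn, rows, manual_commit=False):
    """Insert rows and return how many commits were issued.

    Hypothetical helper: commits after every row by default,
    or once at the end when manual_commit is True.
    """
    commits = 0
    for key, value in rows:
        conn.execute(
            "INSERT OR REPLACE INTO items (key, value) VALUES (?, ?)",
            (key, value),
        )
        if not manual_commit:
            conn.commit()
            commits += 1
    if manual_commit:
        conn.commit()  # single commit covering the whole loop
        commits += 1
    return commits


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (key TEXT PRIMARY KEY, value TEXT)")
rows = [(f"track_{i}.mp3", f"title {i}") for i in range(1000)]

per_row = write_rows(conn, rows)                      # one commit per row
batched = write_rows(conn, rows, manual_commit=True)  # one commit in total
```

Both calls leave the same data in the table; only the number of commits differs.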
Installation
Install using pip:
pip install metavault
Usage
from metavault import MetaVaultDatabase
# connect to database
database = MetaVaultDatabase('test.vault')
# create dataset (or version of dataset)
database.create_dataset('test', attributes=['artist', 'title']) # preload attributes
database['test'] = {} # different way to create dataset with no attributes
# get dataset
dataset = database['test']
# add data to dataset
dataset["riddim.mp3"] = {"artist": "Bounty Killer", "title": "Riddim Killa"}
dataset["ambient.mp3"] = {"artist": "Dog The Bounty Hunter", "title": "Trashcore"}
# iterate
for item in dataset:
    print(f"- {item}") # - {'riddim.mp3': {'artist': 'Bounty Killer', 'title': 'Riddim Killa'}}
# acts like a dictionary
print(f"\n{dataset['riddim.mp3']}") # {'artist': 'Bounty Killer', 'title': 'Riddim Killa'}
# remove data
del dataset['riddim.mp3']
print(dataset.keys()) # ['ambient.mp3']
# remove attribute
dataset.remove_attribute('artist')
print(dataset['ambient.mp3']) # {'title': 'Trashcore'}
# add attribute
dataset.add_attribute('artist')
dataset['ambient.mp3']['artist'] = "Dog The Bounty Hunter"
print(dataset['ambient.mp3']) # {'title': 'Trashcore', 'artist': 'Dog The Bounty Hunter'}
database.close()
# or with context manager
with MetaVaultDatabase('test.vault') as database:
    dataset = database['test']
    # export as various formats
    dataset.export_data('test.csv')
    dataset.export_data('test.json')
    dataset.export_data('test.jsonl')
    # or import
    dataset.import_data('test.csv')
    dataset.import_data('test.json')
    dataset.import_data('test.jsonl')
# write a lot of data with manual commit to improve performance
# (`datas` is assumed to be an iterable of dicts, each with a 'filename' key, defined elsewhere)
with MetaVaultDatabase('test.vault', manual_commit=True) as database:
    database.begin_transaction() # optional, makes a 'restore point' for rollback
    try:
        dataset = database['test']
        dataset.replace_in_attribute('artist', 'Bounty Killer', 'Bounty Killer 2024') # operations on entire set
        for data in datas:
            dataset[data['filename']] = data
        database.commit()
    except Exception as e:
        database.rollback() # rollback to last 'restore point' if available
        print(e)
# make subsets
with MetaVaultDatabase('test.vault') as database:
    subset = database['test'].search(artist='Bounty Killer')
    subset_2 = database['test'].get_subset_by_key(['ambient.mp3', 'riddim.mp3'])
    subset_3 = database['test'].get_subset_by_amount(25, start=5, reverse=True)
    subset_4 = database['test'].get_subset_by_random(25)
    subset_4.export_data('subset_4.csv') # export just a subset
    combined = subset + subset_2 + subset_3 + subset_4
    database['test_subset'] = combined
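The manual-commit example above pairs begin_transaction() with rollback() so that a failed batch leaves no partial writes. How MetaVault implements this is not shown here; the following is a sketch of the same idea using plain sqlite3, with a hypothetical `items` table and made-up sample rows, where a failing insert undoes everything written since `BEGIN`.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode,
# so the BEGIN / COMMIT / ROLLBACK statements below are explicit.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE items (key TEXT PRIMARY KEY, artist TEXT NOT NULL)")
conn.execute("INSERT INTO items VALUES ('riddim.mp3', 'Bounty Killer')")

# Hypothetical input: the second row violates the NOT NULL constraint.
datas = [
    {"filename": "ambient.mp3", "artist": "Dog The Bounty Hunter"},
    {"filename": "broken.mp3", "artist": None},
]

failure = None
conn.execute("BEGIN")  # the 'restore point'
try:
    for data in datas:
        conn.execute(
            "INSERT INTO items VALUES (?, ?)",
            (data["filename"], data["artist"]),
        )
    conn.execute("COMMIT")
except sqlite3.IntegrityError as error:
    conn.execute("ROLLBACK")  # discard every insert made since BEGIN
    failure = str(error)

keys = [row[0] for row in conn.execute("SELECT key FROM items ORDER BY key")]
```

After the rollback only the row written before `BEGIN` remains, even though the first insert of the batch succeeded on its own.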
Download files
Source Distribution
File details
Details for the file MetaVault-0.2.tar.gz.
File metadata
- Download URL: MetaVault-0.2.tar.gz
- Upload date:
- Size: 8.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.10.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 109925d15e69eae0b63cf5d0d0eaa26ee0fde7d7836afa656fdc9d2a66368db9 |
| MD5 | 580ffd208866828a3dd8e955164811f0 |
| BLAKE2b-256 | 6cf286a736d670e8d449ee111895f844164d2e89cbed9f1d269be3bf4e9bd22b |