Python library to load DBGZ files

dbgz

A small utility for reading and writing dbgz files.

Installation

Install using pip

pip install dbgz

or from source:

pip install git+https://github.com/filipinascimento/dbgz.git

Usage

First import dbgz:

import dbgz

Defining a scheme

scheme = [
  ("anInteger","i"),
  ("aFloat","f"),
  ("aString","s"),
  ("anIntArray","I"),
  ("aFloatArray","F"),
  ("anStringArray","S"),
]
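
The one-letter type codes are not explained here. Judging from the field names, lowercase codes appear to mark scalar values and uppercase codes arrays. The sketch below is a plain-Python helper, not part of dbgz (`TYPE_CODES` and `describe_scheme` are hypothetical names), that spells out that assumed reading and rejects unknown codes before any file is written.

```python
# Assumed meaning of each type code, inferred from the example field names.
TYPE_CODES = {
    "i": "integer",
    "f": "float",
    "s": "string",
    "I": "array of integers",
    "F": "array of floats",
    "S": "array of strings",
}

def describe_scheme(scheme):
    """Return {field name: description}, raising on unknown codes or duplicates."""
    described = {}
    for name, code in scheme:
        if code not in TYPE_CODES:
            raise ValueError(f"unknown type code {code!r} for field {name!r}")
        if name in described:
            raise ValueError(f"duplicate field name {name!r}")
        described[name] = TYPE_CODES[code]
    return described

scheme = [
    ("anInteger", "i"),
    ("aFloat", "f"),
    ("aString", "s"),
    ("anIntArray", "I"),
    ("aFloatArray", "F"),
    ("anStringArray", "S"),
]

for name, kind in describe_scheme(scheme).items():
    print(f"{name}: {kind}")
```

Running a check like this first keeps a typo in a type code from surfacing only when the writer is opened.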

Writing some data to a dbgz file

from tqdm.auto import tqdm  # Optional progress bar; install with: pip install tqdm

totalCount = 1000000
with dbgz.DBGZWriter("test.dbgz",scheme) as fd:
  # New entries can be added as:
  fd.write(anInteger=1, aString="1")
  fd.write(anInteger=2, aString="2", aFloat=5)
  fd.write(anInteger=3, aString="3",anIntArray=list(range(10)), aFloatArray=[0.1,0.2,0.3,0.5])

  # Here is a loop to write a lot of data:
  for index in tqdm(range(totalCount)):
    fd.write(
      anInteger=index,
      aFloat=index*0.01,
      anIntArray=list(range(index,index+10)),
      aString=str(index),
      aFloatArray=[index+0.1,index-0.2,index+0.3,index+0.4],
      anStringArray=[str(index),str(index+1),str(index+2),str(index+3)]
    )
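
Because `fd.write` takes the fields as keyword arguments, each record can also be assembled as a dictionary first and unpacked with `**`. This is a minimal sketch: `make_entry` is a hypothetical helper that mirrors the loop body above. It runs without dbgz, and the commented lines show where the writer from the example would come in.

```python
def make_entry(index):
    """Build one record matching the scheme defined earlier."""
    return {
        "anInteger": index,
        "aFloat": index * 0.01,
        "aString": str(index),
        "anIntArray": list(range(index, index + 10)),
        "aFloatArray": [index + 0.1, index - 0.2, index + 0.3, index + 0.4],
        "anStringArray": [str(index + offset) for offset in range(4)],
    }

entries = [make_entry(index) for index in range(3)]

# Each dictionary can then be unpacked into keyword arguments:
#   with dbgz.DBGZWriter("test.dbgz", scheme) as fd:
#       for entry in entries:
#           fd.write(**entry)
for entry in entries:
    print(entry["anInteger"], entry["anStringArray"])
```

Building the record separately makes it easy to inspect or test the data before it is written.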

Reading the dbgz file sequentially:

with dbgz.DBGZReader("test.dbgz") as fd:
  print(fd.scheme)
  for entry in tqdm(fd.entries,total=fd.entriesCount):
    assert entry["anInteger"] == int(entry["aString"])

Reading the dbgz file manually, in batches of 10 entries:

with dbgz.DBGZReader("test.dbgz") as fd:
  pbar = tqdm(total=fd.entriesCount)
  print(fd.scheme)
  while True:
    entries = fd.read(10)
    if not entries:
      break
    for entry in entries:
      assert entry["anInteger"] == int(entry["aString"])
    pbar.update(len(entries))
pbar.refresh()
pbar.close()
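
The batch loop above can be wrapped in a generator so that callers iterate over batches directly. The sketch assumes only what the example shows: `read(n)` returns a list of entries, and an empty (falsy) result once the file is exhausted. `iter_batches` and the stand-in `ListReader` are hypothetical names, used so the snippet runs without a dbgz file.

```python
def iter_batches(reader, batch_size):
    """Yield lists of entries until reader.read() returns nothing."""
    while True:
        entries = reader.read(batch_size)
        if not entries:
            break
        yield entries

class ListReader:
    """Stand-in for an open reader: serves entries from a list in order."""

    def __init__(self, entries):
        self._entries = list(entries)
        self._position = 0

    def read(self, count):
        batch = self._entries[self._position:self._position + count]
        self._position += len(batch)
        return batch

reader = ListReader({"anInteger": i, "aString": str(i)} for i in range(25))
batch_sizes = []
for batch in iter_batches(reader, 10):
    for entry in batch:
        assert entry["anInteger"] == int(entry["aString"])
    batch_sizes.append(len(batch))
print(batch_sizes)  # [10, 10, 5]
```

With a real file, the object returned by `dbgz.DBGZReader` would take the place of `ListReader`, provided its `read` behaves as assumed.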

Generating an index dictionary in memory:

with dbgz.DBGZReader("test.dbgz") as fd:
  indexDictionary = fd.generateIndex("anInteger",
    indicesPath=None,
    filterFunction=lambda entry: entry["anInteger"]<10,
    useDictionary=True,
    showProgressbar = True
    )
  for key,values in indexDictionary.items():
    print(key,values)
    for value in values:
      assert int(key) == fd.readAt(value)[0]["anInteger"]
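
Judging from the loop above, the returned dictionary maps each value of the indexed field to the positions of the matching entries, which is why every position can be passed back to `fd.readAt`. The sketch below reproduces that shape in plain Python over an in-memory list, using list offsets in place of file positions. `build_index` is a hypothetical helper and is not how dbgz implements `generateIndex`.

```python
from collections import defaultdict

def build_index(entries, key, filter_function=None):
    """Map each value of `key` to the positions of the entries holding it."""
    index = defaultdict(list)
    for position, entry in enumerate(entries):
        # Skip entries rejected by the optional filter.
        if filter_function is not None and not filter_function(entry):
            continue
        index[entry[key]].append(position)
    return dict(index)

entries = [{"anInteger": i % 4, "aString": str(i)} for i in range(10)]
index = build_index(
    entries,
    "anInteger",
    filter_function=lambda entry: entry["anInteger"] < 3,
)

# Every stored position leads back to an entry with the same key.
for key, positions in index.items():
    for position in positions:
        assert entries[position]["anInteger"] == key
print(index)
```

Entries rejected by the filter never appear in the index, so lookups only touch the subset of interest.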

Saving the index dictionary to a file and loading it again:

with dbgz.DBGZReader("test.dbgz") as fd:
  fd.generateIndex("anInteger",
    indicesPath="test_byAnInteger.idbgz", 
    filterFunction=lambda entry: entry["anInteger"]<10,
    useDictionary=True,
    showProgressbar = True
    )

  indexDictionary = dbgz.readIndicesDictionary("test_byAnInteger.idbgz")
  for key,values in indexDictionary.items():
    print(key,values)
    for value in values:
      assert int(key) == fd.readAt(value)[0]["anInteger"]

Source Distribution

dbgz-0.4.2.tar.gz (18.0 kB)

Uploaded Source

File details

Details for the file dbgz-0.4.2.tar.gz.

File metadata

  • Download URL: dbgz-0.4.2.tar.gz
  • Size: 18.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/32.0 requests/2.27.1 requests-toolbelt/0.9.1 urllib3/1.26.8 tqdm/4.62.3 importlib-metadata/4.11.1 keyring/23.5.0 rfc3986/2.0.0 colorama/0.4.4 CPython/3.8.12

File hashes

Hashes for dbgz-0.4.2.tar.gz
Algorithm Hash digest
SHA256 25f2c6c20eb64df369295a28b5eaaee86d54cbe62b692a4826f5161023afa93a
MD5 ad73783a3c137c9d0eb7efaecb4d998b
BLAKE2b-256 251c4ade5de3d52334bd13d3024915ef69e81b1d04e01c7535284f3205c77b7e
