Sqlite based cache for python projects

These details have not been verified by PyPI

Project links

Homepage

Development Status
- 3 - Alpha
Intended Audience
- Developers
Natural Language
- English
Programming Language

Project description

dinkycache for python projects

A very small and flexible name/value cache for python.

Intended for quick set up, in development and small scale projects.

Uses sqlite for storage and lzstring for compression.

Stores any data that can be parsed in to a string with json.dumps() and json.loads()
Returns int, dict and str just fine, but returns a list if supplied a tuple

Install

From pip

python -m pip install dinkycache

From github

# Copy the 'dinkycache' directory and 
# requirements.txt into your cwd
# Install dependencies:
python -m pip install -r requirements.txt

How to use

Import

from dinkycache import Dinky

3 main methods called like so:

Dinky().read(str: id)
Dinky().write(str: id, str:data, float:ttl)
Dinky().delete(str: id) -> int

Examples with default settings

from dinkycache import Dinky

#gets data from some slow source
def fetch_data(id):
    return "some data"

id = "001"

Then where you would normaly write:

results = fetch_data(id)

Write these two lines instead:

if (result := Dinky().read(id) == False):
    Dinky().write(id, result := fetch_data(id))

If you are running Python < 3.8 or just don't like walruses:

results = Dinky().read(id)
if results == False:
    results = fetch_data(id)
    Dinky().write(id, results)

This is also an option, its there, its fully supported, however not further documented:

    #Write:
    d = Dinky()
    d.id = "test"
    d.data = {"whatever": "floats"}
    d.setTTL(24) #hr
    d.write()
    print(d.data)

    #Read:
    d = Dinky()
    d.id = "test
    results = d.read()
    print(results)

In either case results will contain the data from cache if its there and within the specified TTL. Or it will call your get_some_data() to try and fetch the data instead.

Settings

Avaialble settings and default values

    dbfile: str = "dinkycache.db",  # name of sqlite file
    ttl: float = 2160,              # time to live in hours, default 2160 = 90 days, 0 = no expiry
    purge_rows: bool = True,        # will enforce row_limit if true
    row_limit: int = 10000,         # maximum number of rows in db
    row_overflow: int = 1000,       # buffer zone above row_limit before anything is deleted
    clean_expired: bool = True,     # will delete outdated entries if true
    clean_hrs: int = 24,            # time between cleanups of expried entries
    clean_iterations: int = 100,    # iterations (reads/writes) between cleanups

Set them in one of the following ways

# Positional arguments:
Dinky('preferred.db', 24).read(id)

# Keyword arguments:
Dinky(dbfile='preferred.db').read(id)

# Unpack list as positional arguments:
settings = ['preferred.db', 24]
Dinky(*settings).read(id)

# Unpack dict as keyword arguments:
settings = {
    'dbfile' = 'preferred.db',
    'ttl' = 24,
}
Dinky(**settings).read(id)

Examples of use with user-defined settings

You can destruct a dict an pass it as settings each time you invoke Dinky(**settings), or assign the new Dinky object to a variable and re-use it that way:

Invoke on every use:

settings = {
    'dbfile' = 'preferred.db',
    'purge_rows' = True,
    'clean_expired' = False,
    'row_limit' = 100,
    'ttl' = 0,
}

if (result := Dinky(**settings).read(id) == False):
    Dinky(**settings).write(id, result := fetch_data(id))

Retain Dinky object:

d = Dinky(
    dbfile = 'preferred.db',
    purge_rows = True,
    clean_expired = False,
    row_limit = 100,
    ttl = 0,
)

if (result := d.read(id) == False):
    d.write(id, result := fetch_data(id))

clean_expired, clean_hrs and clean_iterations

If clean_expired = True, script will try to clean out expired entries every time data is written if one of the following conditions are met.
It has been minimum clean_hrs: int = 24 hours since last cleanup
OR
There have been more than clean_iterations: int = 100 calls since last cleanup

The cleanup function comes at a 75% performance cost, so if it runs on every 100 write, that amounts to a 7.5% average performance cost.

clean_expired might therefore be a much better alternative than using purge_rows for larger amounts of data.

purge_rows, row_limit and row_overflow

If purge_rows = True, script will try to clean out overflowing lines every time data is written.
row_limit = int sets the maximum lines in the database.
row_overflow = int how many lines over row_limit before row_limitis enforced

This comes at a great performance cost for larger databases. 462 ms to sort 100k rows on a 1.8 ghz Intel Core i5. For this reason row_overflow is added as a buffer threshold, so that deletion dont happen on every call to .write.

It is probably best used for small databases and/or databases with small entries.

Public methods

.read()

Arguments id (string, required)
Returns data corresponding to id, or False if there is no data
Can be called without arguments on existing object if id has alredy been set.

.write()

Arguments id (string, required), data (string, required), tll (int, optional)
Stores the supplied data to that id, tll can be set here if not already passed on invocation
Returns the hashed id or False
Will do clean_expiredand purge_rows if they are set True
Can be called without arguments on existing object if id and data has alredy been set.

.delete()

Arguments id (string, required)
Deletes entry corresponding to that id
Returns number of rows deletet, 1 or 0.
Can be called without arguments on existing object if id has alredy been set.

Setting TTL to seconds, months etc:

Its been a bit of a discussion whats the most sensible choice of default time unit for TTL, we landed on hours as a float.

The idea is that hours will be sufficient and sensible enough in most usercases. However it also allows for a workaround if you need to set lower or higher values:

    10 seconds:
    Dinky().write(id = "1", data = "some data", ttl = 10 / 3600)
    10 minutes:
    Dinky().write(id = "1", data = "some data", ttl = 10 / 60)
    10 months:
    Dinky().write(id = "1", data = "some data", ttl = 10 * 720)
    10 years:
    Dinky().write(id = "1", data = "some data", ttl = 10 * 8760)

Alternatively you can expiry in seconds directly on the Dinky object like so

    d = Dinky()
    d.expires = 10 + d.now
    d.write(id = "1", data = "some data")

Performance

This wont ever compete with Redis, MongoDB or anything like that. This is ment to be a small, easy solution for small scale use cases where you dont want or need any big dependencies. Hence performance will be less, but might still be orders of magnitude faster than repeatedly parsing the data from some website.

Tests:

Reads from DB 1

10k entries of 40 to 1500 characters:

1 read = 0.018 to 0.003 sec
100 reads = 0.6 sec (0.006 avg)

Reads from DB 2

10k entries of 40 to 150000 characters:
1 read = 0.003 to 0.022 sec
100 reads = 1.1 to 2.4 sec (0.015 avg)

Test DB 3:

38.1mb: 100k writes str_len 40~1500: avg 11.3ms (incl generation)

10k reads: 6.57 ms avg

Security

Ids are hashed, so you may put anything in there Data is compressed to a string of base 64 characters, so you may put anything in there.

Lzstring seem to have very high integrity, we have not been able to produce a test result where the input and output has not been equal.

That said, what you put in is what you'll get out. There is no checking for html-tags and such. Just something to bevare of if for some reason you'll use it to store and later display user provided data.

Compression

Lzstring is not great for shorter strings, and does sometimes even increase to string lenght. However in testing we found that short strings (80 to 1500 chars) have an average compression rate of 98%, while strings longer than 60000 characters have an average compression rate of 48%. Testing was done with random as well as real world data.

So there is most likely some performance loss, but it is outweighed by smaller database files and the fact that base 64 strings makes life very easy.

Project details

These details have not been verified by PyPI

Project links

Homepage

Development Status
- 3 - Alpha
Intended Audience
- Developers
Natural Language
- English
Programming Language

Release history Release notifications | RSS feed

This version

1.0.3

Aug 24, 2022

1.0.2

Jul 11, 2022

1.0.1

Jul 11, 2022

0.9.1

Jul 11, 2022

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dinkycache-1.0.3.tar.gz (7.5 kB view details)

Uploaded Aug 24, 2022 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

dinkycache-1.0.3-py3-none-any.whl (7.2 kB view details)

Uploaded Aug 24, 2022 Python 3

File details

Details for the file dinkycache-1.0.3.tar.gz.

File metadata

Download URL: dinkycache-1.0.3.tar.gz
Upload date: Aug 24, 2022
Size: 7.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.1 CPython/3.9.13

File hashes

Hashes for dinkycache-1.0.3.tar.gz
Algorithm	Hash digest
SHA256	`5d6fbbcccc96da76bf4d4feaff36746ab538a782214a37b427d7779e70a9d067`
MD5	`23ee0c458de93f24a87e661414d6c411`
BLAKE2b-256	`322c5016485628ed6e6367e33ca79165ea9d6da3c5bbaf301575351ae104cc57`

See more details on using hashes here.

File details

Details for the file dinkycache-1.0.3-py3-none-any.whl.

File metadata

Download URL: dinkycache-1.0.3-py3-none-any.whl
Upload date: Aug 24, 2022
Size: 7.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.1 CPython/3.9.13

File hashes

Hashes for dinkycache-1.0.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`8f7e9327649ab529f0cd44352c988d65425072cc0020c3971f34a58a0116907f`
MD5	`59cf4f9652a2e03de62441802dfc4de8`
BLAKE2b-256	`73c556036b7054347ef4cd101913ca4845948ffcf5cf6fb3a50245c0461c65e0`

See more details on using hashes here.

dinkycache 1.0.3

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

dinkycache for python projects

Install

How to use

Examples with default settings

Settings

Examples of use with user-defined settings

Invoke on every use:

Retain Dinky object:

clean_expired, clean_hrs and clean_iterations

purge_rows, row_limit and row_overflow

Public methods

.read()

.write()

.delete()

Setting TTL to seconds, months etc:

Performance

Tests:

Security

Compression

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes