McCache is a write-through, cluster-aware, local in-memory caching library.

McCache for Python

Overview

McCache is a write-through, cluster-aware, local in-memory caching library that is built on Python's OrderedDict class. A local cache lookup is faster than retrieving the value across a network. It uses UDP multicast as the transport, hence the name "Multi-Cast Cache", playfully abbreviated to "McCache".
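To make the idea of an OrderedDict-backed cache concrete, here is a minimal sketch of a bounded dictionary that evicts its oldest entry when full. It is not McCache's implementation: the class name BoundedCache, the max_entries parameter and the eviction policy are assumptions chosen for illustration, following the LRU recipe in the Python documentation.

```python
from collections import OrderedDict


class BoundedCache(OrderedDict):
    """Hypothetical example: a dict that holds at most max_entries items."""

    def __init__(self, max_entries: int = 256):
        self.max_entries = max_entries
        super().__init__()

    def __setitem__(self, key, value):
        # Updating an existing key makes it the newest entry.
        if key in self:
            self.move_to_end(key)
        super().__setitem__(key, value)
        # Evict the oldest entry once the limit is exceeded.
        if len(self) > self.max_entries:
            oldest = next(iter(self))
            del self[oldest]


cache = BoundedCache(max_entries=2)
cache["a"] = 1
cache["b"] = 2
cache["a"] = 10   # "a" becomes the newest entry
cache["c"] = 3    # "b" is now the oldest and is evicted
print(list(cache.items()))
```

Because the class inherits from OrderedDict, callers keep the ordinary dictionary interface, which is the same property the overview describes.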

The goals of this package are:

  1. Reduce complexity by not depending on any external caching service such as memcached, redis or the like. SEE: Distributed Cache
    • We are guided by the principle of scaling up first before scaling out.
  2. Keep the same Python programming experience. It is the same interface as Python's dictionary. The distributed nature of the cache is transparent to you.
    • This is an in-process cache that is cluster aware.
  3. Performant
    • It needs to handle rapid updates that are 10 ms (0.01 sec) apart or faster.
  4. Secure
    • All transmissions across the network are encrypted when MCCACHE_CRYPTO_KEY is configured.

McCache is not a replacement for your persistent or search data store. It is intended to cache your most expensive work. Consider the Pareto principle (the 80/20 rule): roughly 80% of requests tend to touch only about 20% of the data, so caching that most frequently accessed 20% can improve performance for most requests. This principle gives you the option to reduce your hardware requirements. Only you can decide how much to cache.
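As a rough illustration of the 80/20 reasoning, the sketch below estimates the share of requests served from the cache when only the most popular keys are cached. The Zipf-like popularity model (the key ranked r is requested with weight 1/r) and the function name hit_ratio are assumptions made for this example; real access patterns should be measured in your own environment.

```python
def hit_ratio(total_keys: int, cached_fraction: float) -> float:
    """Share of requests served from cache when the most popular keys are cached.

    Assumption: popularity is Zipf-like, i.e. the key ranked r has weight 1/r.
    """
    weights = [1.0 / rank for rank in range(1, total_keys + 1)]
    cached_keys = int(total_keys * cached_fraction)
    return sum(weights[:cached_keys]) / sum(weights)


ratio = hit_ratio(total_keys=1000, cached_fraction=0.20)
print(f"Caching the top 20% of keys serves {ratio:.0%} of requests")
```

Under this assumed distribution the cached fifth of the keys serves close to four fifths of the requests, which is the kind of trade-off the paragraph above describes.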

Installation

pip  install  mccache

Example

import  mccache
from    datetime  import  UTC
from    datetime  import  datetime  as  dt
from    pprint    import  pprint    as  pp

c = mccache.get_cache( 'demo' )
k = 'k1'

c[ k ] = dt.now( UTC )   # Insert a cache entry
print(f"Inserted on {c[ k ]}")

c[ k ] = dt.now( UTC )   # Update a cache entry
print(f"Updated  on {c[ k ]}")
print(f"Metadata for key '{k}' is {c.metadata[ k ]}")

del c[ k ] # Delete a cache entry
if  k  not in c:
    print(f" {k}  is not in the cache.")

k = 'k2'
c[ k ] = dt.now( UTC )   # Insert another cache entry
print(f"Inserted on {c[ k ]}")

# At this point all the caches with namespace 'demo' in the cluster are identical, each holding a single entry with key 'k2'.

# Query the local cache checksum and metrics.
pp( mccache.get_local_checksum( 'demo' ))
pp( mccache.get_local_metrics(  'demo' ))

# Request the other members in the cluster to log out their local cache metrics.
mccache.get_cluster_metrics()

In the above example, McCache is used no differently from a regular Python dictionary. The benefit shows in a clustered environment, where the caches of the other subscribed members are kept coherent with the changes to your local cache.

Guidelines

The following are some loose guidelines to help you assess if the McCache library is right for your project.

  • You need to avoid depending on an external caching service.
  • You want to keep the programming consistency of a Python dictionary.
  • You have a small cluster of identically configured nodes.
  • You have a medium-sized set of objects to cache.
  • Your cached objects do not mutate frequently.
  • Your cached objects are small.
  • Your cluster environment is secured by other means.
  • The clocks of the nodes in your cluster are well synchronized.

The adjectives used above are intentionally loose and should be quantified for your environment and needs.
SEE: Testing

You can review the script used in the stress test.
SEE: Test script

You should clone this repo and run the test in a local docker/podman cluster.
SEE: Contributing

We suggest the following testing to collect metrics of your application running in your environment.

  1. Import the McCache library into your project.
  2. Use it in your data access layer by populating and updating the cache, but don't use the cached values yet.
  3. Enable debug logging by providing a path for your log file.
  4. Compare the values retrieved from your existing cache with those from McCache.
  5. Run your application for an extended period and exit.
  6. Log out the summary metrics for more extended analysis.
  7. Review the metrics to quantify the fit to your application and environment. SEE: Testing
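The trial procedure above can be sketched as a "shadow" cache: every write goes to both stores, every read is served from your existing store, and the shadow value is only compared. The helper names shadow_write and shadow_read and the tally labels are hypothetical, and plain dictionaries stand in for both your existing cache and the McCache instance.

```python
from collections import Counter


def shadow_write(key, value, primary: dict, shadow: dict) -> None:
    """Write to both stores so the shadow cache sees real traffic."""
    primary[key] = value
    shadow[key] = value


def shadow_read(key, primary: dict, shadow: dict, tally: Counter):
    """Serve the value from the primary store; only compare with the shadow."""
    value = primary[key]
    if key not in shadow:
        tally["miss"] += 1
    elif shadow[key] == value:
        tally["match"] += 1
    else:
        tally["mismatch"] += 1
    return value


primary: dict = {}   # stand-in for your existing cache
shadow: dict = {}    # stand-in for a McCache instance
tally: Counter = Counter()

shadow_write("k1", "v1", primary, shadow)
shadow_write("k2", "v2", primary, shadow)
shadow["k2"] = "stale"   # simulate an incoherent entry
primary["k3"] = "v3"     # simulate an entry the shadow never received

for key in ("k1", "k2", "k3"):
    shadow_read(key, primary, shadow, tally)
print(dict(tally))
```

A high mismatch or miss count over an extended run would suggest that the cache settings do not yet fit your workload.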

Saving

Removing an external dependency from your architecture reduces its complexity, not to mention some capital cost savings.
SEE: Cloud Savings

Configuration

The following are environment variables you can tune to fit your production environment needs.

Name Default Comment
MCCACHE_CACHE_TTL 3600 secs (1 hour) Maximum number of seconds a cached entry can live before eviction. Update operations reset the timer.
MCCACHE_CACHE_MAX 256 entries The maximum number of entries per cache.
MCCACHE_CACHE_MODE 1 The degree of keeping the cache coherent in the cluster.
0: Only members that have the same key in their cache are updated.
1: All members' caches are kept fully coherent and synchronized.
MCCACHE_CACHE_SIZE 8,388,608 bytes (8 MiB) The maximum in-memory size per cache.
MCCACHE_CACHE_PULSE 300 secs (5 min) The interval at which a synchronization pulse is sent to the other members in the cluster.
MCCACHE_CRYPTO_KEY The encryption/decryption key. Encryption is enabled when a key value is present. Generate the key as follows:
  from cryptography.fernet import Fernet
  print( Fernet.generate_key() )

Enabling this will increase the payload size by at least 30% and also increase CPU usage.
MCCACHE_PACKET_MTU 1472 bytes The smallest maximum transmission unit (MTU) of a network packet across all the network interfaces between members.
Generally, an Ethernet frame carries 1500 bytes, which leaves 1472 bytes after the 20-byte IP header and the 8-byte UDP header (the ICMP header used by ping is also 8 bytes).
SEE: mccache.get_mtu()
MCCACHE_MULTICAST_IP 224.0.0.3 [ :4000 ] The multicast IP address and the optional port number for your group to multicast within.
SEE: IANA multicast addresses.
MCCACHE_MULTICAST_HOPS 1 hop The maximum number of network hops. 1 stays within the same switch/router. [>=1]
SEE: mccache.get_hops()
MCCACHE_CALLBACK_WIN 5 secs The window, in seconds, within which the last lookup and the current change must both fall to trigger a callback to a function provided by you.
MCCACHE_DAEMON_SLEEP 2 secs The snooze duration for the daemon housekeeper before it wakes up to check the state of the cache.
MCCACHE_LOG_FILENAME ./log/mccache.log The local file to which log messages are appended.
MCCACHE_LOG_FORMAT The custom logging format for your project.
SEE: Variables log_format and log_msgfmt in __init__.py
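The note on MCCACHE_CRYPTO_KEY says encryption grows the payload by at least 30%. A back-of-the-envelope sketch of where that comes from, based on the published Fernet token layout (version, timestamp, IV, padded AES-CBC ciphertext, HMAC, all base64url encoded), is shown below. The function name fernet_token_size is hypothetical, and any framing McCache itself adds on the wire is not modeled here.

```python
import math


def fernet_token_size(plaintext_len: int) -> int:
    """Length in bytes of a Fernet token for a plaintext of the given length."""
    # AES-128-CBC with PKCS7 padding always pads up to the next 16-byte block.
    ciphertext_len = 16 * (plaintext_len // 16 + 1)
    # version (1) + timestamp (8) + IV (16) + ciphertext + HMAC-SHA256 (32)
    raw_len = 1 + 8 + 16 + ciphertext_len + 32
    # base64url with padding emits 4 bytes for every 3 bytes of input.
    return 4 * math.ceil(raw_len / 3)


for size in (100, 1000, 10000):
    token = fernet_token_size(size)
    print(f"{size:>6} bytes -> {token:>6} bytes ({token / size - 1:.0%} larger)")
```

The base64 step alone accounts for about a third more bytes, and the fixed 57 bytes of token metadata weigh more heavily on small values, so small cached objects see a proportionally larger increase.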

The following are parameters you can tune to fit your stress testing needs.

TEST_RANDOM_SEED 4th octet of the IP address The random seed for each different node in the test cluster.
TEST_KEY_ENTRIES 200 key/values The maximum number of randomly generated keys.
The smaller the number, the higher the chance of cache collision. Tune this number down to add stress to the test.
TEST_DATA_SIZE_MIX 1 The data packet size mix.
1: Cache small objects where size < 1 KB.
2: Cache large objects where size > 9 KB.
3: Random mix of small and large objects.
Tune this number to 2 to add stress to the test.
TEST_RUN_DURATION 5 mins The duration in minutes of the testing run.
The larger the number, the longer the test run. Tune this number up to add stress to the test.
TEST_APERTURE 0.01 sec The centerpoint of a range of durations to snooze within, e.g. for the value of 0.01 (10 ms), the snooze range is between 6.5 ms and 13.5 ms. Tune this number down to add stress to the test.
TEST_MONKEY_TANTRUM 0 The percentage of dropped packets.
The larger the number, the more unsent packets. Tune this number up to add stress to the test.
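The TEST_APERTURE example (0.01 sec giving 6.5 ms to 13.5 ms) implies a range of plus or minus 35% around the centerpoint. A small sketch of that arithmetic follows. The 0.35 spread is inferred from the example figures, and the uniform draw and the function names are assumptions; the test script may pick its snooze durations differently.

```python
import random


def snooze_range(aperture: float, spread: float = 0.35) -> tuple[float, float]:
    """Lower and upper snooze bounds around the aperture centerpoint."""
    return aperture * (1 - spread), aperture * (1 + spread)


def next_snooze(aperture: float, rng: random.Random) -> float:
    """Draw one snooze duration uniformly from the range (an assumption)."""
    low, high = snooze_range(aperture)
    return rng.uniform(low, high)


low, high = snooze_range(0.01)
print(f"{low * 1000:.1f} ms to {high * 1000:.1f} ms")

rng = random.Random(1)   # fixed seed so the run is repeatable
samples = [next_snooze(0.01, rng) for _ in range(5)]
print([f"{s * 1000:.2f} ms" for s in samples])
```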

pyproject.toml

Specifying tuning parameters via pyproject.toml file.

[tool.mccache]
cache_ttl = 900
packet_mtu = 1472
multicast_hops = 3

Environment variables

Specifying tuning parameters via environment variables.

#  Unix
export MCCACHE_CACHE_TTL=900
export MCCACHE_PACKET_MTU=1472
export MCCACHE_MULTICAST_HOPS=3
::  Windows
SET MCCACHE_CACHE_TTL=900
SET MCCACHE_PACKET_MTU=1472
SET MCCACHE_MULTICAST_HOPS=3

Environment variables supersede the static setting in the pyproject.toml file.

Network check

Two utility methods are provided to help you determine the MTU size in your network and the number of network hops to the other members in the cluster. The following example invokes these methods:

python -c "import mccache; mccache.get_mtu( '142.250.189.174')"
python -c "import mccache; mccache.get_hops('142.250.189.174')"
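Once the MTU is known, the usable payload per packet follows from simple arithmetic, and a larger value has to be split across several packets. The sketch below shows that calculation under the assumption of IPv4 without options over UDP. The function names are hypothetical, and any per-fragment header that McCache adds to its own packets is not modeled.

```python
IP_HEADER = 20    # bytes, IPv4 header without options
UDP_HEADER = 8    # bytes


def max_payload(mtu: int = 1500) -> int:
    """Bytes left for application data in one unfragmented UDP datagram."""
    return mtu - IP_HEADER - UDP_HEADER


def fragment(payload: bytes, packet_size: int) -> list[bytes]:
    """Split a payload into chunks no larger than packet_size bytes."""
    return [payload[i:i + packet_size] for i in range(0, len(payload), packet_size)]


size = max_payload()
chunks = fragment(b"x" * 3000, size)
print(size, [len(chunk) for chunk in chunks])
```

Keeping each datagram within this size avoids IP-level fragmentation, which is why the smallest MTU along the path is the value to configure.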

Public utility methods

# Factory method to get a cache instance.
def get_cache( name: str | None=None ,callback: FunctionType = _default_callback ) -> PyCache:

# Clear all the distributed caches.
def clear_cache( name: str | None = None ,node: str | None = None ) -> None:

# Get the maximum MTU between this and another cluster member.
def get_mtu( ip_add: str ) -> None:

# Get the number of network hops between this and another cluster member.
def get_hops( ip_add: str ,max_hops: int | None = 20 ) -> None:

# Get the instance cache metrics from the current node.
def get_local_metrics( name: str | None = None ) -> dict:

# Get the instance cache checksum from the current node.
def get_local_checksum( name: str | None = None ,key: str | None = None ) -> dict:

# Request all members to output their metrics into their log.
def get_cluster_metrics( name: str | None = None ,node: str | None = None ) -> None:

# Request all members to output their cache checksum into their log.
def get_cluster_checksum( name: str | None = None ,key: str | None = None ,node: str | None = None ) -> None:

Design

Background Story

Releases

Releases are recorded here.

License

McCache is distributed under the terms of the MIT license.

Contribute

We welcome your contribution. Please read contributing to learn how to get set up to contribute to this project.

McCache is still a young project. With that said, please try it out in your applications: we need your feedback to fix the bugs and file down the rough edges.

Issues and feature requests can be posted here. Help us port this library to other languages. The repos are set up under the GitHub McCache organization. You can reach our administrator at elau1004@netscape.net.

Support

For any inquiries, bug reports, or feature requests, please open an issue in the GitHub repository.

Miscellaneous
