Library for reading and writing Map With Tree files
Map With Tree
A Python library for reading and writing compressed, sorted key-value stores with efficient lookup using a B-tree-like structure.
Features
- Efficient Storage: Data is compressed using zstandard (zstd) with configurable compression levels
- Fast Lookups: Built-in B-tree index for O(log n) key lookups
- Sorted Keys: Keys are automatically sorted during finalization for efficient traversal
- Data Integrity: MD5 hash of all entries is computed and stored for verification
- Flexible Types: Support for multiple key and value types including bytes, strings, integers, floats, and JSON
- Memory Efficient: Block-based compression and caching minimize memory usage
- Simple API: Pythonic interface with context managers and dict-like operations
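As an illustration of the data-integrity feature, the sketch below computes one MD5 digest over a set of key-value entries. How the library actually feeds entries into the hash is not documented here, so the sorted order, the length-prefix framing, and the name `entries_digest` are all assumptions made for this example.

```python
import hashlib


def entries_digest(entries):
    """Return an MD5 hex digest over (key, value) byte pairs.

    Hypothetical helper: the real on-disk hashing scheme may differ.
    """
    digest = hashlib.md5()
    # Hash in sorted key order so insertion order does not affect the result.
    for key, value in sorted(entries):
        # Length-prefix each field so (b"ab", b"c") and (b"a", b"bc") differ.
        digest.update(len(key).to_bytes(8, "little"))
        digest.update(key)
        digest.update(len(value).to_bytes(8, "little"))
        digest.update(value)
    return digest.hexdigest()


written = [(b"key_2", b"value_2"), (b"key_1", b"value_1")]
read_back = [(b"key_1", b"value_1"), (b"key_2", b"value_2")]
print(entries_digest(written) == entries_digest(read_back))
```

A reader can recompute the digest while iterating and compare it with the stored one to detect corruption.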
Installation
pip install map_with_tree
Requires Python >= 3.8 and zstd >= 1.5.
Quick Start
Writing Data
import map_with_tree
# Create a new map file with string values
with map_with_tree.open("data.mwt", "w", values_type="string") as writer:
    writer.add_entry(b"key_1", "value_1")
    writer.add_entry(b"key_2", "value_2")
    writer.add_entry(b"key_3", "value_3")
Reading Data
import map_with_tree
# Open and read from a map file
with map_with_tree.open("data.mwt") as reader:
    # Get a value by key
    value = reader[b"key_1"]
    # Check if key exists
    if b"key_2" in reader:
        print("Key exists!")
    # Get with default value
    value = reader.get(b"key_99", default="not found")
    # Iterate over all entries
    for key, value in reader:
        print(f"{key}: {value}")
    # Get file metadata
    print(f"Total entries: {len(reader)}")
    print(f"Header: {reader.header}")
API Reference
Opening Files
map_with_tree.open(path, mode="r", **kwargs)
- path: File path for the map file
- mode: "r" for reading, "w" for writing
- **kwargs: Additional options for writing (see Writer Options)
Writer Options
MapWithTreeWriter(
    path,
    header=None,            # Custom header metadata (dict)
    keys_type="bytes",      # Type for keys
    values_type="bytes",    # Type for values
    keys_per_node=128,      # Number of keys per B-tree node
    block_size=64*1024,     # Block size for compression (64KB default)
    compression_level=3     # zstd compression level (0-22)
)
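To show what `block_size` trades off, here is a rough sketch of block-wise compression. It uses `zlib` from the standard library purely as a stand-in for zstd, and the helpers `compress_in_blocks` and `decompress_blocks` are hypothetical names, not part of map_with_tree.

```python
import zlib
from typing import List


def compress_in_blocks(payload: bytes, block_size: int, level: int = 3) -> List[bytes]:
    """Split payload into fixed-size blocks and compress each one independently."""
    return [
        zlib.compress(payload[start:start + block_size], level)
        for start in range(0, len(payload), block_size)
    ]


def decompress_blocks(blocks: List[bytes]) -> bytes:
    """Reverse compress_in_blocks by decompressing and joining every block."""
    return b"".join(zlib.decompress(block) for block in blocks)


# Synthetic payload of a few hundred kilobytes.
payload = b"".join(f"key_{i:06d}=value_{i % 97}\n".encode() for i in range(20_000))

for block_size in (4 * 1024, 64 * 1024, 256 * 1024):
    blocks = compress_in_blocks(payload, block_size)
    total = sum(len(block) for block in blocks)
    print(f"block_size={block_size:>7}: {len(blocks):>3} blocks, {total} compressed bytes")
```

Because each block is compressed on its own, a reader only has to decompress the block holding the requested value; the cost is that very small blocks give the compressor less context to work with.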
Supported Types
- bytes: Raw bytes (default)
- string or str: UTF-8 encoded strings
- int or i64: 64-bit signed integer
- uint or u64: 64-bit unsigned integer
- i8, i16, i32: Signed integers (8, 16, 32 bit)
- u8, u16, u32: Unsigned integers (8, 16, 32 bit)
- float or f64: 64-bit float
- f32: 32-bit float
- json: JSON-serializable objects
- struct:format: Custom struct format (e.g., "struct:<IIf" for two unsigned ints and a float)
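The part after `struct:` reads like a format string for Python's standard `struct` module. Assuming that is how the library interprets it, the following sketch shows what `<IIf` means at the byte level; the variable names are illustrative only.

```python
import struct

# "<" = little-endian, "I" = unsigned 32-bit int, "f" = 32-bit float.
record_format = "<IIf"

packed = struct.pack(record_format, 100, 200, 3.14)
print(struct.calcsize(record_format))  # 12 bytes: 4 + 4 + 4
print(len(packed))                     # 12

count1, count2, ratio = struct.unpack(record_format, packed)
print(count1, count2)                  # 100 200
# The float comes back rounded to 32-bit precision, so compare with a tolerance.
print(abs(ratio - 3.14) < 1e-6)        # True
```

Fixed-width records like this take a constant number of bytes per value, which is why they tend to be more compact than the same numbers stored as strings or JSON.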
Writer Methods
writer.add_entry(key, value) # Add a key-value pair
writer.finalize() # Finalize the file (called automatically on context exit)
writer.close() # Close file handles
Reader Methods
reader[key] # Get value by key (raises KeyError if not found)
reader.get(key, default=None) # Get value with default
key in reader # Check if key exists
len(reader) # Get number of entries
reader.header # Access header metadata
reader.close() # Close file handle
File Format
Map With Tree (.mwt) files consist of:
- Magic Header: 8-byte signature (mwt\0\0\0\0\1)
- Header Offset: 8-byte pointer to compressed header
- Data Blocks: Compressed blocks of values with zstd
- B-tree Index: Tree structure for efficient key lookup and MD5 hash
The format ensures:
- Sequential writes for optimal I/O performance
- Minimal memory usage during both reading and writing
- Fast random access through the B-tree index
- Efficient compression with block-level granularity
- Data integrity verification through hashing
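The first two fields of the layout can be checked with a few lines of standard-library code. This is a minimal sketch under two assumptions that the description above does not confirm: that the header offset immediately follows the signature, and that it is stored as a little-endian unsigned 64-bit integer. `read_preamble` is a hypothetical helper, not a library function.

```python
import io
import struct

# 8-byte signature as given in the format description: "mwt", four NULs, then 0x01.
MAGIC = b"mwt\x00\x00\x00\x00\x01"


def read_preamble(stream):
    """Validate the signature and return the header offset (assumed layout)."""
    preamble = stream.read(16)
    if len(preamble) != 16:
        raise ValueError("file is too short to be a .mwt file")
    if preamble[:8] != MAGIC:
        raise ValueError("bad magic signature")
    # Assumption: little-endian unsigned 64-bit offset.
    (header_offset,) = struct.unpack("<Q", preamble[8:])
    return header_offset


# Build an in-memory stand-in for a file instead of touching the disk.
fake_file = io.BytesIO(MAGIC + struct.pack("<Q", 4096) + b"payload")
print(read_preamble(fake_file))  # 4096
```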
Examples
Large Dataset
import map_with_tree
import uuid
# Write 100,000 entries
with map_with_tree.open("large.mwt", "w", values_type="string") as writer:
    for i in range(100000):
        key = f"key_{i:06d}".encode()
        value = uuid.uuid4().hex
        writer.add_entry(key, value)
# Read and check compression
with map_with_tree.open("large.mwt") as reader:
    print(f"Entries: {len(reader)}")
    uncompressed = reader.header["uncompressed_size"]
    compressed = reader.header["compressed_size"]
    print(f"Compression ratio: {compressed / uncompressed:.2%}")
Structured Data with JSON
import map_with_tree
with map_with_tree.open("users.mwt", "w", keys_type="string", values_type="json") as writer:
    writer.add_entry("user_1", {"name": "Alice", "age": 30, "city": "NYC"})
    writer.add_entry("user_2", {"name": "Bob", "age": 25, "city": "SF"})
with map_with_tree.open("users.mwt") as reader:
    # Keys were written with keys_type="string", so look them up as strings
    user = reader["user_1"]
    print(f"{user['name']} is {user['age']} years old")
Custom Struct Types
import map_with_tree
# Values are tuples of (unsigned int, unsigned int, float)
with map_with_tree.open("metrics.mwt", "w", values_type="struct:<IIf") as writer:
    writer.add_entry(b"metric_1", (100, 200, 3.14))
    writer.add_entry(b"metric_2", (150, 250, 2.71))
with map_with_tree.open("metrics.mwt") as reader:
    count1, count2, ratio = reader[b"metric_1"]
    print(f"Counts: {count1}, {count2}, Ratio: {ratio}")
Performance Tips
- Adjust block size: Larger blocks (e.g., 256KB) improve compression but use more memory
- Tune compression level: Lower levels (1-3) for speed, higher (10-22) for size
- Choose appropriate types: Use native types (int, float) instead of strings when possible
- Batch writes: Add all entries before finalizing to ensure optimal tree structure
- Keys per node: Increase for larger datasets to reduce tree height
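The effect of keys per node on tree height can be estimated with simple arithmetic. The sketch below assumes a simplified model in which every extra level multiplies the number of addressable keys by `keys_per_node`; the library's actual node layout may differ, and `estimated_tree_levels` is a hypothetical helper.

```python
def estimated_tree_levels(entry_count, keys_per_node=128):
    """Estimate index levels needed for entry_count keys (simplified model)."""
    levels = 1
    capacity = keys_per_node
    # Each additional level multiplies the reachable key count by keys_per_node.
    while capacity < entry_count:
        capacity *= keys_per_node
        levels += 1
    return levels


print(estimated_tree_levels(100_000, keys_per_node=16))   # 5
print(estimated_tree_levels(100_000, keys_per_node=128))  # 3
print(estimated_tree_levels(100_000, keys_per_node=512))  # 2
```

Under this model, fewer levels means fewer node reads per lookup, while larger nodes mean more keys to scan or decompress within each node.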