A fast and efficient file access library for local and cloud storage
FileStag
FileStag is a high-performance, unified file access library for Python. It provides a consistent API for reading, writing, and managing files across different storage backends, including the local filesystem, ZIP archives, HTTP/HTTPS URLs, and Azure Blob Storage.
FileStag is designed to be lightweight, efficient, and developer-friendly.
Key Features
- Unified API: Use the same load, save, and copy methods regardless of the underlying storage.
- Protocol Support:
  - file:// (Local filesystem)
  - zip:// (Direct access to files inside ZIP archives)
  - http:// & https:// (Web fetching with built-in caching)
  - azure:// (Azure Blob Storage support)
- Smart Caching: Built-in web caching to speed up repeated remote file access.
- Async Support: Full async/await support for high-concurrency applications.
- Developer Friendly: Type-hinted, intuitive methods for text, JSON, and binary data.
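To make the idea of a unified API concrete, here is a minimal sketch of how a path could be routed to a backend by its prefix. It is an illustration only, not FileStag's actual dispatch logic; `BACKEND_PREFIXES` and `detect_backend` are hypothetical names, and the assumption that a path without a prefix is a local file is ours.

```python
# Hypothetical prefix table; the backend names are illustrative labels only.
BACKEND_PREFIXES = {
    "file://": "local",
    "zip://": "zip",
    "http://": "web",
    "https://": "web",
    "azure://": "azure",
}


def detect_backend(path: str) -> str:
    """Return the backend label for a path; plain paths count as local."""
    for prefix, backend in BACKEND_PREFIXES.items():
        if path.startswith(prefix):
            return backend
    return "local"


for example in (
    "hello.txt",
    "zip://my_archive.zip/data/readme.txt",
    "https://example.com/data.csv",
):
    print(example, "->", detect_backend(example))
```

A real library would hand the path to a backend-specific reader at this point; the sketch stops at the classification step.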
Installation
Install FileStag via pip or poetry:
# Standard installation
pip install filestag
# With Azure Blob Storage support
pip install "filestag[azure]"
Using Poetry:
poetry add filestag
# or
poetry add filestag -E azure
Quick Start
Basic File Operations
FileStag makes file I/O simple and consistent.
from filestag import FileStag
# Save text to a local file
FileStag.save_text("hello.txt", "Hello World!")
# Load it back
content = FileStag.load_text("hello.txt")
print(content) # "Hello World!"
# Check if it exists
if FileStag.exists("hello.txt"):
    print("File exists!")
JSON Handling
Read and write JSON with zero boilerplate.
data = {"name": "FileStag", "version": "0.1.0"}
# Save dictionary as JSON
FileStag.save_json("config.json", data, indent=4)
# Load JSON back into a dict
config = FileStag.load_json("config.json")
print(config["name"])
Working with ZIP Archives
Access files directly inside ZIP archives without manual extraction.
# Read a file directly from a ZIP archive
content = FileStag.load_text("zip://my_archive.zip/data/readme.txt")
# Check existence inside ZIP
if FileStag.exists("zip://my_archive.zip/images/logo.png"):
    print("Logo found in archive")
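For readers curious what a zip:// path stands for, the following is a minimal standard-library sketch of the same idea. It is not FileStag's implementation: it assumes the archive name ends in `.zip` and that everything after it is the member path, and `split_zip_url` and `load_text_from_zip` are hypothetical helper names.

```python
import tempfile
import zipfile
from pathlib import Path


def split_zip_url(url: str) -> tuple[str, str]:
    """Split 'zip://archive.zip/inner/path' into (archive, member)."""
    if not url.startswith("zip://"):
        raise ValueError(f"not a zip:// URL: {url}")
    rest = url[len("zip://"):]
    # Assumption: the archive part ends at the first '.zip/' occurrence.
    index = rest.find(".zip/")
    if index < 0:
        raise ValueError(f"no member path in: {url}")
    split_at = index + len(".zip")
    return rest[:split_at], rest[split_at + 1:]


def load_text_from_zip(url: str, encoding: str = "utf-8") -> str:
    """Read one member of a ZIP archive as text without extracting it."""
    archive, member = split_zip_url(url)
    with zipfile.ZipFile(archive) as zf:
        return zf.read(member).decode(encoding)


# Build a small demo archive in a temporary directory.
demo_dir = Path(tempfile.mkdtemp())
archive_path = demo_dir / "my_archive.zip"
with zipfile.ZipFile(archive_path, "w") as zf:
    zf.writestr("data/readme.txt", "Hello from inside the archive")

zip_text = load_text_from_zip(f"zip://{archive_path.as_posix()}/data/readme.txt")
print(zip_text)
```

The member is read straight from the archive into memory, so nothing is unpacked to disk.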
Web Fetching & Caching
Download files from the web easily. Enable caching to prevent redundant network requests during development or analysis.
# Fetch a file from the web
data = FileStag.load("https://example.com/data.csv")
# Fetch with caching (cached for 1 hour)
# If the file was downloaded recently, it loads from disk instead.
data = FileStag.load(
    "https://example.com/large_dataset.csv",
    max_cache_age=3600
)
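The age check behind a parameter like max_cache_age can be sketched with the standard library alone. This is an assumption about the general technique (compare the cached file's modification time with the allowed age), not a description of how FileStag stores its cache; `load_cached` and `fake_download` are hypothetical names, and the download is faked so the sketch runs offline.

```python
import tempfile
import time
from pathlib import Path
from typing import Callable


def load_cached(
    cache_file: Path,
    fetch: Callable[[], bytes],
    max_cache_age: float,
) -> bytes:
    """Return cached bytes if younger than max_cache_age seconds, else refetch."""
    if cache_file.exists():
        age = time.time() - cache_file.stat().st_mtime
        if age <= max_cache_age:
            return cache_file.read_bytes()
    data = fetch()
    cache_file.write_bytes(data)
    return data


# Stand-in for a network download; records how often it is called.
calls = []


def fake_download() -> bytes:
    calls.append(1)
    return b"id,value\n1,42\n"


cache_path = Path(tempfile.mkdtemp()) / "large_dataset.csv"
first = load_cached(cache_path, fake_download, max_cache_age=3600)
second = load_cached(cache_path, fake_download, max_cache_age=3600)
print(len(calls))  # the second call is served from disk
```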
Async Support
Building a modern async application? FileStag has you covered.
import asyncio
from filestag import FileStag
async def main():
    # Async load
    text = await FileStag.load_text_async("data.txt")
    # Async save
    await FileStag.save_text_async("output.txt", "Processed data")
    # Async web fetch
    data = await FileStag.load_async("https://api.example.com/data.json")
asyncio.run(main())
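Async I/O pays off mainly when several files are loaded at once. The sketch below shows that pattern using only the standard library (`asyncio.to_thread` and `asyncio.gather`, Python 3.9+) so that it runs without FileStag installed; `read_text_async` and `load_all` are hypothetical helpers, and with FileStag the load_text_async call shown above would presumably take the place of `read_text_async`.

```python
import asyncio
import tempfile
from pathlib import Path


async def read_text_async(path: Path) -> str:
    """Read a text file in a worker thread so the event loop stays free."""
    return await asyncio.to_thread(path.read_text, encoding="utf-8")


async def load_all(paths: list[Path]) -> list[str]:
    """Load all files concurrently; results keep the order of `paths`."""
    return await asyncio.gather(*(read_text_async(p) for p in paths))


# Create a few small demo files in a temporary directory.
demo_dir = Path(tempfile.mkdtemp())
demo_paths = []
for i in range(3):
    part = demo_dir / f"part_{i}.txt"
    part.write_text(f"chunk {i}", encoding="utf-8")
    demo_paths.append(part)

texts = asyncio.run(load_all(demo_paths))
print(texts)
```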
Azure Blob Storage
(Requires the filestag[azure] extra; see Installation.)
Access Azure Blob Storage using connection strings or SAS URLs directly in the path.
from filestag import FileSource, FileSink
# Using a connection string with container and path
conn_string = (
    "azure://DefaultEndpointsProtocol=https;"
    "AccountName=myaccount;AccountKey=mykey...;"
    "EndpointSuffix=core.windows.net/mycontainer/data"
)
# Iterate files from Azure
source = FileSource.from_source(conn_string)
for file in source:
    print(f"{file.filename}: {len(file.data)} bytes")
# Upload files to Azure
sink = FileSink.with_target(conn_string)
sink.store("output.json", b'{"result": "success"}')
# Using a SAS URL (read-only access via shared link)
sas_url = "https://myaccount.blob.core.windows.net/container?sp=r&sig=..."
source = FileSource.from_source(sas_url, search_path="data/")
print(f"Found {len(source.file_list)} files")
Local File List Caching
When an Azure container holds thousands of files, listing operations can be slow. FileStag can cache the file list locally so that subsequent runs load it from disk instead of querying Azure again:
# Cache the file list locally - huge speedup for large containers!
source = FileSource.from_source(
    conn_string,
    file_list_name="my_azure_cache.json",
)
# First run: fetches file list from Azure (may take seconds for 10k+ files)
# Subsequent runs: loads instantly from local cache file
print(f"Found {len(source.file_list)} files")
# Invalidate cache by bumping the version number:
source = FileSource.from_source(
    conn_string,
    file_list_name=("my_azure_cache.json", 2),  # Change 2 -> 3 to invalidate
)
# Or force refresh programmatically (re-fetches and updates cache):
source.refresh()
# Automatic cache validation - checks if Azure has newer files:
source = FileSource.from_source(
    conn_string,
    file_list_name="my_azure_cache.json",
    validate_cache=True,  # Auto-refreshes if files changed on Azure
)
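The version-bump invalidation described above can be illustrated with a small standard-library sketch. The on-disk layout used here (a JSON object with `version` and `files` keys) is an assumption made for the example and is not FileStag's actual cache format; `load_file_list` and `fake_listing` are hypothetical names, and the remote listing is faked.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def load_file_list(
    cache_file: Path,
    version: int,
    list_remote: Callable[[], list[str]],
) -> list[str]:
    """Return the cached listing if its version matches, else re-list and store."""
    if cache_file.exists():
        cached = json.loads(cache_file.read_text(encoding="utf-8"))
        if cached.get("version") == version:
            return cached["files"]
    files = list_remote()
    cache_file.write_text(
        json.dumps({"version": version, "files": files}), encoding="utf-8"
    )
    return files


# Stand-in for a slow remote listing; records how often it is called.
listing_calls = []


def fake_listing() -> list[str]:
    listing_calls.append(1)
    return ["data/a.json", "data/b.json"]


cache_file = Path(tempfile.mkdtemp()) / "my_azure_cache.json"
load_file_list(cache_file, 2, fake_listing)  # no cache yet: lists remotely
load_file_list(cache_file, 2, fake_listing)  # same version: served from cache
files = load_file_list(cache_file, 3, fake_listing)  # bumped version: lists again
print(len(listing_calls))
```

Storing the version next to the data means a stale cache is simply ignored and overwritten, so no separate cleanup step is needed.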
Environment Variable Placeholders
Keep credentials out of your code using {{env.VAR}} syntax:
# Credentials are substituted from environment variables
conn = "azure://...AccountName={{env.AZURE_ACCOUNT}};AccountKey={{env.AZURE_KEY}}..."
source = FileSource.from_source(conn + "/mycontainer")
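As an illustration of what such a placeholder substitution involves, here is a minimal sketch using `re` and `os.environ`. It is not FileStag's implementation; `expand_env_placeholders` is a hypothetical helper, and raising an error for an unset variable is a design choice made for this example only.

```python
import os
import re

# Matches {{env.NAME}} where NAME is a typical environment variable name.
_PLACEHOLDER = re.compile(r"\{\{env\.([A-Za-z_][A-Za-z0-9_]*)\}\}")


def expand_env_placeholders(text: str) -> str:
    """Replace every {{env.NAME}} with the value of environment variable NAME."""

    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name not in os.environ:
            raise KeyError(f"environment variable {name} is not set")
        return os.environ[name]

    return _PLACEHOLDER.sub(replace, text)


# Demo values only; real credentials would be set outside the program.
os.environ["AZURE_ACCOUNT"] = "myaccount"
os.environ["AZURE_KEY"] = "not-a-real-key"

template = "AccountName={{env.AZURE_ACCOUNT}};AccountKey={{env.AZURE_KEY}}"
expanded = expand_env_placeholders(template)
print(expanded)
```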
Contributing
Contributions are welcome!
- Clone the repository
- Install dependencies: poetry install
- Run tests: poetry run pytest
- Check types: poetry run mypy filestag
License
This project is licensed under the MIT License - see the LICENSE file for details.
File details
Details for the file filestag-0.1.0.tar.gz.
File metadata
- Download URL: filestag-0.1.0.tar.gz
- Upload date:
- Size: 51.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.1.1 CPython/3.13.3 Darwin/24.5.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9b4ce2d4f948b850240d99a043f4401444f3636325bcb67bdc56e424bed37d91 |
| MD5 | 8939d8e0411a3b2a719254e08002f865 |
| BLAKE2b-256 | 67b545af367d34a10305ff6c67243d4b0506ea58336a4e2e15731082846e5edd |
File details
Details for the file filestag-0.1.0-py3-none-any.whl.
File metadata
- Download URL: filestag-0.1.0-py3-none-any.whl
- Upload date:
- Size: 66.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.1.1 CPython/3.13.3 Darwin/24.5.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9b20b605eb19376ff05d200545607d342d45f142bb487cf2816f04619f996131 |
| MD5 | c303be2669803a94fe26cf31c336d8f7 |
| BLAKE2b-256 | 510b3c7f1cea6335335201d257c3cde5d43f8329634dd4d24e66944355bb9739 |