
Generate a gRPC API service from a function

Project description

memfs - A Python Virtual File System in Memory

A Python module that implements a virtual file system in memory. This module provides an interface compatible with the standard os module and enables operations on files and directories stored in RAM rather than on disk.

Overview

memfs is designed to provide a fast, isolated file system environment for applications that need temporary file operations without the overhead of disk I/O. It's particularly useful for testing, data processing pipelines, and applications that need to manipulate files without affecting the host system.
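For intuition, the core idea behind an in-memory file system can be sketched in a few lines: a dict maps absolute paths to file contents, and a set tracks known directories. This is purely illustrative (class and method names here are hypothetical), not memfs's actual implementation:

```python
class TinyMemFS:
    """Illustrative dict-backed file store, not the real memfs."""

    def __init__(self):
        self._files = {}    # absolute path -> str contents
        self._dirs = {'/'}  # known directories

    def makedirs(self, path, exist_ok=False):
        if path in self._dirs and not exist_ok:
            raise FileExistsError(path)
        # Register every intermediate directory along the way.
        current = ''
        for part in path.strip('/').split('/'):
            current += '/' + part
            self._dirs.add(current)

    def writefile(self, path, data):
        self._files[path] = data

    def readfile(self, path):
        return self._files[path]

    def exists(self, path):
        return path in self._files or path in self._dirs

fs = TinyMemFS()
fs.makedirs('/a/b')
fs.writefile('/a/b/x.txt', 'hello')
print(fs.readfile('/a/b/x.txt'))  # hello
```

Everything lives in ordinary Python objects, which is why such a file system is fast, isolated, and disappears when the process exits.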

Features

  • Complete in-memory file system implementation
  • API compatible with Python's standard os module
  • File and directory operations (create, read, write, delete, rename)
  • Path manipulation and traversal
  • File-like objects with context manager support
  • gRPC service generation for pipeline components
  • Encryption and compression support via extended filesystem
  • State persistence between CLI invocations
  • No disk I/O overhead
  • Isolated from the host file system

Installation

pip install memfs

Or install from source:

git clone https://github.com/pyfunc/memfs.git
cd memfs
pip install .

For development setup:

# Create a virtual environment
python3 -m venv .venv
source .venv/bin/activate  # On Linux/macOS
# .venv\Scripts\activate  # On Windows

# Install in development mode
pip install --upgrade pip
pip install setuptools wheel
pip install -e .

# Build the package
python -m build

Basic Usage Examples

Basic File Operations

from memfs import create_fs

# Create a file system instance
fs = create_fs()

# Write to a file
fs.writefile('/hello.txt', 'Hello, world!')

# Read from a file
content = fs.readfile('/hello.txt')
print(content)  # Outputs: Hello, world!

# Check if a file exists
if fs.exists('/hello.txt'):
    print('File exists!')

# Create directories
fs.makedirs('/path/to/directory')

# List directory contents
files = fs.listdir('/path/to')

Using File-Like Objects

from memfs import create_fs

fs = create_fs()

# Write using a file-like object
with fs.open('/data.txt', 'w') as f:
    f.write('Line 1\n')
    f.write('Line 2\n')

# Read using a file-like object
with fs.open('/data.txt', 'r') as f:
    for line in f:
        print(line.strip())

Directory Operations

from memfs import create_fs

fs = create_fs()

# Create nested directories
fs.makedirs('/a/b/c')

# Walk the directory tree
for root, dirs, files in fs.walk('/'):
    print(f"Directory: {root}")
    print(f"Subdirectories: {dirs}")
    print(f"Files: {files}")

Advanced Usage Examples

Data Processing Pipeline

from memfs import create_fs
import json
import csv

fs = create_fs()

# Create directories
fs.makedirs('/data/raw', exist_ok=True)
fs.makedirs('/data/processed', exist_ok=True)

# Write CSV data
with fs.open('/data/raw/input.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerows([
        ['id', 'name', 'value'],
        [1, 'Alpha', 100],
        [2, 'Beta', 200]
    ])

# Process CSV to JSON
with fs.open('/data/raw/input.csv', 'r', newline='') as f:
    reader = csv.DictReader(f)
    data = [row for row in reader]

# Transform and save the data
for item in data:
    item['value'] = int(item['value'])
    item['double_value'] = item['value'] * 2

with fs.open('/data/processed/output.json', 'w') as f:
    json.dump(data, f, indent=2)

Parallel Processing

from memfs import create_fs
import json
import concurrent.futures

fs = create_fs()
fs.makedirs('/parallel/input', exist_ok=True)
fs.makedirs('/parallel/output', exist_ok=True)

# Create input files
for i in range(10):
    fs.writefile(f'/parallel/input/file_{i}.json', json.dumps({'id': i}))

def process_file(filename):
    with fs.open(f'/parallel/input/{filename}', 'r') as f:
        data = json.loads(f.read())
    
    # Process data
    data['processed'] = True
    
    with fs.open(f'/parallel/output/processed_{filename}', 'w') as f:
        f.write(json.dumps(data, indent=2))
    
    return data['id']

# Process files in parallel
with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
    futures = {executor.submit(process_file, f): f for f in fs.listdir('/parallel/input')}
    for future in concurrent.futures.as_completed(futures):
        file_id = future.result()
        print(f"Processed file ID: {file_id}")

Encrypted and Compressed Files

from memfs.examples.custom_filesystem import create_extended_fs

# Create extended filesystem with encryption and compression
fs = create_extended_fs()

# Write to an encrypted file
with fs.open_encrypted('/secret.txt', 'w', password='mysecret') as f:
    f.write('This is sensitive information')

# Read from an encrypted file
with fs.open_encrypted('/secret.txt', 'r', password='mysecret') as f:
    content = f.read()
    print(content)

# Write to a compressed file (good for large text)
with fs.open_compressed('/compressed.txt', 'w', compression_level=9) as f:
    f.write('This content will be compressed ' * 1000)

# Compare the logical content size with the stored (compressed) size
logical_size = len('This content will be compressed ' * 1000)
compressed_size = len(fs._FS_DATA['files']['/compressed.txt'])
print(f"Compression ratio: {logical_size / compressed_size:.2f}x")
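The ratio depends entirely on how compressible the content is; the same measurement can be reproduced with the standard-library zlib module, independently of memfs:

```python
import zlib

# Highly repetitive text compresses very well.
text = ('This content will be compressed ' * 1000).encode('utf-8')
compressed = zlib.compress(text, level=9)

ratio = len(text) / len(compressed)
print(f"Compression ratio: {ratio:.2f}x")
```

For repetitive input like this the ratio is large; for already-compressed or random data it approaches (or drops below) 1x, which is why compression is optional rather than always-on.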

gRPC Service Pipeline

from memfs import create_fs
from memfs.api import DynamicgRPCComponent, PipelineOrchestrator

# Define transformation functions
def transform_data(data):
    if isinstance(data, dict):
        data['transformed'] = True
    return data

def format_data(data):
    if isinstance(data, dict):
        data['formatted'] = True
    return data

# Create virtual directories
fs = create_fs()
fs.makedirs('/proto/transform', exist_ok=True)
fs.makedirs('/proto/format', exist_ok=True)
fs.makedirs('/generated/transform', exist_ok=True)
fs.makedirs('/generated/format', exist_ok=True)

# Create components
transform_component = DynamicgRPCComponent(
    transform_data,
    proto_dir="/proto/transform",
    generated_dir="/generated/transform",
    port=50051
)

format_component = DynamicgRPCComponent(
    format_data,
    proto_dir="/proto/format",
    generated_dir="/generated/format",
    port=50052
)

# Create and execute pipeline
pipeline = PipelineOrchestrator()
pipeline.add_component(transform_component)
pipeline.add_component(format_component)

result = pipeline.execute_pipeline({"input": "data"})
print(result)  # {"input": "data", "transformed": true, "formatted": true}
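Conceptually, the orchestrator threads one payload through each component's function in order. A minimal pure-Python sketch of that control flow (the class below is illustrative, not memfs's actual `PipelineOrchestrator`, which wraps each stage in a gRPC service):

```python
class MiniPipeline:
    """Chains plain functions the way a pipeline orchestrator chains stages."""

    def __init__(self):
        self._stages = []

    def add_component(self, func):
        self._stages.append(func)
        return self  # allow chaining

    def execute_pipeline(self, data):
        # Each stage receives the previous stage's output.
        for stage in self._stages:
            data = stage(data)
        return data

def transform_data(data):
    data['transformed'] = True
    return data

def format_data(data):
    data['formatted'] = True
    return data

pipeline = MiniPipeline()
pipeline.add_component(transform_data).add_component(format_data)
print(pipeline.execute_pipeline({'input': 'data'}))
# {'input': 'data', 'transformed': True, 'formatted': True}
```

The gRPC version adds network transport and generated proto stubs around the same shape: stages are independent services, but data still flows through them sequentially.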

Command-line Interface

memfs provides a command-line interface for basic file operations. The CLI maintains state between invocations by default, storing filesystem data in ~/.memfs_state.json.
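The state file is plain JSON with `files` and `dirs` keys (the same shape shown in the reset command under CLI State Management), so it can be inspected or seeded programmatically. A sketch that round-trips such a state dict through a temporary file rather than touching the real ~/.memfs_state.json:

```python
import json
import os
import tempfile

# A memfs-style CLI state dict: file contents plus known directories.
state = {
    'files': {'/data/hello.txt': 'Hello, virtual world!'},
    'dirs': ['/', '/data'],
}

# Write it out, then reload it, as the CLI would between invocations.
with tempfile.NamedTemporaryFile('w', suffix='.json', delete=False) as f:
    json.dump(state, f)
    path = f.name

with open(path) as f:
    loaded = json.load(f)
os.unlink(path)

print(loaded['files']['/data/hello.txt'])  # Hello, virtual world!
```

Because the format is ordinary JSON, a corrupted state file can always be repaired or replaced by hand.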

Basic CLI Usage

# Initialize a new filesystem (clears any existing state)
memfs init

# Display filesystem as a tree
memfs tree /

# Create a directory with parents
memfs mkdir -p /data/subdir

# Create an empty file
memfs touch /data/hello.txt

# Write content to a file
memfs write /data/hello.txt "Hello, virtual world!"

# Read file content
memfs read /data/hello.txt

# Dump filesystem content as JSON
memfs dump

Interactive Shell Mode

For a more interactive experience, you can use the shell mode:

memfs shell

This launches an interactive shell where you can run multiple commands without restarting the CLI:

memfs> mkdir -p /data
memfs> touch /data/hello.txt
memfs> write /data/hello.txt "Hello from shell mode!"
memfs> tree /
memfs> exit

CLI State Management

The CLI stores state in ~/.memfs_state.json. If you're experiencing issues with state persistence:

# Check if state file exists
ls -la ~/.memfs_state.json

# Reset state by initializing a new filesystem
memfs init

# Or manually create an empty state file
echo '{"files": {}, "dirs": ["/"]}' > ~/.memfs_state.json

Creating a Custom CLI Command

You can create a custom script to use memfs in a single process:

#!/usr/bin/env python
from memfs import create_fs

fs = create_fs()
fs.makedirs('/data', exist_ok=True)
fs.writefile('/data/hello.txt', 'Hello, world!')

print("Filesystem contents:")
for root, dirs, files in fs.walk('/'):
    print(f"Directory: {root}")
    for d in dirs:
        print(f"  Dir: {d}")
    for f in files:
        print(f"  File: {f}")

Project Structure

memfs/
├── setup.py          # Package installation configuration
├── setup.cfg         # Setup configuration
├── README.md         # Project documentation
├── src/              # Source code
│   └── memfs/        # Main package
│       ├── __init__.py     # Basic component imports
│       ├── _version.py     # Version information
│       ├── memfs.py        # Virtual filesystem implementation
│       ├── api.py          # gRPC service generation module
│       └── cli.py          # Command-line interface
├── tests/            # Unit tests
│   ├── __init__.py
│   ├── test_memfs.py       # Tests for memfs module
│   └── test_api.py         # Tests for API module
└── examples/         # Usage examples
    ├── basic_usage.py      # Basic operations
    └── advanced_usage.py   # Advanced scenarios

API Reference

MemoryFS Class

  • open(path, mode='r') - Open a file
  • makedirs(path, exist_ok=False) - Create directories recursively
  • mkdir(path, mode=0o777) - Create a directory
  • exists(path) - Check if a path exists
  • isfile(path) - Check if a path is a file
  • isdir(path) - Check if a path is a directory
  • listdir(path) - List directory contents
  • walk(top) - Walk through directories recursively
  • remove(path) - Remove a file
  • rmdir(path) - Remove an empty directory
  • rename(src, dst) - Rename a file or directory
  • readfile(path) - Read an entire file
  • writefile(path, data) - Write data to a file
  • readfilebytes(path) - Read a file's contents as bytes
  • writefilebytes(path, data) - Write binary content to a file
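walk(top) mirrors os.walk. For intuition, here is a self-contained sketch of how such a traversal can be derived from flat path lists like those a dict-backed store keeps (illustrative only, not memfs internals):

```python
import posixpath

def walk_paths(dirs, files, top='/'):
    """Yield (root, subdirs, filenames) tuples, like os.walk, from flat path lists."""
    subdirs = sorted(d for d in dirs
                     if posixpath.dirname(d) == top and d != top)
    names = sorted(posixpath.basename(f) for f in files
                   if posixpath.dirname(f) == top)
    yield top, [posixpath.basename(d) for d in subdirs], names
    # Recurse depth-first into each immediate subdirectory.
    for d in subdirs:
        yield from walk_paths(dirs, files, d)

dirs = ['/', '/a', '/a/b']
files = ['/a/x.txt', '/a/b/y.txt']
for root, subdirnames, filenames in walk_paths(dirs, files):
    print(root, subdirnames, filenames)
```

Since every path is just a dictionary key, "walking" is a matter of grouping keys by their parent directory rather than issuing any system calls.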

Extended MemoryFS Class

  • open_encrypted(path, mode='r', password='') - Open an encrypted file
  • open_compressed(path, mode='r', compression_level=9) - Open a compressed file
  • set_metadata(path, metadata) - Set metadata for a file
  • get_metadata(path) - Get metadata for a file
  • find(pattern, start_path='/') - Find files matching a pattern
  • search_content(text, extensions=None, start_path='/') - Search for files containing text
  • backup(path, backup_dir='/backup') - Create a backup of a file or directory
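find-style glob matching over an in-memory path table can be sketched with the standard fnmatch module (again illustrative, not the memfs implementation):

```python
import fnmatch
import posixpath

def find(paths, pattern, start_path='/'):
    """Return stored paths under start_path whose basename matches pattern.

    The startswith prefix check is deliberately simplistic for this sketch.
    """
    return sorted(
        p for p in paths
        if p.startswith(start_path)
        and fnmatch.fnmatch(posixpath.basename(p), pattern)
    )

paths = ['/a/report.txt', '/a/data.csv', '/b/notes.txt']
print(find(paths, '*.txt'))  # ['/a/report.txt', '/b/notes.txt']
```

Matching against dictionary keys instead of scanning a disk is what makes pattern searches over an in-memory tree essentially free.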

API Module

  • DynamicgRPCComponent - Create a gRPC service from a function
  • PipelineOrchestrator - Orchestrate multiple components into a pipeline
  • ApiFuncConfig - Configuration for gRPC services
  • ApiFuncFramework - Framework for creating gRPC services

Use Cases

  • Unit testing - Test file operations without touching the disk
  • Data processing pipelines - Process data through multiple stages in memory
  • Microservices - Create gRPC services from Python functions
  • Sandboxed environments - Run file operations in an isolated environment
  • Performance optimization - Avoid disk I/O overhead for temporary operations
  • Secure storage - Encrypt sensitive data in memory
  • Containerized applications - Reduce container size by using in-memory storage

License

Apache-2.0


Download files

Download the file for your platform.

Source Distribution

memfs-0.1.8.tar.gz (39.0 kB)


Built Distribution


memfs-0.1.8-py3-none-any.whl (23.8 kB)


File details

Details for the file memfs-0.1.8.tar.gz.

File metadata

  • File name: memfs-0.1.8.tar.gz
  • Size: 39.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for memfs-0.1.8.tar.gz:

  • SHA256: 39b303c4aa0decfb4b37f68f7715a3b7af42be111abca193aab7fbe6c452bd41
  • MD5: 119666f778946302f00c486989f973a2
  • BLAKE2b-256: 4c154be63d147dcf04b650c0212c0cc06cef19beb17724048e1e49d31343f3fb


File details

Details for the file memfs-0.1.8-py3-none-any.whl.

File metadata

  • File name: memfs-0.1.8-py3-none-any.whl
  • Size: 23.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for memfs-0.1.8-py3-none-any.whl:

  • SHA256: 2c49d5a8ee044c97fd38920e9f4eda70fd55d0aa142628f34224de6350779a01
  • MD5: ffd310fbd13b127da3ea54dc30e13588
  • BLAKE2b-256: b2ee44b481e24c95d4b6a5dcbf0db3423017edd288e2d99953bec74cac87d94c

