A professional data cleanup management library
Evolvishub Data Cleanup Adapter
A Python library for managing data cleanup and archiving in Evolvis applications.
About
This project is developed and maintained by Evolvis.ai.
Author
Alban Maxhuni, PhD
Email: a.maxhuni@evolvis.ai
Features
- Configurable data cleanup based on file age and size thresholds
- Support for both INI and YAML configuration files
- File system monitoring with automatic cleanup
- Configurable retention policies for different file types
- Automatic backup of cleaned files
- Cleanup of old backup files
- Comprehensive logging
- Thread-safe operations
- Asynchronous operations for better performance
Installation
pip install evolvishub-data-cleanup-adapter
Usage
- Create a configuration file (INI or YAML):
# config.ini
[folders]
data_folder1 = /path/to/folder1
data_folder2 = /path/to/folder2
[thresholds]
max_size_gb = 1.0
max_age_days = 30
[backup]
directory = /path/to/backup
max_age_days = 90
[monitoring]
check_interval_seconds = 3600
[retention]
policy_log = 7
policy_temp = 1
- Use the library in your code:
import asyncio
from evolvishub_datacleanup import DataCleanupManager

async def main():
    # Initialize the manager with your config file. Creating it before the
    # try block guarantees that `manager` exists when the finally clause runs.
    manager = DataCleanupManager('config.ini')
    try:
        # Start monitoring
        await manager.start_monitoring()
        # Example of concurrent operations
        cleanup_task = asyncio.create_task(manager.cleanup_old_files())
        backup_task = asyncio.create_task(manager.cleanup_backup_files())
        # Wait for both operations to complete
        await asyncio.gather(cleanup_task, backup_task)
        # Get file information
        file_info = await manager.get_file_info()
        total_size = await manager.get_total_size()
        # ... your application code ...
    except Exception as e:
        print(f"An error occurred: {e}")
    finally:
        # Ensure monitoring is stopped
        await manager.stop_monitoring()

if __name__ == "__main__":
    asyncio.run(main())
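The try/finally pattern in the example above can also be wrapped in an async context manager, so callers cannot forget to stop monitoring. The following is a minimal sketch, not part of the library: `StubManager` and the `monitoring()` helper are hypothetical names introduced here so the example runs without a real configuration file. Only `start_monitoring()` and `stop_monitoring()` come from the documented API.

```python
import asyncio
from contextlib import asynccontextmanager


class StubManager:
    """Hypothetical stand-in for DataCleanupManager, used only in this sketch."""

    def __init__(self):
        self.monitoring = False

    async def start_monitoring(self):
        self.monitoring = True

    async def stop_monitoring(self):
        self.monitoring = False


@asynccontextmanager
async def monitoring(manager):
    """Start monitoring on entry and always stop it on exit."""
    await manager.start_monitoring()
    try:
        yield manager
    finally:
        # Runs even if the body of the `async with` block raises.
        await manager.stop_monitoring()


async def demo():
    manager = StubManager()
    async with monitoring(manager) as m:
        state_inside = m.monitoring
    return state_inside, manager.monitoring


inside, after = asyncio.run(demo())
```

With a real manager, the same helper would presumably be used as `async with monitoring(DataCleanupManager('config.ini')) as manager:`.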
Async Usage Guide
Basic Async Operations
All operations in the library are asynchronous and should be used with await:
# Single operation
await manager.cleanup_old_files()
# Multiple sequential operations
await manager.cleanup_old_files()
await manager.cleanup_backup_files()
Concurrent Operations
You can run multiple operations concurrently using asyncio.gather():
# Run multiple operations concurrently
await asyncio.gather(
    manager.cleanup_old_files(),
    manager.cleanup_backup_files(),
    manager.get_file_info(),
)
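As a small illustration of what `asyncio.gather()` returns, the sketch below uses a hypothetical `StubManager` whose methods only sleep and return placeholder values. The real return types of the library's methods are not documented in this README, so the values here are assumptions. The point is that results come back in the order the awaitables were passed, regardless of which one finishes first.

```python
import asyncio


class StubManager:
    """Hypothetical stand-in; the return values are placeholders."""

    async def cleanup_old_files(self):
        await asyncio.sleep(0.02)  # finishes last
        return "cleanup done"

    async def cleanup_backup_files(self):
        await asyncio.sleep(0.01)
        return "backup cleanup done"

    async def get_file_info(self):
        return {"files": 0}  # finishes first


async def demo():
    manager = StubManager()
    # gather() preserves argument order in its result list.
    return await asyncio.gather(
        manager.cleanup_old_files(),
        manager.cleanup_backup_files(),
        manager.get_file_info(),
    )


results = asyncio.run(demo())
```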
Error Handling
Always wrap async operations in try-except blocks:
try:
    await manager.start_monitoring()
except Exception as e:
    print(f"Failed to start monitoring: {e}")
Best Practices
- Resource Management: Always stop monitoring in a finally block so that it is released even when an error occurs:
    try:
        await manager.start_monitoring()
        # ... your code ...
    finally:
        await manager.stop_monitoring()
- Concurrent Operations: Use asyncio.gather() for independent operations:
    results = await asyncio.gather(
        manager.get_total_size(),
        manager.get_file_info(),
        return_exceptions=True,
    )
- Cancellation: Handle timeouts and task cancellation gracefully (asyncio.timeout() requires Python 3.11 or later):
    try:
        async with asyncio.timeout(30):  # 30 second timeout
            await manager.cleanup_old_files()
    except asyncio.TimeoutError:
        print("Operation timed out")
- Event Loop: Use asyncio.run() as the main entry point:
    if __name__ == "__main__":
        asyncio.run(main())
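When `return_exceptions=True` is passed to `asyncio.gather()`, as in the list above, failures are returned as exception objects inside the result list instead of being raised, so the caller has to separate them from successful values. A minimal sketch follows, assuming two hypothetical coroutines that stand in for `get_total_size()` and `get_file_info()`. Their return value and the `OSError` are invented for illustration.

```python
import asyncio


async def get_total_size():
    # Hypothetical stand-in that succeeds.
    return 1024


async def get_file_info():
    # Hypothetical stand-in that fails.
    raise OSError("folder not readable")


async def demo():
    results = await asyncio.gather(
        get_total_size(),
        get_file_info(),
        return_exceptions=True,
    )
    # Exceptions are returned in place; split them from normal results.
    values = [r for r in results if not isinstance(r, BaseException)]
    errors = [r for r in results if isinstance(r, BaseException)]
    return values, errors


values, errors = asyncio.run(demo())
```

Checking against `BaseException` rather than `Exception` also catches `asyncio.CancelledError`, which does not derive from `Exception` on current Python versions.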
Configuration
INI Format
[folders]
folder1 = /path/to/folder1
folder2 = /path/to/folder2
[thresholds]
max_size_gb = 1.0
max_age_days = 30
[backup]
directory = /path/to/backup
max_age_days = 90
[monitoring]
check_interval_seconds = 3600
[retention]
policy_log = 7
policy_temp = 1
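To show how the values in such a file come out when parsed, here is a sketch using the standard library's `configparser`. It illustrates the INI format only. How the library itself reads the file is not described in this README, and the variable names below are assumptions.

```python
import configparser

# A shortened copy of the INI example, embedded so the sketch is self-contained.
INI_TEXT = """
[folders]
folder1 = /path/to/folder1
folder2 = /path/to/folder2

[thresholds]
max_size_gb = 1.0
max_age_days = 30

[retention]
policy_log = 7
policy_temp = 1
"""

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)

# Every value in [folders] is treated as a folder path.
folders = list(parser["folders"].values())

# INI values are strings; the typed getters convert them.
max_size_gb = parser.getfloat("thresholds", "max_size_gb")
max_age_days = parser.getint("thresholds", "max_age_days")
retention = {key: parser.getint("retention", key) for key in parser["retention"]}
```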
YAML Format
data_folders:
- /path/to/folder1
- /path/to/folder2
cleanup_thresholds:
max_size_gb: 1.0
max_age_days: 30
backup_settings:
directory: /path/to/backup
max_backup_age_days: 90
monitoring_settings:
check_interval: 3600
retention_policies:
.log:
max_age_days: 7
.tmp:
max_age_days: 1
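One plausible reading of `retention_policies` is that a per-extension `max_age_days` overrides the global threshold for files with that extension. The sketch below encodes that reading. The fallback to `cleanup_thresholds.max_age_days` and the helper names `max_age_for()` and `is_expired()` are assumptions made for illustration, not documented library behavior.

```python
from pathlib import Path

# Mirrors the retention_policies mapping from the YAML example.
RETENTION_POLICIES = {
    ".log": {"max_age_days": 7},
    ".tmp": {"max_age_days": 1},
}
# Mirrors cleanup_thresholds.max_age_days; used when no policy matches (assumed).
DEFAULT_MAX_AGE_DAYS = 30


def max_age_for(path):
    """Return the retention period in days for a file, based on its extension."""
    policy = RETENTION_POLICIES.get(Path(path).suffix)
    return policy["max_age_days"] if policy else DEFAULT_MAX_AGE_DAYS


def is_expired(path, age_days):
    """A file counts as expired once it is older than its retention period."""
    return age_days > max_age_for(path)
```

Under these assumptions, `is_expired("app.log", 8)` is true while `is_expired("data.csv", 8)` is not, because the CSV file falls back to the 30-day default.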
API Reference
DataCleanupManager
The main class for managing data cleanup operations.
DataCleanupManager(config_path: Union[str, Path])
Methods
- async start_monitoring(): Start monitoring data folders
- async stop_monitoring(): Stop monitoring
- async cleanup_old_files(): Manually trigger cleanup
- async cleanup_backup_files(): Clean up old backup files
- async get_total_size(): Get total size of monitored folders
- async get_file_info(): Get information about all files
Development
- Clone the repository:
git clone https://github.com/yourusername/evolvishub-datacleanup.git
cd evolvishub-datacleanup
- Install development dependencies:
pip install -e ".[dev]"
- Run tests:
pytest
Contributing
- Fork the repository
- Create a feature branch
- Commit your changes
- Push to the branch
- Create a Pull Request
License
This project is licensed under the MIT License - see the LICENSE file for details.