
An automated daemon for file management/extraction and file behavior analysis.


Filefly


A real-time file automation daemon written in Python that monitors, classifies, and organizes files as they are downloaded. Filefly combines rule-based routing with adaptive behavior to safely process a wide range of file types, and it is designed to handle real-world edge cases such as incomplete downloads, race conditions, and conflicting file operations.

📓 Development logs documenting the full build process are available in the logs directory, under the devlogs subfolder.

Features

Core Functionality

  • Real-time monitoring of multiple directories
  • Rule-based file routing by extension
  • Automatic archive extraction (ZIP, TAR, 7z)

Reliability

  • Safe-move logic to prevent overwrites
  • Download stabilization to avoid partial file handling
  • Detection of manual file moves and deletions
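
As an illustration of the safe-move idea, here is a minimal sketch of a move that never overwrites an existing file. The function name safe_move and the " (1)" renaming scheme are assumptions made for this example, not necessarily what Filefly does internally.

```python
import shutil
import tempfile
from pathlib import Path


def safe_move(src: Path, dest_dir: Path) -> Path:
    """Move src into dest_dir, picking a free name instead of overwriting.

    Hypothetical helper: if the name is taken, " (1)", " (2)", ... is
    inserted before the file suffix until an unused name is found.
    """
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / src.name
    counter = 1
    while target.exists():
        target = dest_dir / f"{src.stem} ({counter}){src.suffix}"
        counter += 1
    shutil.move(str(src), str(target))
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        dest = root / "PDFs"
        for text in ("first", "second"):
            incoming = root / "report.pdf"
            incoming.write_text(text)
            # Prints "report.pdf", then "report (1).pdf"
            print(safe_move(incoming, dest).name)
```

Note that the exists-then-move sequence is not atomic, so a real daemon with concurrent workers would likely need extra locking around it.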

Observability

  • Real-time status file (runtime_status.json)
  • Persistent logging via filefly.log (stored in logs/runtime)
  • Lightweight web dashboard (Flask)

Screenshots

Terminal

Terminal screenshot

Web Dashboard

Dashboard screenshot

Installation

Lightweight Installation (Core Daemon)

Installs only the core file automation system:

pip install filefly-files

Includes:

  • Real-time file monitoring
  • Rule-based routing
  • Archive extraction
  • Logging and dashboard

Full Installation (With Analysis Tools)

Installs Filefly with additional data analysis capabilities:

pip install "filefly-files[analysis]"

Includes everything in the core version, plus:

  • File activity analytics
  • Data visualization (matplotlib)
  • Statistical analysis tools (pandas)
  • Telemetry-based insights

Choosing an Installation Mode

  • Use lightweight if you only want automated file organization
  • Use full installation if you want to analyze file behavior and trends over time

All Python dependencies will install automatically.

Running Filefly

Option 1: Run the daemon from the terminal

Installing the package also installs a filefly command-line entry point:

filefly

This launches the background daemon and begins monitoring the configured folders.

Option 2: Run using Python directly

Equivalent to the command above:

python -m filefly

Configuration

On first run, Filefly generates a configuration file:

  • Linux/macOS: ~/.config/filefly/config.json
  • Local project: filefly/config.json

Configuration Structure

config.json

{
    "watch_folders": ["~/Downloads"],
    "extensions": {
        ".zip": "~/Documents/Archives",
        ".pdf": "~/Documents/PDFs"
    },
    "temp_extensions": [".crdownload", ".part", ".tmp"]
}

Users can edit this file to customize behavior through three parameters:

  • watch_folders: directories to monitor
  • extensions: mapping from file extension to destination folder
  • temp_extensions: temporary download extensions to ignore

Waiting for temporary files to stabilize before processing them prevents race conditions during downloads.
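
One common way to implement this kind of stabilization is to poll the file size until it stops changing. The sketch below assumes that approach; the names is_temp_file and wait_until_stable, and the default timings, are hypothetical and not taken from Filefly's source.

```python
import os
import time
from pathlib import Path

# Same values as temp_extensions in the sample config above.
TEMP_EXTENSIONS = {".crdownload", ".part", ".tmp"}


def is_temp_file(path: Path) -> bool:
    """True if the file still carries a temporary download extension."""
    return path.suffix.lower() in TEMP_EXTENSIONS


def wait_until_stable(path, checks=3, interval=0.5, timeout=60.0) -> bool:
    """Return True once the size was unchanged for `checks` polls in a row.

    Returns False if the file disappears or the timeout expires first.
    """
    deadline = time.monotonic() + timeout
    last_size = -1
    stable = 0
    while time.monotonic() < deadline:
        try:
            size = os.path.getsize(path)
        except OSError:
            return False  # file was renamed, moved, or deleted meanwhile
        if size == last_size:
            stable += 1
            if stable >= checks:
                return True
        else:
            stable = 0
            last_size = size
        time.sleep(interval)
    return False
```

A caller would typically skip any path for which is_temp_file is true and only route a file after wait_until_stable returns True.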

For local setups, config.json typically lives right next to main.py.

Files are processed according to the rules defined in config.json. If a file's extension is recognized, the file is routed to the configured destination; otherwise, it is ignored. Filefly is designed to handle unexpected errors and edge cases gracefully, and it records all events and errors in filefly.log (in the runtime subfolder of the logs folder) for traceability and debugging. These logged events are also what the system uses to adapt to certain file behaviors over time.

Extension routing is a cooperation between main.py and config.json. When main.py detects a newly downloaded file, it looks up the file's extension in the extensions dictionary from config.json. If the extension is not listed, the file is skipped; if it is, main.py moves the file to the folder mapped to that extension.
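
The lookup described above amounts to a dictionary access keyed by file extension. A minimal sketch follows; resolve_destination is a hypothetical name, and the case-insensitive match is an assumption of this example rather than documented Filefly behavior.

```python
from pathlib import Path
from typing import Optional


def resolve_destination(path, extensions: dict) -> Optional[Path]:
    """Return the destination folder for path, or None if it is unrouted.

    `extensions` is the extension-to-folder mapping from config.json.
    Assumption: extensions are compared in lowercase.
    """
    suffix = Path(path).suffix.lower()
    target = extensions.get(suffix)
    if target is None:
        return None  # unrecognized extension: the file is skipped
    return Path(target).expanduser()  # expand "~" to the home directory


if __name__ == "__main__":
    rules = {".zip": "~/Documents/Archives", ".pdf": "~/Documents/PDFs"}
    print(resolve_destination("invoice.PDF", rules))
    print(resolve_destination("notes.txt", rules))  # None
```

Because Path.suffix returns only the last suffix, a name such as backup.tar.gz is looked up under ".gz" in this sketch.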


Accidentally deleted your config file? Don't worry: if Filefly starts and finds no config file, it automatically generates a new one.
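
The regenerate-on-missing behavior can be pictured roughly as follows. This is a sketch under assumptions: load_or_create_config and DEFAULT_CONFIG are hypothetical names, and the defaults simply mirror the sample config shown earlier.

```python
import json
import tempfile
from pathlib import Path

# Assumed defaults, copied from the sample config.json above.
DEFAULT_CONFIG = {
    "watch_folders": ["~/Downloads"],
    "extensions": {
        ".zip": "~/Documents/Archives",
        ".pdf": "~/Documents/PDFs",
    },
    "temp_extensions": [".crdownload", ".part", ".tmp"],
}


def load_or_create_config(config_path: Path) -> dict:
    """Load config.json, writing the defaults first if the file is missing."""
    if not config_path.exists():
        config_path.parent.mkdir(parents=True, exist_ok=True)
        config_path.write_text(json.dumps(DEFAULT_CONFIG, indent=4), encoding="utf-8")
    return json.loads(config_path.read_text(encoding="utf-8"))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cfg = load_or_create_config(Path(tmp) / "filefly" / "config.json")
        print(cfg["watch_folders"])
```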

Optional: Start Filefly at login (OS-specific)

macOS

brew services restart filefly

(or a custom plist file)

Linux (systemd)

systemctl --user enable filefly
systemctl --user start filefly

Windows

Create a Task Scheduler task that runs:

python -m filefly

Verify installation

To confirm that Filefly is installed correctly:

python -c "import filefly; print(filefly.__version__)"

Upgrading

pip install --upgrade filefly-files

Uninstalling

pip uninstall filefly-files

How It Works

  1. Filefly monitors configured directories
  2. When a file appears:
    • waits for the file to stabilize
    • determines its extension
    • checks routing rules
  3. If matched:
    • moves file to destination
    • extracts archives if needed
  4. Logs all actions for traceability
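
The "extracts archives if needed" step can be sketched with the standard library alone. The helper name extract_if_archive is hypothetical, and this is not necessarily how Filefly extracts archives. shutil.unpack_archive handles ZIP and TAR variants; 7z is not supported by the standard library and would need a third-party package, so it is left out here.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def extract_if_archive(path: Path, dest_dir: Path) -> bool:
    """Extract path into dest_dir if shutil recognizes its format.

    Returns False for non-archives or unreadable archives.
    """
    try:
        shutil.unpack_archive(str(path), str(dest_dir))
    except shutil.ReadError:
        return False  # unknown extension or corrupt archive
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        archive = root / "bundle.zip"
        with zipfile.ZipFile(archive, "w") as zf:
            zf.writestr("hello.txt", "hi")
        print(extract_if_archive(archive, root / "bundle"))
        print((root / "bundle" / "hello.txt").read_text())
```

Archives from untrusted sources can contain paths that escape the destination folder, so a production version should validate member paths before extracting.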

Future Work

Filefly is currently focused on reliable, rule-based file automation. Future development will expand its adaptability, performance, and analytical capabilities.

Smarter Classification

  • Move beyond extension-based routing toward content-aware handling
  • Explore heuristic and pattern-based classification for ambiguous files
  • Investigate lightweight learning approaches for recurring user behaviors

Behavioral Analysis & Insights

  • Store structured file event data using SQLite for long-term analysis
  • Analyze trends in file types, sizes, and user interactions over time
  • Visualize file activity patterns using matplotlib (e.g., frequency, distribution, growth)
  • Use collected data to inform smarter routing and automation decisions
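
To make the SQLite idea concrete, here is one possible shape for an event store. The table name, columns, and function names are purely illustrative assumptions; they do not describe the schema of the existing filefly.db.

```python
import sqlite3
import time


def open_event_store(db_path=":memory:"):
    """Open (or create) a database with a hypothetical file_events table."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS file_events (
               id INTEGER PRIMARY KEY,
               ts REAL NOT NULL,
               path TEXT NOT NULL,
               extension TEXT NOT NULL,
               size_bytes INTEGER NOT NULL,
               action TEXT NOT NULL
           )"""
    )
    return conn


def record_event(conn, path, extension, size_bytes, action):
    """Insert one event; the with-block commits on success."""
    with conn:
        conn.execute(
            "INSERT INTO file_events (ts, path, extension, size_bytes, action) "
            "VALUES (?, ?, ?, ?, ?)",
            (time.time(), path, extension, size_bytes, action),
        )


def counts_by_extension(conn):
    """Return (extension, count) rows, sorted by extension."""
    query = (
        "SELECT extension, COUNT(*) FROM file_events "
        "GROUP BY extension ORDER BY extension"
    )
    return conn.execute(query).fetchall()
```

Rows stored this way could later be loaded into pandas or plotted with matplotlib for the trend analysis described above.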

Performance & Scalability

  • Optimize handling of large directories with high file throughput
  • Improve event processing efficiency and reduce redundant operations
  • Introduce batching or prioritization strategies for heavy workloads

Reliability Improvements

  • Strengthen handling of edge cases such as interrupted writes and rapid file changes
  • Expand safeguards around file conflicts and concurrent operations
  • Enhance logging for deeper debugging and traceability

Dashboard & Observability

  • Expand the web dashboard with richer file event insights
  • Add filtering, search, and historical views for logs
  • Integrate visual analytics directly into the dashboard interface

Extensibility

  • Introduce a plugin or rule-extension system for custom behaviors
  • Allow users to define more advanced routing logic beyond extensions

Project Structure

filefly/
├── assets/
│   ├── filefly_processing_vs_size.png
│   ├── filefly_terminal.png
│   └── filefly_web_dashboard.png
├── logs/
│   ├── devlogs/
│   │   ├── v0.2.0-alpha.md
│   │   ├── week1.md
│   │   ├── week2.md
│   │   ├── week3.md
│   │   ├── week4.md
│   │   └── week5.md
│   └── runtime/
│       └── filefly.log
├── src/
│   └── filefly/
│       ├── static/
│       │   ├── script.js
│       │   └── styles.css
│       ├── templates/
│       │   └── index.html
│       ├── __init__.py
│       ├── __main__.py
│       ├── app.py
│       ├── cli.py
│       ├── config.json
│       ├── filefly.db
│       ├── inspect_data.py
│       ├── logging_config.py
│       ├── main.py
│       ├── reporter.py
│       ├── runtime_status.json
│       ├── storage.py
│       └── telemetry.py
├── .gitignore
├── LICENSE
├── MANIFEST.in
├── pyproject.toml
├── README.md
└── requirements.txt

Contributing and Development

Want to contribute to Filefly? Clone the repository, install the dependencies, and run the daemon locally to test your changes:

git clone https://github.com/YodaheWondimu/Filefly.git
cd Filefly
pip install -r requirements.txt
python -m filefly

License

Filefly is released under the MIT License. See LICENSE for more details.

