A tool to automatically classify files based on extension or timestamp

File Classifier

A powerful, cross-platform Python utility for organizing files and folders automatically based on extensions, folder names, or timestamps.

Features

  • Multiple Classification Methods:

    • Extension-based: Classifies files into categories based on their extensions
    • Time-based: Organizes files by creation, modification, or access time
    • Folder classification: Categorizes folders based on common naming patterns
  • File Categories:

    • Documents (PDF, DOC, TXT, etc.)
    • Images (JPG, PNG, GIF, etc.)
    • Audio (MP3, WAV, FLAC, etc.)
    • Videos (MP4, AVI, MKV, etc.)
    • Archives (ZIP, RAR, 7Z, etc.)
    • Code (PY, JAVA, HTML, etc.)
    • Executables (EXE, MSI, APP, etc.)
    • Others (for extensions not in the above categories)
  • Folder Categories:

    • Projects: For development and project-related folders
    • Backups: For backup and archived content
    • Documents: For document and report folders
    • Media: For photos, videos, music folders
    • Downloads: For download folders
    • Applications: For software and apps
    • Data: For datasets and databases
    • Web: For web-related content
    • Dated: For folders with date patterns (auto-detected)
    • Versioned: For folders with version patterns (auto-detected)
    • Uncategorized: For folders that don't match any patterns
  • Operation Modes:

    • Move (default): Moves files/folders to target directories
    • Copy: Creates copies instead of moving
    • Symlink: Creates symbolic links to original files/folders
    • Dry-run: Shows what would happen without making changes
  • Cross-Platform Compatibility:

    • Works on Windows, macOS, and Linux
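
The Dated and Versioned folder categories are described above as auto-detected. The sketch below shows one way such detection could work with regular expressions. The patterns, the keyword table, and the `classify_folder_name` helper are illustrative assumptions, not the tool's actual rules.

```python
import re

# Hypothetical patterns; the tool's real detection rules may differ.
DATE_PATTERN = re.compile(r"\d{4}[-_]\d{2}(?:[-_]\d{2})?")  # e.g. 2024-01 or 2024_01_15
VERSION_PATTERN = re.compile(r"(?:^|[\W_])v\d+(?:\.\d+)*", re.IGNORECASE)  # e.g. v2 or _v1.0.3

# Hypothetical keyword hints for a few of the named categories.
KEYWORDS = {
    "Backups": ("backup", "bak"),
    "Downloads": ("download",),
    "Projects": ("project",),
}


def classify_folder_name(name: str) -> str:
    """Return a category for a folder name, or 'Uncategorized'."""
    lowered = name.lower()
    # Keyword categories are checked first, then the auto-detected patterns.
    for category, words in KEYWORDS.items():
        if any(word in lowered for word in words):
            return category
    if DATE_PATTERN.search(name):
        return "Dated"
    if VERSION_PATTERN.search(name):
        return "Versioned"
    return "Uncategorized"
```

Checking keywords before patterns is a design choice in this sketch: a folder such as `backup_2024-01` lands in Backups rather than Dated.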

Requirements

  • Python 3.6 or higher
  • No additional libraries required (uses standard library only)

Installation

Download the script and make it executable:

chmod +x file_classifier.py

Or run it directly with Python:

python file_classifier.py [options]

Usage

Basic Usage

python file_classifier.py SOURCE_DIR [TARGET_DIR]

If TARGET_DIR is not specified, files will be organized in a new directory called ./classified.

Common Options

-l, --symlinks       Create symlinks instead of moving files
-c, --copy           Copy files instead of moving them
-d, --dry-run        Show what would be done without actually doing it
-f, --folders        Include folders in the classification

Classification Methods

-e, --extensions     Classify by file extensions (default behavior)
-t, --time           Organize by time attribute

Time-based Organization Options

--time-attr {modified,created,accessed}
                     Time attribute to use (default: modified)
--time-format FORMAT
                     Time format for directories (default: '%Y-%m' for year-month)
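
The following sketch shows how a time attribute and a format string could combine into a directory name. The `time_bucket` helper is hypothetical, and the mapping from `created` to a stat field is an assumption: a true creation time is not available on every platform.

```python
from datetime import datetime
from pathlib import Path


def time_bucket(path: Path, time_attr: str = "modified", time_format: str = "%Y-%m") -> str:
    """Return the directory name a file would be grouped under."""
    st = path.stat()
    if time_attr == "modified":
        timestamp = st.st_mtime
    elif time_attr == "accessed":
        timestamp = st.st_atime
    elif time_attr == "created":
        # st_birthtime exists only on some platforms (macOS, for example).
        # Elsewhere fall back to st_ctime, which on Linux is the time of the
        # last metadata change, not the creation time.
        timestamp = getattr(st, "st_birthtime", st.st_ctime)
    else:
        raise ValueError(f"unknown time attribute: {time_attr}")
    return datetime.fromtimestamp(timestamp).strftime(time_format)
```

With the default format, a file last modified in June 2021 would be grouped under `2021-06`.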

Collision Handling Options

--on-conflict {skip,rename,overwrite}
                     How to handle file conflicts (default: skip)
                     - skip: Skip files that already exist
                     - rename: Rename new files (file.txt → file_1.txt)
                     - overwrite: Replace existing files
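
The three conflict strategies can be sketched as a small function that decides where a file should be written. `resolve_conflict` is a hypothetical helper; continuing the counter past `_1` (to `_2`, `_3`, and so on) is an assumption extrapolated from the `file.txt → file_1.txt` example above.

```python
from pathlib import Path
from typing import Optional


def resolve_conflict(dst: Path, on_conflict: str = "skip") -> Optional[Path]:
    """Return the path to write to, or None when the file should be skipped."""
    if not dst.exists():
        return dst  # no conflict, use the destination as-is
    if on_conflict == "skip":
        return None
    if on_conflict == "overwrite":
        return dst
    if on_conflict == "rename":
        counter = 1
        while True:
            # file.txt -> file_1.txt, file_2.txt, ... until a free name is found
            candidate = dst.with_name(f"{dst.stem}_{counter}{dst.suffix}")
            if not candidate.exists():
                return candidate
            counter += 1
    raise ValueError(f"unknown conflict mode: {on_conflict}")
```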

Examples

See the examples/ directory for detailed, runnable examples:

  1. Basic Organization - Classify files by extension
  2. Time-Based - Organize by modification date
  3. Copy Mode - Copy instead of move
  4. Dry Run & Conflicts - Preview and handle duplicates

Quick Examples

Extension-based Organization

# Classify all files in Downloads folder by extension
python file_classifier.py ~/Downloads ~/Organized

# Classify files and folders, create copies instead of moving
python file_classifier.py ~/Documents ~/Organized -f -c

# Create symlinks instead of moving files
python file_classifier.py ~/Pictures ~/Organized -l

# Preview what would happen without making any changes
python file_classifier.py ~/Desktop -d

Time-based Organization

# Organize files by their modification time (year-month)
python file_classifier.py ~/Documents ~/TimeOrganized -t

# Organize by creation date with year-month-day format
python file_classifier.py ~/Photos ~/Chronological -t --time-attr created --time-format "%Y-%m-%d"

# Organize files and folders by access time
python file_classifier.py ~/Downloads ~/AccessOrganized -t --time-attr accessed -f

Troubleshooting

Issue: Files are being skipped

Cause: Files with the same name already exist in the target directory, and the default conflict mode is skip.

Solution: Check the logs for "already exists" warnings. To keep both files or replace the existing one, pass --on-conflict rename or --on-conflict overwrite.

Issue: Want to preview changes first

Solution: Use the --dry-run (-d) flag to see what would happen without making changes:

python file_classifier.py ~/Downloads ~/Organized --dry-run

Issue: Permission denied errors

Cause: Insufficient permissions to read source or write to target directory

Solution:

  • Check directory permissions
  • Run with appropriate user privileges
  • Ensure target directory is writable

Issue: Symlinks not working on Windows

Cause: On Windows, creating symlinks typically requires administrator privileges, or Developer Mode on Windows 10 and later

Solution: Run the terminal as administrator, or use --copy mode instead

FAQ

Q: What happens if a file or folder already exists in the target directory? A: By default, the script skips it and logs a warning, and the item stays in the source directory. For files, --on-conflict rename or --on-conflict overwrite changes this behavior.

Q: Can I undo the organization? A: Currently, there's no built-in undo feature. We recommend:

  • Use --dry-run first to preview changes
  • Use --copy mode to keep original files
  • Keep backups of important directories

Q: Will the organization preserve the directory structure? A: No. All files are flattened into their category directories. For a hierarchical layout, consider time-based organization with a nested format such as %Y/%m/%d.
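
To make the difference between a flat and a nested time format concrete, the sketch below builds the target path for both. `dated_target` is a hypothetical helper, and treating each `/` in the format as a directory level is an assumption about how the tool interprets it.

```python
from datetime import datetime
from pathlib import Path


def dated_target(target_root: Path, filename: str, when: datetime, time_format: str) -> Path:
    """Build the destination path for a file dated `when`."""
    # Each "/" in the formatted string becomes its own directory level.
    parts = when.strftime(time_format).split("/")
    return target_root.joinpath(*parts, filename)


when = datetime(2024, 3, 9)
print(dated_target(Path("Organized"), "photo.jpg", when, "%Y-%m").as_posix())
# Organized/2024-03/photo.jpg
print(dated_target(Path("Organized"), "photo.jpg", when, "%Y/%m/%d").as_posix())
# Organized/2024/03/09/photo.jpg
```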

Q: Can I customize file categories? A: Yes, you can edit the FILE_CATEGORIES dictionary in the script to add or modify categories. Configuration file support is planned for future releases.
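
As a rough illustration of such a customization, the sketch below assumes FILE_CATEGORIES maps category names to lists of lowercase extensions. That shape, the sample entries, and the `category_for` helper are assumptions; check the dictionary in the script before editing it.

```python
from pathlib import Path

# Assumed shape: category name -> list of lowercase extensions (with the dot).
FILE_CATEGORIES = {
    "Documents": [".pdf", ".doc", ".txt"],
    "Images": [".jpg", ".png", ".gif"],
    "Archives": [".zip", ".rar", ".7z"],
}

# Customization: add a new category and extend an existing one.
FILE_CATEGORIES["Ebooks"] = [".epub", ".mobi"]
FILE_CATEGORIES["Images"].append(".webp")


def category_for(filename: str) -> str:
    """Look up the category for a file name, falling back to 'Others'."""
    suffix = Path(filename).suffix.lower()
    for category, extensions in FILE_CATEGORIES.items():
        if suffix in extensions:
            return category
    return "Others"
```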

Q: Does it work on Windows? A: Yes! The tool uses Python's pathlib for cross-platform compatibility and works on Windows, macOS, and Linux.

Q: Can I organize files in subdirectories? A: Currently, only files in the top level of the source directory are processed. Recursive mode is planned for future releases.
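
The top-level-only behavior can be pictured with `pathlib`: `iterdir()` lists the direct children of a directory and does not descend into subdirectories. The `top_level_entries` helper and its `include_folders` parameter (mirroring the -f option) are hypothetical.

```python
from pathlib import Path
from typing import List


def top_level_entries(source: Path, include_folders: bool = False) -> List[Path]:
    """List direct children of source; nested files are never visited."""
    entries = []
    for entry in source.iterdir():
        if entry.is_file() or (include_folders and entry.is_dir()):
            entries.append(entry)
    return sorted(entries)
```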

Q: How do I install the package? A: Install via pip:

pip install fl-classifier

Then run with:

fl-classifier ~/Downloads ~/Organized
# or
python -m fl_classifier ~/Downloads ~/Organized

License

This utility is released under the MIT License. Feel free to use, modify, and distribute it.
