A tool to automatically classify files based on extension or timestamp
File Classifier
A powerful, cross-platform Python utility for organizing files and folders automatically based on extensions, folder names, or timestamps.
Features
Multiple Classification Methods:
- Extension-based: Classifies files into categories based on their extensions
- Time-based: Organizes files by creation, modification, or access time
- Folder classification: Categorizes folders based on common naming patterns
File Categories:
- Documents (PDF, DOC, TXT, etc.)
- Images (JPG, PNG, GIF, etc.)
- Audio (MP3, WAV, FLAC, etc.)
- Videos (MP4, AVI, MKV, etc.)
- Archives (ZIP, RAR, 7Z, etc.)
- Code (PY, JAVA, HTML, etc.)
- Executables (EXE, MSI, APP, etc.)
- Others (for extensions not in the above categories)
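As a rough illustration of extension-based classification, the sketch below maps a file's suffix to one of the categories listed above. The dictionary layout and the `category_for` helper are assumptions made for this example; the script's actual `FILE_CATEGORIES` may be structured differently and covers more extensions.

```python
from pathlib import Path

# Hypothetical layout: category name -> set of lowercase extensions.
# Only the extensions named in the list above are included here.
FILE_CATEGORIES = {
    "Documents": {".pdf", ".doc", ".txt"},
    "Images": {".jpg", ".png", ".gif"},
    "Audio": {".mp3", ".wav", ".flac"},
    "Videos": {".mp4", ".avi", ".mkv"},
    "Archives": {".zip", ".rar", ".7z"},
    "Code": {".py", ".java", ".html"},
    "Executables": {".exe", ".msi", ".app"},
}


def category_for(path):
    """Return the category for a file path, or 'Others' if no extension matches."""
    ext = Path(path).suffix.lower()  # case-insensitive match on the last suffix
    for category, extensions in FILE_CATEGORIES.items():
        if ext in extensions:
            return category
    return "Others"
```

With this mapping, `category_for("photo.JPG")` returns `"Images"`, and a file with no extension falls through to `"Others"`.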
Folder Categories:
- Projects: For development and project-related folders
- Backups: For backup and archived content
- Documents: For document and report folders
- Media: For photos, videos, music folders
- Downloads: For download folders
- Applications: For software and apps
- Data: For datasets and databases
- Web: For web-related content
- Dated: For folders with date patterns (auto-detected)
- Versioned: For folders with version patterns (auto-detected)
- Uncategorized: For folders that don't match any patterns
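One plausible way to implement folder classification is keyword matching on the folder name, with regular expressions for the auto-detected date and version patterns. The keywords, the two patterns, and the order of checks below are all assumptions chosen for illustration, not the tool's actual rules.

```python
import re

# Hypothetical patterns: "2023-05" or "2023_05_14" style dates, "v1" or "v1.2.3" versions.
DATE_PATTERN = re.compile(r"\d{4}[-_.]\d{2}(?:[-_.]\d{2})?")
VERSION_PATTERN = re.compile(r"(?:^|[-_ ])v\d+(?:\.\d+)*", re.IGNORECASE)

# Hypothetical keyword lists, one per named category.
FOLDER_KEYWORDS = {
    "Projects": ("project",),
    "Backups": ("backup",),
    "Documents": ("document", "report"),
    "Media": ("photo", "video", "music"),
    "Downloads": ("download",),
    "Applications": ("software", "app"),
    "Data": ("dataset", "database"),
    "Web": ("web",),
}


def folder_category(name):
    """Classify a folder name by keyword first, then by date or version pattern."""
    lowered = name.lower()
    for category, keywords in FOLDER_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return category
    if DATE_PATTERN.search(name):
        return "Dated"
    if VERSION_PATTERN.search(name):
        return "Versioned"
    return "Uncategorized"
```

Checking keywords before patterns is a design choice of this sketch: a folder named `backup_2023-05` lands in Backups rather than Dated.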
Operation Modes:
- Move (default): Moves files/folders to target directories
- Copy: Creates copies instead of moving
- Symlink: Creates symbolic links to original files/folders
- Dry-run: Shows what would happen without making changes
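The four modes can be pictured as a single dispatch over standard-library calls. This is a minimal sketch under the assumption that the tool relies on `shutil` and `pathlib`; the `place` function and its parameters are hypothetical names, not part of the tool's interface.

```python
import shutil
from pathlib import Path


def place(src, dest_dir, mode="move", dry_run=False):
    """Move, copy, or symlink src into dest_dir; with dry_run, only report."""
    src = Path(src)
    dest = Path(dest_dir) / src.name
    if dry_run:
        # Nothing on disk is touched in dry-run mode.
        return f"[dry-run] would {mode} {src} -> {dest}"
    dest.parent.mkdir(parents=True, exist_ok=True)
    if mode == "move":
        shutil.move(str(src), str(dest))
    elif mode == "copy":
        if src.is_dir():
            shutil.copytree(src, dest)
        else:
            shutil.copy2(src, dest)  # copy2 also preserves timestamps
    elif mode == "symlink":
        # Link to the absolute original so the link survives a change of cwd.
        dest.symlink_to(src.resolve(), target_is_directory=src.is_dir())
    else:
        raise ValueError(f"unknown mode: {mode}")
    return f"{mode}: {src} -> {dest}"
```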
Cross-Platform Compatibility:
- Works on Windows, macOS, and Linux
Requirements
- Python 3.6 or higher
- No additional libraries required (uses standard library only)
Installation
Download the script and make it executable:
chmod +x file_classifier.py
Or run it directly with Python:
python file_classifier.py [options]
Usage
Basic Usage
python file_classifier.py SOURCE_DIR [TARGET_DIR]
If TARGET_DIR is not specified, files will be organized in a new directory called ./classified.
Common Options
-l, --symlinks Create symlinks instead of moving files
-c, --copy Copy files instead of moving them
-d, --dry-run Show what would be done without actually doing it
-f, --folders Include folders in the classification
Classification Methods
-e, --extensions Classify by file extensions (default behavior)
-t, --time Organize by time attribute
Time-based Organization Options
--time-attr {modified,created,accessed}
Time attribute to use (default: modified)
--time-format FORMAT
Time format for directories (default: '%Y-%m' for year-month)
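To make the two options concrete, the sketch below derives a target directory name from a file's timestamp using `strftime`. The `time_bucket` helper is hypothetical, and the way the tool itself reads the "created" timestamp is an assumption: on Unix, `st_ctime` is the time of the last metadata change rather than creation, so this sketch prefers `st_birthtime` where the platform provides it.

```python
from datetime import datetime
from pathlib import Path


def time_bucket(path, time_attr="modified", time_format="%Y-%m"):
    """Return the directory name a file would be grouped under."""
    stat = Path(path).stat()
    timestamps = {
        "modified": stat.st_mtime,
        "accessed": stat.st_atime,
        # st_birthtime is not available everywhere; fall back to st_ctime.
        "created": getattr(stat, "st_birthtime", stat.st_ctime),
    }
    return datetime.fromtimestamp(timestamps[time_attr]).strftime(time_format)
```

A file last modified on 15 March 2021 yields `2021-03` with the default format and `2021/03/15` with `--time-format "%Y/%m/%d"`.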
Collision Handling Options
--on-conflict {skip,rename,overwrite}
How to handle file conflicts (default: skip)
- skip: Skip files that already exist
- rename: Rename new files (file.txt → file_1.txt)
- overwrite: Replace existing files
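The three strategies can be sketched as a function that decides where a file should be written. The `resolve_conflict` name is hypothetical, and counting upward past `_1` when that name is also taken is an assumption; the description above only shows the `file.txt` → `file_1.txt` case.

```python
from pathlib import Path


def resolve_conflict(dest, on_conflict="skip"):
    """Return the path to write to, or None if the file should be skipped."""
    dest = Path(dest)
    if not dest.exists():
        return dest  # no conflict, use the path as-is
    if on_conflict == "skip":
        return None
    if on_conflict == "overwrite":
        return dest
    if on_conflict == "rename":
        counter = 1
        while True:
            # file.txt -> file_1.txt, then file_2.txt, and so on
            candidate = dest.with_name(f"{dest.stem}_{counter}{dest.suffix}")
            if not candidate.exists():
                return candidate
            counter += 1
    raise ValueError(f"unknown strategy: {on_conflict}")
```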
Examples
See the examples/ directory for detailed, runnable examples:
- Basic Organization - Classify files by extension
- Time-Based - Organize by modification date
- Copy Mode - Copy instead of move
- Dry Run & Conflicts - Preview and handle duplicates
Quick Examples
Extension-based Organization
# Classify all files in Downloads folder by extension
python file_classifier.py ~/Downloads ~/Organized
# Classify files and folders, create copies instead of moving
python file_classifier.py ~/Documents ~/Organized -f -c
# Create symlinks instead of moving files
python file_classifier.py ~/Pictures ~/Organized -l
# Preview what would happen without making any changes
python file_classifier.py ~/Desktop -d
Time-based Organization
# Organize files by their modification time (year-month)
python file_classifier.py ~/Documents ~/TimeOrganized -t
# Organize by creation date with year-month-day format
python file_classifier.py ~/Photos ~/Chronological -t --time-attr created --time-format "%Y-%m-%d"
# Organize files and folders by access time
python file_classifier.py ~/Downloads ~/AccessOrganized -t --time-attr accessed -f
Troubleshooting
Issue: Files are being skipped
Cause: Files already exist in target directory
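Solution: With the default setting (--on-conflict skip), a file whose name already exists in the target is skipped. Check the logs for "already exists" warnings, or pass --on-conflict rename or --on-conflict overwrite to change this behavior.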
Issue: Want to preview changes first
Solution: Use --dry-run flag to see what would happen without making changes:
python file_classifier.py ~/Downloads ~/Organized --dry-run
Issue: Permission denied errors
Cause: Insufficient permissions to read source or write to target directory
Solution:
- Check directory permissions
- Run with appropriate user privileges
- Ensure target directory is writable
Issue: Symlinks not working on Windows
Cause: Symlink creation requires administrator privileges on Windows
Solution: Run terminal as administrator, or use --copy mode instead
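If you want to check in advance whether the current session is allowed to create symlinks, a small probe like the one below can help. It is a standalone sketch, not something the tool provides; `symlinks_supported` is a hypothetical helper.

```python
import tempfile
from pathlib import Path


def symlinks_supported():
    """Try to create a symlink in a temporary directory and report success."""
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "target.txt"
        target.write_text("probe")
        link = Path(tmp) / "link.txt"
        try:
            link.symlink_to(target)
        except (OSError, NotImplementedError):
            # Typically a missing privilege on Windows.
            return False
        return link.is_symlink()
```

When this returns `False`, falling back to --copy avoids the error.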
FAQ
Q: What happens if a file or folder already exists in the target directory?
A: By default, the script skips it and logs a warning message. The file remains in the source directory.
Q: Can I undo the organization?
A: There is currently no built-in undo feature. We recommend:
- Use --dry-run first to preview changes
- Use --copy mode to keep original files
- Keep backups of important directories
Q: Will the organization preserve the directory structure?
A: No, all files are flattened to the corresponding category directories. For hierarchical organization, consider using the time-based organization with a hierarchical format like %Y/%m/%d.
Q: Can I customize file categories?
A: Yes, you can edit the FILE_CATEGORIES dictionary in the script to add or modify categories. Configuration file support is planned for future releases.
Q: Does it work on Windows?
A: Yes! The tool uses Python's pathlib for cross-platform compatibility and works on Windows, macOS, and Linux.
Q: Can I organize files in subdirectories?
A: Currently, only files in the top level of the source directory are processed. Recursive mode is planned for future releases.
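To illustrate what top-level processing means, the sketch below lists only the direct children of a directory and ignores anything nested deeper. It assumes a `pathlib`-based listing; `top_level_files` is a hypothetical helper, not the tool's own function.

```python
from pathlib import Path


def top_level_files(source_dir):
    """Return the files directly inside source_dir, without descending into subfolders."""
    return sorted(p for p in Path(source_dir).iterdir() if p.is_file())
```

A file such as `source/sub/b.txt` is not returned; only `source/a.txt` and its siblings are.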
Q: How do I install the package?
A: Install via pip:
pip install fl-classifier
Then run with:
fl-classifier ~/Downloads ~/Organized
# or
python -m fl_classifier ~/Downloads ~/Organized
License
This utility is released under the MIT License. Feel free to use, modify, and distribute it.
File details
Details for the file fl_classifier-0.2.0.tar.gz.
File metadata
- Download URL: fl_classifier-0.2.0.tar.gz
- Upload date:
- Size: 12.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9eb76f5b0a5eae0ef3be8392046067f953554e88e3ec5b5076e023e45435bfb8 |
| MD5 | 2dee5a5ccd3af7f8198f52501f4b00e9 |
| BLAKE2b-256 | 1aeb8e7c1f48cff94dd6e356098ed748a78284ca9d460b47867ea01fa040f571 |
Provenance
The following attestation bundles were made for fl_classifier-0.2.0.tar.gz:
Publisher: python-publish.yml on bri-anadi/fl-classifier
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: fl_classifier-0.2.0.tar.gz
- Subject digest: 9eb76f5b0a5eae0ef3be8392046067f953554e88e3ec5b5076e023e45435bfb8
- Sigstore transparency entry: 845762980
- Sigstore integration time:
- Permalink: bri-anadi/fl-classifier@ea2b448862c311ddf9dd84c6f5791f26aa25e892
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/bri-anadi
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@ea2b448862c311ddf9dd84c6f5791f26aa25e892
- Trigger Event: release
File details
Details for the file fl_classifier-0.2.0-py3-none-any.whl.
File metadata
- Download URL: fl_classifier-0.2.0-py3-none-any.whl
- Upload date:
- Size: 10.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b5b7a081d774e55ac613db3f5293df8e38b2a0a9540c1a5d33a6ee19ab27091e |
| MD5 | 87a704689db82f73eb7b738ef502ddee |
| BLAKE2b-256 | 1d824bb38586b8d31a375008803e301861427d1779f9744d4accf835715a7eee |
Provenance
The following attestation bundles were made for fl_classifier-0.2.0-py3-none-any.whl:
Publisher: python-publish.yml on bri-anadi/fl-classifier
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: fl_classifier-0.2.0-py3-none-any.whl
- Subject digest: b5b7a081d774e55ac613db3f5293df8e38b2a0a9540c1a5d33a6ee19ab27091e
- Sigstore transparency entry: 845762990
- Sigstore integration time:
- Permalink: bri-anadi/fl-classifier@ea2b448862c311ddf9dd84c6f5791f26aa25e892
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/bri-anadi
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@ea2b448862c311ddf9dd84c6f5791f26aa25e892
- Trigger Event: release