
A utility to sort files by type and date.


Sortium


Sortium is a high-performance, parallelized Python utility for rapidly organizing file systems. It leverages multiple CPU cores to sort thousands of files into clean, categorized directories based on type, modification date, or custom regex patterns.

Designed for both speed and safety, it is memory-efficient for handling massive directories and automatically prevents file overwrites.
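As a rough illustration of the two ideas above (lazy iteration and concurrent moves), here is a minimal standard-library sketch. It is not Sortium's implementation: the helper names `iter_files`, `move_file`, and `move_all` are hypothetical, and a thread pool is used for brevity, which may differ from how Sortium distributes work across cores.

```python
import os
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Iterator


def iter_files(folder: Path) -> Iterator[Path]:
    """Yield regular files in `folder` one at a time instead of building a list."""
    with os.scandir(folder) as entries:
        for entry in entries:
            if entry.is_file():
                yield Path(entry.path)


def move_file(path: Path, dest: Path) -> Path:
    """Move one file into `dest` and return its new location."""
    return Path(shutil.move(str(path), str(dest / path.name)))


def move_all(source: Path, dest: Path, workers: int = 4) -> int:
    """Move every file from `source` into `dest` using a thread pool (assumption)."""
    dest.mkdir(parents=True, exist_ok=True)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        moved = pool.map(lambda p: move_file(p, dest), iter_files(source))
        # Consuming the results also surfaces any exception raised by a worker.
        return sum(1 for _ in moved)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src, dst = Path(tmp) / "inbox", Path(tmp) / "sorted"
        src.mkdir()
        for i in range(3):
            (src / f"file{i}.txt").write_text("demo")
        print(move_all(src, dst), "files moved")
```

Note that `pool.map` submits one task per yielded path up front, so only the directory scan is lazy in this sketch. A bounded queue would be needed to cap the number of pending tasks.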


Key Features

  • Parallel Processing: Utilizes multiple CPU cores to dramatically speed up file moving and organization, especially in large directories.
  • Memory-Efficient: Employs generators to process files one by one, ensuring a tiny memory footprint even with millions of files.
  • Flexible Sorting Methods:
    • sort_by_type: Organize files into categories like Images, Documents, Archives, etc.
    • sort_by_date: Further organize categorized files into date-stamped folders (e.g., 01-Jan-2023).
    • sort_by_regex: Use powerful, custom regex patterns to categorize files recursively.
  • Safe File Operations: Automatically handles file name collisions by appending a counter (e.g., image (1).jpg), preventing accidental data loss.
  • Sort In-Place or to a New Destination: Choose to organize files within their current directory or move them to an entirely separate destination folder.
  • Standalone Utilities: Includes a FileUtils class with helpful methods like recursive file finding (iter_all_files_recursive) and directory flattening (flatten_dir).
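
Two of the behaviors listed above, date-stamped folder names and collision-safe file names, can be illustrated with the standard library alone. The sketch below is not Sortium's implementation. The helper names `date_folder_name` and `unique_destination` are hypothetical and only reproduce the formats shown in the feature list (01-Jan-2023 and image (1).jpg).

```python
import tempfile
from datetime import datetime
from pathlib import Path


def date_folder_name(moment: datetime) -> str:
    """Format a timestamp in the day-month-year style of the 01-Jan-2023 example."""
    # %b is the abbreviated month name; it follows the active locale.
    return moment.strftime("%d-%b-%Y")


def unique_destination(target: Path) -> Path:
    """Return `target`, or a ' (n)' variant if that name is already taken."""
    if not target.exists():
        return target
    counter = 1
    while True:
        candidate = target.with_name(f"{target.stem} ({counter}){target.suffix}")
        if not candidate.exists():
            return candidate
        counter += 1


if __name__ == "__main__":
    print(date_folder_name(datetime(2023, 1, 1)))
    with tempfile.TemporaryDirectory() as tmp:
        existing = Path(tmp) / "image.jpg"
        existing.touch()
        # The name is taken, so a counter is appended before the extension.
        print(unique_destination(existing).name)
```

The existence check and the later move are separate steps, so in a parallel setting two workers could pick the same free name. A real implementation would need extra coordination to rule that out.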

Installation

From PyPI

To install the latest stable version from PyPI:

pip install sortium

From Source

To install the latest development version from the repository:

git clone https://github.com/Sarthak-G0yal/Sortium.git
cd Sortium
pip install -e .

Getting Started: Usage Examples

Here are a few examples to get you started quickly.

Example 1: Sort Files by Type

This is the most common use case. It organizes all files in a folder into subdirectories like Images, Documents, Videos, etc.

from sortium.sorter import Sorter

# The folder you want to clean up
source_directory = "./my_messy_downloads_folder"

# Create a Sorter instance
sorter = Sorter()

# Run the sort!
print(f"Sorting files in {source_directory} by type...")
sorter.sort_by_type(source_directory)
print("Done!")
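
To make the idea of type-based categories concrete, the following sketch maps file extensions to folder names. The mapping, the `Others` fallback, and the function name `category_for` are assumptions made for illustration. Sortium's actual category table may differ.

```python
from pathlib import Path

# Illustrative mapping only; Sortium's real category table may differ.
CATEGORIES = {
    "Images": {".jpg", ".jpeg", ".png", ".gif"},
    "Documents": {".pdf", ".docx", ".txt"},
    "Videos": {".mp4", ".mkv", ".mov"},
    "Archives": {".zip", ".tar", ".gz"},
}


def category_for(filename: str, default: str = "Others") -> str:
    """Return the category whose extension set contains the file's suffix."""
    # Only the final suffix is considered, compared case-insensitively.
    suffix = Path(filename).suffix.lower()
    for category, extensions in CATEGORIES.items():
        if suffix in extensions:
            return category
    return default


if __name__ == "__main__":
    for name in ["photo.JPG", "report.pdf", "backup.tar.gz", "README"]:
        print(name, "->", category_for(name))
```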

Example 2: Sort Files to a Different Destination

Organize files from a source folder and move the categorized results to a completely different location.

from sortium.sorter import Sorter

source_dir = "./my_source_files"
destination_dir = "./organized_archive"

sorter = Sorter()

# Files from source_dir will be moved to categorized folders inside destination_dir
sorter.sort_by_type(source_dir, dest_folder_path=destination_dir)

Example 3: Advanced Sorting with Regex

Recursively scan a directory and sort files based on custom patterns. This is great for organizing project files, logs, or datasets.

from sortium.sorter import Sorter

project_folder = "./my_data_science_project"
sorted_output = "./sorted_project_files"

# Define categories and their corresponding regex patterns
regex_map = {
    "Datasets": r".*\.csv$",
    "Notebooks": r".*\.ipynb$",
    "Python_Code": r".*\.py$",
    "Final_Reports": r"final_report_.*\.pdf$"
}

sorter = Sorter()
sorter.sort_by_regex(project_folder, regex_map, sorted_output)
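
Before moving any files, it can help to preview which category each name would fall into. The sketch below uses only the standard `re` module. The function `preview_category` is hypothetical, and it assumes patterns are tested against the bare file name with `re.match`, with the first matching category winning. Sortium's own matching rules may differ.

```python
import re
from typing import Dict, Optional


def preview_category(filename: str, regex_map: Dict[str, str]) -> Optional[str]:
    """Return the first category whose pattern matches, or None if none does."""
    for category, pattern in regex_map.items():
        # re.match anchors at the start of the string (an assumption here).
        if re.match(pattern, filename):
            return category
    return None


regex_map = {
    "Datasets": r".*\.csv$",
    "Notebooks": r".*\.ipynb$",
    "Python_Code": r".*\.py$",
    "Final_Reports": r"final_report_.*\.pdf$",
}

if __name__ == "__main__":
    for name in ["sales.csv", "analysis.ipynb", "final_report_q3.pdf", "notes.txt"]:
        print(name, "->", preview_category(name, regex_map))
```

Because `re.match` anchors at the start of the string, a name such as `draft_final_report_q3.pdf` would not match the Final_Reports pattern in this sketch. Check Sortium's documentation to confirm whether it matches against the file name or the full path.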

Running Tests

To run the full test suite and generate a coverage report, first install the development dependencies:

pip install pytest pytest-cov

Then, from the project's root directory, run:

pytest --cov=sortium

For more details on the test structure, see the Test Suite README.


Documentation

This project uses Sphinx for documentation.

  • Online Documentation: View Documentation

  • To build the documentation locally:

    # Navigate to the docs directory
    cd docs
    # Install documentation requirements
    pip install -r requirements.txt
    # Build the HTML pages
    make html
    

    View the generated files at docs/_build/html/index.html.


Contributing

Contributions are welcome! Please follow these steps to contribute:

  1. Fork the repository.
  2. Create a new branch for your feature or fix (feature/my-feature or fix/my-fix).
  3. Write tests that cover your changes.
  4. Commit your changes with clear messages that follow the Conventional Commits specification.
  5. Open a pull request with a detailed description of your work.

Ensure all code is linted and tested before submitting.


Author

Sarthak Goyal


License

This project is licensed under the GNU General Public License v3.0.
