Scan directories, apply ignore rules, and chunk file contents.
FolderScanner
FolderScanner is a Python package that enables efficient scanning of directory structures, applying ignore rules similar to .gitignore, and chunking file contents for processing. It is designed to handle large datasets and is well suited to pre-processing tasks in data analysis or machine learning pipelines.
Features
- Recursively scans specified directories.
- Applies ignore patterns to skip specified files and directories.
- Chunks file contents and yields them with their paths for efficient processing.
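The three features above can be pictured with a minimal sketch built only on the standard library. This is not FolderScanner's actual implementation: the names `is_ignored` and `sketch_scan`, the `chunk_size` parameter, the `(relative_path, chunk)` tuple shape, and the use of `fnmatch` for pattern matching are all assumptions made for illustration.

```python
import fnmatch
import os
from typing import Iterable, Iterator, List, Tuple


def is_ignored(rel_path: str, patterns: Iterable[str]) -> bool:
    """Hypothetical helper: True if the path or any path component matches a pattern."""
    rel_path = rel_path.replace(os.sep, "/")
    parts = rel_path.split("/")
    for pattern in patterns:
        # Match the whole relative path (covers patterns such as 'tmp/*').
        if fnmatch.fnmatchcase(rel_path, pattern):
            return True
        # Match single components (covers patterns such as '.git' or '*.log').
        if any(fnmatch.fnmatchcase(part, pattern) for part in parts):
            return True
    return False


def sketch_scan(
    root: str, patterns: Iterable[str], chunk_size: int = 1024
) -> Iterator[Tuple[str, str]]:
    """Hypothetical scanner: yield (relative_path, chunk) pairs for non-ignored files."""
    pattern_list: List[str] = list(patterns)
    for dirpath, dirnames, filenames in os.walk(root):
        rel_dir = os.path.relpath(dirpath, root)
        # Prune ignored directories in place so os.walk never descends into them.
        dirnames[:] = sorted(
            d for d in dirnames
            if not is_ignored(os.path.normpath(os.path.join(rel_dir, d)), pattern_list)
        )
        for name in sorted(filenames):
            rel_file = os.path.normpath(os.path.join(rel_dir, name))
            if is_ignored(rel_file, pattern_list):
                continue
            full_path = os.path.join(dirpath, name)
            with open(full_path, "r", encoding="utf-8", errors="replace") as fh:
                # Read fixed-size pieces so large files are never loaded whole.
                while True:
                    chunk = fh.read(chunk_size)
                    if not chunk:
                        break
                    yield rel_file, chunk
```

Because the sketch is a generator, only one chunk is held in memory at a time, which is the property that makes this style of scanning suitable for large directory trees.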
Installation
To install FolderScanner, use pip:
pip install git+https://github.com/chigwell/FolderScanner.git
Usage
Import and use FolderScanner in your Python projects as follows:
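from folder_scanner import scan_directory

# Root directory to scan recursively
core_folder = '/path/to/your/projects'
# Files and directories matching these patterns are skipped
ignore_patterns = ['.git', '.dockerignore', '*.log', 'tmp/*']

# Iterate over the chunks yielded by the scanner
for file_chunk in scan_directory(core_folder, ignore_patterns):
    print(file_chunk)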
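In practice, chunks are usually aggregated or forwarded rather than printed. The sketch below assumes each yielded item is a `(path, chunk)` pair of strings; the README does not specify the exact shape, so check it against the installed version first. The function `total_chars_per_file` and the `sample` data are hypothetical stand-ins that let the example run without the package.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def total_chars_per_file(chunks: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Sum the length of every chunk, grouped by the file it came from."""
    totals: Dict[str, int] = defaultdict(int)
    for path, chunk in chunks:
        totals[path] += len(chunk)
    return dict(totals)


# Stand-in for the scanner's output (assumed shape, not real FolderScanner data).
sample = [
    ("a.py", "import os\n"),
    ("a.py", "print(1)\n"),
    ("b.md", "# Title\n"),
]

print(total_chars_per_file(sample))  # {'a.py': 19, 'b.md': 8}
```

If the assumed shape holds, the same function could consume the scanner's output directly in place of `sample`.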
Contributing
Contributions are welcome! Please feel free to submit pull requests, report bugs, or suggest features on the GitHub issues page.
License
This project is licensed under the MIT License - see the LICENSE file for details.
File details
Details for the file FolderScanner-0.1.0.tar.gz.
File metadata
- Download URL: FolderScanner-0.1.0.tar.gz
- Upload date:
- Size: 3.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.9
File hashes
Algorithm | Hash digest
---|---
SHA256 | fae914eaebfbd4978e282334f5dde8bd6cec99503581a3a6d802a63d31350b64
MD5 | ad0785135795ff8c6673077dacd0ce83
BLAKE2b-256 | 93909562237c291e1c7d91251c6b189426aabd1705b36ec871e64a4d405ddd87
File details
Details for the file FolderScanner-0.1.0-py3-none-any.whl.
File metadata
- Download URL: FolderScanner-0.1.0-py3-none-any.whl
- Upload date:
- Size: 3.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.9
File hashes
Algorithm | Hash digest
---|---
SHA256 | 83bc5c8446f8a95bd786d4b37a9d33da7e7e72bfc6c214186f4115fb5e429740
MD5 | 29b08dc925d18b8b5ad2ec6920d071e1
BLAKE2b-256 | e80222e474be7f5d93bca63b5a6fe3e5928e029ed1b8cea69811571f0edb9c18
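A downloaded distribution file can be checked against the published SHA256 digests with Python's standard `hashlib` module. The helper names `sha256_of_file` and `matches_published_hash` are hypothetical, and the file path in the usage note is only a placeholder for wherever the download was saved.

```python
import hashlib


def sha256_of_file(path: str, block_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in fixed-size blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for block in iter(lambda: fh.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_published_hash(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_published_hash("FolderScanner-0.1.0.tar.gz", "fae914ea...")`, given the full SHA256 value listed for that file, should return `True` for an intact download.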