Detects files in a directory and its subdirectories.
FileDetect
Installation
pip install filedetect
Description
filedetect is a Python package that provides a simple and efficient way to detect file formats in a directory tree.
One can specify the suffixes to detect, the maximum depth of detection, and the formats to look for.
For convenience, common suffixes are predefined for the following formats:
- video => {".mp4", ".mkv", ".avi", ".mov", ".wmv", ".flv", ".webm", ".m4v", ".mpeg", ".mpg", ".3gp", ".3g2", ".vob", ".ts", ".m2ts", ".rmvb", ".mxf", ".drc", ".amv", ".f4v", ".svi", ".m1v", ".m2v"}
- audio => {".mp3", ".wav", ".flac", ".aac", ".ogg", ".wma", ".m4a", ".opus", ".alac", ".aiff"}
- image => {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".tiff", ".webp", ".svg", ".heif", ".raw"}
- plain_text => {".txt"}
- csv => {".csv"}
- json => {".json"}
- html => {".html", ".htm"}
- xml => {".xml"}
- excel => {".xls", ".xlsx", ".odf", ".ods"}
- pdf => {".pdf"}
- doc => {".doc", ".docx", ".odt"}
- ppt => {".ppt", ".pptx", ".odp"}
- archive => {".zip", ".tar", ".gz", ".bz2", ".xz", ".7z", ".rar", ".tar.gz", ".tar.bz2", ".tar.xz"}
- text => {".txt", ".csv", ".json", ".xml", ".html", ".htm", ".py", ".java", ".c", ".cpp", ".h", ".js", ".css", ".md", ".rst", ".tex", ".sql", ".css", ".yaml", ".yml", ".toml", ".ini", ".properties", ".log"}
You can also detect your own suffixes by passing a set of suffixes to the suffixes parameter instead of the format parameter.
Suggestions for new formats, or updates to existing ones, are welcome: open a pull request or an issue on the GitHub repository.
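To make the idea of suffix-based detection with a depth limit concrete, here is a minimal sketch using only the standard library. It is not filedetect's implementation: the function name find_files and the depth convention (0 means only the top directory, -1 means unlimited) are assumptions made for illustration.

```python
from pathlib import Path


def find_files(root, suffixes, max_depth=-1):
    """Return files under root whose name ends with one of the suffixes.

    Hypothetical helper, not part of filedetect.
    Assumed convention: max_depth=-1 is unlimited, 0 scans only root itself.
    """
    wanted = tuple(s.lower() for s in suffixes)
    matches = []

    def walk(directory, depth):
        # Sort entries so the result order is deterministic.
        for entry in sorted(directory.iterdir()):
            if entry.is_dir():
                # Skip symlinked directories to avoid cycles.
                if entry.is_symlink():
                    continue
                if max_depth < 0 or depth < max_depth:
                    walk(entry, depth + 1)
            elif entry.name.lower().endswith(wanted):
                # Matching on the full name also handles ".tar.gz".
                matches.append(entry)

    walk(Path(root), 0)
    return matches
```

Matching on the end of the file name, rather than on Path.suffix, is a deliberate choice here so that multi-part suffixes such as ".tar.gz" can be matched as a whole.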
Usage
CLI
filedetect [-h] [--list_formats] [--version] path [--format {video,image,audio,plain_text,csv,json,html,xml,excel,pdf,doc,ppt,archive,text,all,}] [--deep int] [--only_stems stem1,stem2] [--suffixes sfx1,sfx2]
- The path argument is required and should be the path to the directory you want to scan.
- You can specify either --format or --suffixes, but not both at the same time. If neither is specified, all formats are detected.
- The optional --only_stems option restricts detection to files whose stem is in the given set.
- The --deep option sets the maximum depth of detection. If not specified, the default value is -1, which means unlimited depth.
- The --list_formats option lists all the formats available in the package with their corresponding suffixes, then exits.
- The --version option displays the version of the package, then exits.
- The --help option displays the help message, then exits.
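The option rules above (one required path, --format and --suffixes mutually exclusive, comma-separated values, --deep defaulting to -1) can be mirrored with argparse. This is only a sketch of an equivalent parser, not filedetect's actual CLI code; build_parser, split_csv and the program name filedetect-sketch are hypothetical.

```python
import argparse


def split_csv(value):
    """Turn 'a,b,c' into {'a', 'b', 'c'}, dropping empty items."""
    return {item.strip() for item in value.split(",") if item.strip()}


def build_parser():
    """Build a hypothetical parser that mirrors the documented options."""
    parser = argparse.ArgumentParser(prog="filedetect-sketch")
    parser.add_argument("path", help="directory to scan")

    # --format and --suffixes cannot be given together.
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--format", default=None)
    group.add_argument("--suffixes", type=split_csv, default=None)

    parser.add_argument("--deep", type=int, default=-1)
    parser.add_argument("--only_stems", type=split_csv, default=None)
    return parser
```

With this sketch, passing both --format and --suffixes makes argparse print an error and exit, and omitting both leaves the two values at None, which a caller could treat as "all formats".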
Python
from filedetect import FileDetect

detector = FileDetect.run(
    path="path/to/dir",
    format=None,  # ["video", "audio", "image", "text", "csv", "json", "html"] | None for all formats
    deep=-1,  # positive int for maximum depth of detection, or -1 for unlimited
    only_stems=None,  # None | a set of stems to detect
    suffixes=None,  # None | a set of suffixes to detect (in place of format)
)
print(detector.results)
or
from filedetect import FileDetect

detector = FileDetect(
    format=None,  # ["video", "audio", "image", "text", "csv", "json", "html"] | None for all formats
    deep=-1,  # positive int for maximum depth of detection, or -1 for unlimited
    only_stems=None,  # None | a set of stems to detect
    suffixes=None,  # None | a set of suffixes to detect (in place of format)
)
detector.run(
    path="path/to/dir",
)
detector.run(
    path="another/path/to/dir",
)
...
print(detector.results)
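The only_stems parameter is described as a set of stems to filter for. Assuming a stem means the file name without its final suffix, as pathlib defines it, the filtering step might look like the sketch below. The helper filter_by_stem is hypothetical and is not part of the filedetect API.

```python
from pathlib import Path


def filter_by_stem(paths, only_stems=None):
    """Keep only paths whose stem is in only_stems (None keeps everything).

    Hypothetical helper; assumes "stem" follows pathlib's definition.
    """
    paths = [Path(p) for p in paths]
    if only_stems is None:
        return paths
    return [p for p in paths if p.stem in only_stems]


candidates = ["a/report.pdf", "a/notes.txt", "b/report.csv"]
reports = filter_by_stem(candidates, {"report"})
```

Note that pathlib strips only the last suffix, so under this assumption the stem of "archive.tar.gz" is "archive.tar" rather than "archive".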
License
filedetect is distributed under the terms of the AGPLv3 license.