Python library to find missing files in a scheduled delivery.
filehole
A simple, quick way to find missing files in a scheduled delivery, particularly when dealing with a large number of files or a long delivery history.
Dependencies
- Python 3.8.9 or higher
- NumPy 1.23.0
- holidays 0.14.2
Install
The latest stable version can always be installed or updated via pip:
$ pip install filehole
Usage
filehole(
    path_to_files: str,
    file_system: Globable,
    date_pattern: str,
    date_format: str,
    country: str,
    subdivision: str = None,
    start_date: str = f"{date.today().year}-01-01",
    end_date: str = date.today().strftime("%Y-%m-%d"),
    week_schedule: str = "1111100",
    frequency: str = "D",
    repetition: int = 1,
    position: int = 1,
)
Parameters:
- path_to_files: Wildcard-enabled string used to search for files.
- file_system: Module that provides a glob function, such as glob in a local environment or adls in a cloud environment.
- date_pattern: Regular expression matching the way the date is written in file or directory names.
- date_format: Standard date format of the date written in file or directory names.
- country: Country name or abbreviation used to select the holiday calendar. For the exhaustive list of available holiday calendars, refer to the documentation of the holidays Python library (https://pypi.org/project/holidays/).
- subdivision: Province, state, etc. used to select the holiday calendar. The available options are listed in the documentation of the holidays Python library (https://pypi.org/project/holidays/).
- start_date: Start of the search period. Format: '%Y-%m-%d'. Defaults to the first day of the current year.
- end_date: End of the search period. Format: '%Y-%m-%d'. Defaults to the current date.
- week_schedule: String of 7 digits, each 0 or 1, where 1 represents a working day and 0 a non-working day. The week starts on Monday. By default, the working week runs from Monday to Friday inclusive: '1111100'.
- frequency: Takes 'D' for daily delivery, 'W' for weekly delivery and 'M' for monthly delivery.
- repetition: Default value: 1. Used only for weekly and monthly file delivery, e.g. repetition=1 -> every week/month, repetition=2 -> every two weeks/months.
- position: Takes 1 for the first business day of the month or -1 for the last business day of the month.
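To make the week_schedule and position semantics concrete, here is a minimal sketch using only the standard library. It is not filehole's own code: the helper names is_working_day and business_day_of_month are hypothetical, and holidays are passed in as a plain set of dates rather than taken from the holidays library.

```python
import calendar
from datetime import date


def is_working_day(day, week_schedule="1111100", holidays=frozenset()):
    """Hypothetical helper: True if `day` is a working day."""
    # date.weekday() returns 0 for Monday, which lines up with the
    # first digit of week_schedule.
    return week_schedule[day.weekday()] == "1" and day not in holidays


def business_day_of_month(year, month, position=1,
                          week_schedule="1111100", holidays=frozenset()):
    """Hypothetical helper: first (1) or last (-1) working day of a month."""
    last_day = calendar.monthrange(year, month)[1]
    days = [date(year, month, d) for d in range(1, last_day + 1)]
    if position == -1:
        days.reverse()  # scan backwards from the end of the month
    for day in days:
        if is_working_day(day, week_schedule, holidays):
            return day
    return None  # no working day in this month under the given schedule


# 1 October 2022 is a Saturday, so the first business day is Monday the 3rd.
print(business_day_of_month(2022, 10, position=1))  # 2022-10-03
```

With the default schedule, Saturday and Sunday map to the two trailing zeros, so a schedule such as '1111110' would treat Saturday as a working day as well.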
Description:
Retrieve list of files from a given location.
Extract dates from filenames.
Create a calendar of holidays.
Create a set of expected dates and compare them to the extracted dates.
Return a set of missing dates.
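The steps above can be sketched roughly as follows for the daily case. This is an illustration under stated assumptions, not the library's implementation: find_missing_dates is a hypothetical name, the file listing is replaced by a plain list of names, and the holiday calendar is a hand-written set instead of one built with the holidays library.

```python
import re
from datetime import date, datetime, timedelta


def find_missing_dates(filenames, date_pattern, date_format, start, end,
                       week_schedule="1111100", holidays=frozenset()):
    """Hypothetical stand-in for the steps above (daily frequency only)."""
    # Extract the first date found in each file name.
    found = set()
    for name in filenames:
        match = re.search(date_pattern, name)
        if match:
            found.add(datetime.strptime(match.group(), date_format).date())

    # Build the set of expected working days between start and end inclusive.
    expected = set()
    day = start
    while day <= end:
        if week_schedule[day.weekday()] == "1" and day not in holidays:
            expected.add(day)
        day += timedelta(days=1)

    # Expected dates that have no matching file.
    return expected - found


# Assumed sample data: one week of July 2022, with 14 July as a holiday.
names = ["report_20220711.txt", "report_20220715.txt"]
missing = find_missing_dates(
    names, r"[0-9]{8}", "%Y%m%d",
    start=date(2022, 7, 11), end=date(2022, 7, 15),
    holidays={date(2022, 7, 14)},
)
print(sorted(missing))  # [datetime.date(2022, 7, 12), datetime.date(2022, 7, 13)]
```

Using set difference keeps the comparison independent of file order and of duplicate deliveries for the same date.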
Example:
- Daily file delivery for July 2022 according to the French holiday calendar, assuming that the files for 12 and 13 July are missing:
> import glob
> filehole(
    path_to_files="my_file_path/*.txt",
    file_system=glob,
    date_pattern=r"[0-9]{8}",
    date_format="%Y%m%d",
    country="FR",
    start_date="2022-07-01",
    end_date="2022-07-31",
    week_schedule="1111100",
    frequency="D",
)
> {datetime.date(2022, 7, 12), datetime.date(2022, 7, 13)}
Limitations
- All files are expected to be at the same directory level.
- The files, or the directory containing them, must have a date in their name.
- The current version supports only daily, weekly and monthly file delivery.
License
This project is under the MIT License.