Helps validate the integrity of data backups/exports.
spot_check_files
This is a tool to help validate the integrity of a set of files, e.g. data backups/exports.
- Checks recognized file types for errors, e.g. invalid json.
- Generates thumbnails of files when possible.
- Displays statistics about file types and unrecognized files.
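The statistics step in the list above can be approximated with a short script. This is a minimal sketch, not the tool's own implementation: the `summarize` function and the `RECOGNIZED` set are hypothetical names chosen for illustration, and the extensions in the set are an assumption.

```python
from collections import Counter
from pathlib import Path

# Assumption: a small, illustrative set of "recognized" extensions.
# The real tool's list is not reproduced here.
RECOGNIZED = {".json", ".xml", ".csv", ".txt"}


def summarize(root):
    """Count files under root by extension and collect unrecognized ones."""
    counts = Counter()
    unrecognized = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        ext = path.suffix.lower()
        # Files without an extension are grouped under "(none)".
        counts[ext or "(none)"] += 1
        if ext not in RECOGNIZED:
            unrecognized.append(path)
    return counts, unrecognized
```

Calling `summarize("some/backup/dir")` returns a `Counter` keyed by lowercase extension and a list of paths whose extension is not in the set.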
It produces a report like the following in the terminal (seeing images in the terminal requires iTerm2):
Or as HTML:
Usage
Install:
- Install python3 and pip
pip3 install spot_check_files[imgcat]
- imgcat is optional and enables support for displaying thumbnails in iTerm2 on OS X
Run:
spotcheck PATH
This will output basic stats and any errors the tool detects in the given files/directories. If you're using iTerm2 on Mac, it will also show thumbnails of files.
Alternatively, you can generate an HTML report:
spotcheck -H PATH > out.html
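If you want to produce the HTML report from a script rather than a shell redirect, something like the following may work. It is a sketch that uses only the `spotcheck PATH` and `-H` forms shown above; the helper names `build_command` and `write_html_report` are hypothetical, and it assumes `spotcheck` is on your `PATH`.

```python
import subprocess
from pathlib import Path


def build_command(target, html=False):
    """Build the spotcheck argument list for a file or directory."""
    cmd = ["spotcheck"]
    if html:
        cmd.append("-H")  # request an HTML report on stdout
    cmd.append(str(target))
    return cmd


def write_html_report(target, out_file="out.html"):
    """Run `spotcheck -H target` and save its stdout to out_file.

    The exit code is returned unchecked, because this README does not
    document what spotcheck's exit code means.
    """
    result = subprocess.run(build_command(target, html=True), stdout=subprocess.PIPE)
    Path(out_file).write_bytes(result.stdout)
    return result.returncode
```

For example, `write_html_report("backups/")` is the scripted equivalent of `spotcheck -H backups/ > out.html`.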
The full list of options can be seen here or by running spotcheck --help.
This tool can also be used programmatically.
The main entry point for the library is the CheckerRunner class in spot_check_files.checker.
You can add support for new file types by subclassing the Checker class from that module.
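The exact interface of `Checker` and `CheckerRunner` is not documented in this README, so the example below does not import the library. It is a standalone sketch of the general pattern (a base class, a subclass per file type, and a runner that picks a checker by extension). Every name in it (`SketchChecker`, `IniChecker`, `SketchRunner`, `check`, `extensions`) is hypothetical; check the source of `spot_check_files.checker` for the real method names.

```python
import configparser
from abc import ABC, abstractmethod
from pathlib import Path


class SketchChecker(ABC):
    """Stand-in base class for illustration; NOT the library's Checker."""

    extensions = ()  # file extensions this checker handles

    @abstractmethod
    def check(self, path):
        """Return a list of error strings (empty if the file looks fine)."""


class IniChecker(SketchChecker):
    """Example of adding a new file type: .ini files."""

    extensions = (".ini",)

    def check(self, path):
        parser = configparser.ConfigParser()
        try:
            parser.read_string(Path(path).read_text(encoding="utf-8"))
        except (configparser.Error, UnicodeDecodeError) as exc:
            return [f"{path}: {exc}"]
        return []


class SketchRunner:
    """Stand-in runner that dispatches on file extension."""

    def __init__(self, checkers):
        self.checkers = list(checkers)

    def check_path(self, path):
        ext = Path(path).suffix.lower()
        for checker in self.checkers:
            if ext in checker.extensions:
                return checker.check(path)
        return []  # unrecognized type: nothing to report in this sketch
```

With this shape, supporting another format means writing one more subclass and passing an instance to the runner.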
Supported file types
The command-line tool currently relies entirely on file extension to determine file types.
Type | Support |
---|---|
Archive files | Recursively checks all the files in the archive (including other archives) |
CSV files | Checks that the CSV dialect can be detected and read by Python, and builds a thumbnail |
Image files | Checks that the file can be loaded by the Python imaging library Pillow, and builds a thumbnail |
JSON files: .json | Checks that the json can be parsed, and builds a thumbnail of the pretty-printed json |
Text files | Treating the file as plaintext, builds a thumbnail |
XML files: .xml | Checks that the xml can be parsed, and builds a thumbnail of the pretty-printed xml |
anything supported by OS X Quick Look (HTML, Office docs, ...) | OS X ONLY: generates thumbnails using Quick Look. This greatly increases the number of supported file types. However, it's slow. |
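To make the table concrete, here is a rough sketch of extension-based checking for a few of the rows, using only the Python standard library. It is an illustration under stated assumptions, not the tool's code: `check_bytes` and `CHECKERS` are hypothetical names, the `.zip` and `.csv` extensions are assumed (the table above does not list them), and thumbnails are omitted.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET
import zipfile
from pathlib import PurePosixPath


def check_json(data):
    json.loads(data.decode("utf-8"))


def check_xml(data):
    ET.fromstring(data)


def check_csv(data):
    text = data.decode("utf-8")
    dialect = csv.Sniffer().sniff(text[:4096])  # raises csv.Error if undetectable
    for _ in csv.reader(io.StringIO(text), dialect):
        pass  # reading every row surfaces parse errors


# Assumption: the extension alone decides which check runs.
CHECKERS = {".json": check_json, ".xml": check_xml, ".csv": check_csv}


def check_bytes(name, data):
    """Return a list of (name, error message) pairs for one file's contents."""
    ext = PurePosixPath(name).suffix.lower()
    if ext == ".zip":
        errors = []
        try:
            with zipfile.ZipFile(io.BytesIO(data)) as archive:
                for member in archive.namelist():
                    if member.endswith("/"):
                        continue  # skip directory entries
                    # Recurse, so archives inside archives are checked too.
                    errors += check_bytes(f"{name}/{member}", archive.read(member))
        except zipfile.BadZipFile as exc:
            errors.append((name, str(exc)))
        return errors
    checker = CHECKERS.get(ext)
    if checker is None:
        return []  # unrecognized type: nothing to validate in this sketch
    try:
        checker(data)
    except (ValueError, ET.ParseError, csv.Error) as exc:
        return [(name, str(exc))]
    return []
```

Errors are reported with a path-like name such as `backup.zip/nested.zip/bad.xml`, so a failure inside a nested archive can be traced back to its container.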
Development
Setup:
- Install python3 and pip
- Clone the repo
- I recommend creating a venv:
cd spot_check_files
python3 -m venv venv
source venv/bin/activate
- Install dependencies:
pip install .
pip install -r requirements-dev.txt
To run tests:
PYTHONPATH=src pytest
(Overriding PYTHONPATH as shown ensures the tests run against the code in the src/ directory rather than the installed copy of the package.)
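If you want to see the effect of the PYTHONPATH override for yourself, the following sketch starts a child interpreter and returns its `sys.path`. The helper name `child_sys_path` is hypothetical; the behaviour it demonstrates is standard Python (PYTHONPATH entries are searched before the default paths, which include site-packages).

```python
import json
import os
import subprocess
import sys


def child_sys_path(pythonpath):
    """Return sys.path as seen by a child interpreter run with PYTHONPATH set."""
    env = dict(os.environ, PYTHONPATH=pythonpath)
    result = subprocess.run(
        [sys.executable, "-c", "import sys, json; print(json.dumps(sys.path))"],
        env=env,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return json.loads(result.stdout)
```

Running `child_sys_path(os.path.abspath("src"))` from the repository root should list the `src` directory ahead of the site-packages entry, which is why imports resolve to the working copy first.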
To run the CLI:
PYTHONPATH=src python -m spot_check_files ...
Contributing
Bug reports and pull requests are welcome on GitHub at https://github.com/brokensandals/spot_check_files.
License
This is available as open source under the terms of the MIT License.
This package includes and uses a copy of the Monoid font, which is also MIT-licensed.