Directory tree metadata parser using Apache Tika
Project description
tikatree
tikatree parses all files in a directory and creates:
- _metadata.json - A single file with the metadata from each file that was parsed
- _file_tree.json and _file_tree.csv - A list of all files and directories with some basic information. The JSON file is easy to read; the CSV file is meant for importing into Excel and similar tools
- _directory_tree.txt - A graphical representation of the directory
- .sfv - A CRC32 checksum file
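For readers unfamiliar with the .sfv format, here is a minimal sketch of how a CRC32 entry can be computed with Python's standard library. The function names crc32_of_file and sfv_line are hypothetical and are not taken from tikatree's source; the sketch only illustrates the common "file name, then checksum" layout of SFV files.

```python
import zlib
from pathlib import Path


def crc32_of_file(path, chunk_size=65536):
    """Return the CRC32 of a file as an 8-digit uppercase hex string."""
    crc = 0
    with open(path, "rb") as handle:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)
    return f"{crc & 0xFFFFFFFF:08X}"


def sfv_line(path):
    """Format one SFV entry: file name, a space, then the checksum."""
    # Hypothetical helper: tikatree's actual line format may differ.
    return f"{Path(path).name} {crc32_of_file(path)}"
```

Calling sfv_line on each file in a directory and writing the results one per line would give a file that typical SFV checkers can read.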
Installation
pip install tikatree
tikatree uses tika-python for interacting with Apache Tika. You may need to refer to the tika-python documentation if you have any issues with Tika.
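As a rough illustration of what parsing through tika-python looks like, the sketch below walks a directory and collects the metadata for each file. parser.from_file is tika-python's entry point for parsing a file and returns a dict that includes a "metadata" key; the collect_metadata helper and its parse parameter are hypothetical and do not reflect tikatree's internals. Running it with the default parser requires Java and a reachable Tika server.

```python
from pathlib import Path


def collect_metadata(directory, parse=None):
    """Map each file's relative path to the metadata returned by `parse`."""
    if parse is None:
        # Imported lazily so the helper can be exercised without Tika installed.
        from tika import parser
        parse = parser.from_file

    root = Path(directory)
    results = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            parsed = parse(str(path))
            # Assumption: the result is a dict with a "metadata" entry.
            results[path.relative_to(root).as_posix()] = parsed.get("metadata")
    return results
```

Passing a different callable as parse makes it possible to try the directory walk without starting Tika at all.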
Usage
Open a command line and type tikatree <directory>. By default it creates all output files at or above that directory. To target multiple directories, separate them with spaces on the command line.
usage: tikatree [-h] [-v] [-d] [-e EXCLUDE [EXCLUDE ...]] [-f] [-k] [-m] [-nm] [-s] [-y] DIRECTORY [DIRECTORY ...]

A directory tree metadata parser using Apache Tika, by default it runs arguments: -d, -f, -m, -s

positional arguments:
  DIRECTORY             directory(s) to parse

optional arguments:
  -h, --help            show this help message and exit
  -v, --version         show program's version number and exit
  -d, --directorytree   create directory tree
  -e EXCLUDE [EXCLUDE ...], --exclude EXCLUDE [EXCLUDE ...]
                        directory(s) to exclude, includes subdirectories
  -f, --filetree        creates a json and csv file tree
  -k, --kill            kill Tika process after each directory parsed
  -m, --metadata        parse metadata
  -nm, --newmetadata    create individual metadata files in a 'tikatree' directory
  -s, --sfv             create sfv file
  -y, --yes             automatically overwrite older files
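To script the command line above, one option is to build the argument list in Python and hand it to subprocess.run. This is a sketch under the assumption that tikatree is installed and on the PATH; build_command and run_tikatree are hypothetical helpers, and only flags listed in the help text are used.

```python
import subprocess


def build_command(directories, exclude=(), overwrite=False):
    """Assemble a tikatree invocation from options shown in the help text."""
    command = ["tikatree", *directories]
    if overwrite:
        # -y overwrites older output files without prompting.
        command.append("-y")
    if exclude:
        # -e takes a variable number of values, so it is placed last here
        # to keep it from swallowing the directories to parse.
        command += ["-e", *exclude]
    return command


def run_tikatree(directories, **options):
    """Run tikatree and raise CalledProcessError on a non-zero exit code."""
    return subprocess.run(build_command(directories, **options), check=True)
```

For example, run_tikatree(["photos", "documents"], exclude=["photos/raw"], overwrite=True) would run tikatree photos documents -y -e photos/raw.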
Example
I've included some output examples in the output_examples folder.
Windows Fixes
When files are parsed too quickly, connections to Apache Tika can fail. To work around this, run these commands in PowerShell as Administrator:
$KeyPath = "HKLM:\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"
# Raise the highest port number available for outgoing connections
Set-ItemProperty -Path $KeyPath -Name "MaxUserPort" -Value 65534
# Release closed connections after 30 seconds in TIME_WAIT
Set-ItemProperty -Path $KeyPath -Name "TcpTimedWaitDelay" -Value 30
# Reportedly needed on some Windows versions for TcpTimedWaitDelay to take effect
Set-ItemProperty -Path $KeyPath -Name "StrictTimeWaitSeqCheck" -Value 1
Part of the Keep Dreaming Project