Analyze multiple files for similarity and/or uniqueness.
Project description
runner
Analyze multiple files for similarity and/or uniqueness. A work assembled from parts of one or more other works, so that it appears unique, can be hard to detect because the copying strategies vary; software such as runner makes this detection easier.
Requirements
- Arkivist
pip install arkivist
Usage
Install the latest abero package. Upcoming versions might introduce unannounced changes, so set up a virtual environment before installing.
pip install -U abero
To integrate abero into your Python code, see the snippet below:
import abero
directory = "<path_to_files>"  # full path of the directory containing the files to analyze
abero.analyze(directory, extension="txt", threshold=80, template=None, skipnames=0, group=0, unzip=0, reset=0)
CLI Usage
# usage: runner [-h] -d directory [-e extension] [-c control] [-t threshold] [-u unzip] [-s skipnames] [-g group] [-r reset]
py runner.py -d "<path_to_files>" -e "txt" -c "<path_to_control_file>" -t 1 -u 1 -s 1 -r 1
-d <path>
- Full path of the directory containing the files to analyze.
-e <txt>
- List of allowed file extensions to analyze.
-c <*.txt>
- Full path of the control file.
-t <80>
- Threshold level for uniqueness; similarity below the threshold is treated as unique (1-100; default = 0)
-u <0>
- Unzip/extract ZIP files (0-1; default = 0)
-s <0>
- Skip files with common names (0-1; default = 0)
-g <1>
- Only compare files that contain the same identifier (0-1; default = 1)
Example: student1*_set1*.py >> student2*_set1*.py
-r <0>
- Reset analytics before execution (0-1; default = 0)
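The threshold rule above can be illustrated with a minimal sketch. It assumes a similarity score computed with Python's standard difflib module; this is a stand-in for illustration only, since the source does not say how the package measures similarity, and the function names similarity_percent and is_unique are hypothetical.

```python
from difflib import SequenceMatcher


def similarity_percent(text_a: str, text_b: str) -> float:
    """Return a 0-100 similarity score (difflib ratio scaled to percent).

    Assumption: difflib is used here only for illustration; it is not
    confirmed to be the measure the package itself uses.
    """
    return SequenceMatcher(None, text_a, text_b).ratio() * 100


def is_unique(text_a: str, text_b: str, threshold: float = 80) -> bool:
    """Treat a pair as unique when its similarity falls below the threshold."""
    return similarity_percent(text_a, text_b) < threshold


if __name__ == "__main__":
    # Identical texts score 100 percent, so they are not unique at threshold 80.
    print(is_unique("print('hello')", "print('hello')", threshold=80))
    # Texts with no characters in common score 0 percent, so they count as unique.
    print(is_unique("aaaa", "zzzz", threshold=80))
```

With this rule, raising the threshold makes more pairs count as unique, because a pair has to be more similar before it stops being treated as unique.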
Control File
The control file contains words or phrases, checked line by line, that are deemed allowed to appear in all files being analyzed; if they are found in the test files, they will not be flagged as duplicate work.
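One way to picture the control file is as a filter applied before comparison. The sketch below assumes whole-line matching after trimming whitespace; the package may match words or phrases differently, and the function name strip_allowed_lines is hypothetical.

```python
def strip_allowed_lines(text: str, control_text: str) -> str:
    """Drop lines from text whose trimmed content appears in the control text.

    Assumption: a line is allowed only when it equals a control line exactly
    after trimming whitespace. Empty control lines are ignored.
    """
    allowed = {line.strip() for line in control_text.splitlines() if line.strip()}
    kept = [line for line in text.splitlines() if line.strip() not in allowed]
    return "\n".join(kept)


if __name__ == "__main__":
    control = "import os\n# Homework 1\n"
    submission = "import os\nprint('hi')\n# Homework 1"
    # Only the line that is not listed in the control text remains.
    print(strip_allowed_lines(submission, control))
```

Filtering shared boilerplate first means that two files containing the same permitted header or template lines are compared only on the content their authors wrote.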
Features
- Unzip feature
- File comparison
- Threshold levels
- Skip / group compare
- Diff tool, content viewer
Did you know?
The repository name abero was inspired by the words aberrant and abero (Latin), which may mean deviating or being absent.
Download files
File details
Details for the file abero-1.0.4.tar.gz.
File metadata
- Download URL: abero-1.0.4.tar.gz
- Upload date:
- Size: 124.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/3.7.3 pkginfo/1.7.0 requests/2.24.0 requests-toolbelt/0.9.1 tqdm/4.59.0 CPython/3.9.2
File hashes
Algorithm | Hash digest
---|---
SHA256 | 893f6596adf8153a68561e89f8d4323347cfc1076b6a265c2b9e9e963a8988ea
MD5 | 33c302fdbd99a526e5870945a648e257
BLAKE2b-256 | 9e9f640dce15cfba63cad6282d37b36876d1717ac524bb8e6092bc5bdd9600e6
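A downloaded archive can be checked against the SHA256 digest listed above. The following is a minimal sketch using Python's standard hashlib module; the helper names sha256_of_file and matches_expected are hypothetical, and the local file path is an assumption about where the download was saved.

```python
import hashlib

# SHA256 digest listed above for abero-1.0.4.tar.gz.
EXPECTED_SDIST_SHA256 = "893f6596adf8153a68561e89f8d4323347cfc1076b6a265c2b9e9e963a8988ea"


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hexadecimal SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected: str = EXPECTED_SDIST_SHA256) -> bool:
    """Compare a file's SHA256 digest with the expected hexadecimal string."""
    return sha256_of_file(path) == expected.lower()
```

For example, matches_expected("abero-1.0.4.tar.gz") would return True only if the file in the current directory has the digest shown in the table.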
File details
Details for the file abero-1.0.4-py3-none-any.whl.
File metadata
- Download URL: abero-1.0.4-py3-none-any.whl
- Upload date:
- Size: 7.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/3.7.3 pkginfo/1.7.0 requests/2.24.0 requests-toolbelt/0.9.1 tqdm/4.59.0 CPython/3.9.2
File hashes
Algorithm | Hash digest
---|---
SHA256 | 16cc13dae7a90ff2758352786a27ed23c5cee70f322e847d46bb48b9f0a0e43a
MD5 | 3bf4fb754d194b8e481e07ac9acec11c
BLAKE2b-256 | ac12b7bfee2cf897360a3c242373a7901e2365ac2d08771b10551de690dc35ae