Scansort

Scansort helps to collate and rename book scan images

Installation

pip install scansort

Synopsis

scansort [-h] [-v]
         -odd ODD -even EVEN [-missing MISSING]
         [-action {move,copy}] [-o OUTPUT] workdir

Description

When using a book-edge scanner (such as Plustek OpticBook), it is handy to scan two sides of a book separately. This way you do not need to rotate the book to scan the next page. Scanned images from different sides normally make their way into separate directories.

Scansort helps to collate these directories and rename the images according to the actual page numbers.

The utility assumes that:

  • The collection of images covers a monotonically increasing range of page numbers (with known missing numbers possible). This implies that front-, body-, and (possibly) back-matter must be scanned and processed separately.
  • Even- and odd-numbered pages are put into separate directories.
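
The two assumptions above are enough to assign page numbers mechanically. The following is a minimal sketch of that idea, not Scansort's actual code: the function name `assign_pages` and the `first_page` parameter are hypothetical, and the file lists are assumed to be already sorted in scanning order.

```python
from itertools import count


def assign_pages(odd_files, even_files, missing=(), first_page=1):
    """Pair each scan with a page number, skipping known missing pages.

    Hypothetical helper: assumes both lists are in scanning order and that
    numbering starts at `first_page`.
    """
    missing = set(missing)
    # Increasing page numbers available to each side, minus the missing ones.
    odd_pages = (p for p in count(first_page) if p % 2 == 1 and p not in missing)
    even_pages = (p for p in count(first_page) if p % 2 == 0 and p not in missing)
    # zip() stops as soon as the (finite) file list is exhausted.
    pairs = list(zip(odd_files, odd_pages)) + list(zip(even_files, even_pages))
    # Interleave the two sides by ordering on the page number.
    return sorted(pairs, key=lambda pair: pair[1])


mapping = assign_pages(
    ["lside/scan0001.tiff", "lside/scan0002.tiff",
     "lside/scan0003.tiff", "lside/scan0004.tiff"],
    ["rside/scan0001.tiff", "rside/scan0002.tiff"],
    missing=[2, 6],
)
for path, page in mapping:
    print(f"{path}: {page}")
```

With pages 2 and 6 marked as missing, the first two even-side scans are assigned pages 4 and 8.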

See also the example of the intended workflow below.

Options

The workdir argument defines the working directory; all other directory names and paths are interpreted relative to it. By default the current directory is used.

All page numbers must correspond to the actual "physical" page numbering in the book.

-odd, -even directory name/path
Source directories with scanned images of odd- and even-numbered pages.

-missing num[,num]*
Comma-separated list of page numbers missing in the source directories (either accidentally skipped during scanning or not present at all).

-action {move,copy}
Whether to delete (move) or preserve (copy) the original images in the source directories. Defaults to copy.

-o directory name/path
Output directory for the renamed scanned images. Defaults to out and will be created automatically if it does not exist.

-h, --help
Show a help message and exit.

-v, --version
Show version information and exit.
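
To make the -missing, -action and -o semantics concrete, here is a rough sketch of how such options could be handled with the standard library. It is an illustration under assumptions, not Scansort's implementation: `parse_missing` and `transfer` are hypothetical names.

```python
import shutil
from pathlib import Path


def parse_missing(value):
    """Turn a 'num[,num]*' string such as '2,6' into a sorted list of ints."""
    return sorted({int(part) for part in value.split(",") if part.strip()})


def transfer(src, dst, action="copy"):
    """Copy or move one image, creating the output directory if needed."""
    dst = Path(dst)
    # Mirrors the documented -o behaviour: create the directory when absent.
    dst.parent.mkdir(parents=True, exist_ok=True)
    if action == "copy":
        shutil.copy2(src, dst)          # the original stays in place
    elif action == "move":
        shutil.move(str(src), str(dst))  # the original is removed
    else:
        raise ValueError(f"unknown action: {action!r}")
    return dst


print(parse_missing("2,6"))
```

Keeping copy as the default means a mistake in the page mapping never destroys the original scans.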

Example

After scanning a book I am normally left with something like this:

$ tree ./book
./book
├── lside
│   ├── scan0001.tiff
│   ├── scan0002.tiff
│     ...
└── rside
    ├── scan0001.tiff
    ├── scan0002.tiff
      ...

2 directories, 198 files

where rside contains even-numbered pages. Suppose I skimmed through the directories and realised I missed two pages: 2 and 6.

Then I run scansort to collate the directories:

$ scansort -odd lside -even rside -missing 2,6 ./book

The utility opens an editor to review the result:

# Please review the correspondence between files and book pages
'./book/lside/scan0001.tiff':   1
'./book/lside/scan0002.tiff':   3
'./book/rside/scan0001.tiff':   4
'./book/lside/scan0003.tiff':   5
'./book/lside/scan0004.tiff':   7
'./book/rside/scan0002.tiff':   8
...

I can edit the page numbers right away or remove all lines to cancel the operation (e.g. if it turns out there are more pages missing). Then I save and close the editor and the pages are collated:

$ tree ./book/out
./book/out
├── scan0001.tif
├── scan0003.tif
├── scan0004.tif
├── scan0005.tif
├── scan0007.tif
    ...
└── scan0200.tif

0 directories, 198 files

Note that the missing page numbers are omitted. I can then scan those pages separately and put them in place.
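
The review listing shown in this example has a simple line format: a quoted path, a colon, and a page number, with `#` lines as comments. A minimal sketch of reading it back might look like the following; `parse_review` is a hypothetical name, and treating an empty result as "cancel" is an assumption based on the description above.

```python
import ast


def parse_review(text):
    """Parse lines like "'./book/lside/scan0001.tiff':   1" into a dict."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and '#' comments.
        if not line or line.startswith("#"):
            continue
        # Split on the last colon so colons inside the path are preserved.
        quoted_path, _, page = line.rpartition(":")
        # literal_eval safely unquotes the path string.
        mapping[ast.literal_eval(quoted_path.strip())] = int(page)
    return mapping


listing = """\
# Please review the correspondence between files and book pages
'./book/lside/scan0001.tiff':   1
'./book/lside/scan0002.tiff':   3
'./book/rside/scan0001.tiff':   4
"""
pages = parse_review(listing)
if not pages:
    print("nothing to do: operation cancelled")
else:
    for path, page in pages.items():
        print(page, path)
```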
