A powerful file organizer and ebook manager for e-book metadata retrieval and renaming with ease

Project description


forgy

forgy is a powerful file organizer and e-book manager with a command-line interface for reliable retrieval of e-book metadata and easy renaming of PDF e-books.

With forgy, you can automatically extract valid ISBNs from many PDF e-books, get metadata for e-books using the extracted ISBNs, rename 'unknown' books using the retrieved metadata, organize a messy file collection into folders according to file format, and much more. This project arose from the need to reliably rename e-books with their correct titles while keeping them organized on a computer, without installing and depending on bloated software with a busy interface.

The goal is to easily create and maintain a decent personal PDF e-book library, especially when identifying PDF e-books by their names becomes difficult. The name forgy is from the project's roots as a file organizer in Python.

Note: Development and testing were done on a Windows 10 PC with Python 3.12 installed, in such a way as to ensure platform independence. Feel free to try forgy out on other platforms.

Table of Contents


Installation

  1. Verify that you have Python installed on your computer.

    Open the Windows command prompt (Windows key, type cmd, press Enter) and check your Python version by typing python --version and pressing Enter. You should see your Python version, which in this case is 3.12.

    If you don't have Python installed, you can download it from python.org.

  2. Install forgy directly from PyPI.

    python -m pip install forgy
    

    This installation includes the forgy public APIs and its command-line interface. You can also add forgy>=0.1.0 to your requirements.txt to install forgy as a dependency of your project.
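Both prerequisites above (a working Python and an installed forgy distribution) can also be checked from Python itself. The sketch below uses only the standard library; the helper name `installed_version` is ours and not part of forgy, and it only reports what is installed without asserting any minimum version.

```python
import sys
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def python_version() -> str:
    """Return the running interpreter's version as 'major.minor.micro'."""
    return ".".join(str(part) for part in sys.version_info[:3])


# Example: print(python_version(), installed_version("forgy") or "forgy not installed")
```

If `installed_version("forgy")` returns None, the pip command above has not been run for this interpreter.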

    🔝 Back to Table of Contents

Usage

forgy can be used via its CLI or by importing and calling its public APIs directly. The CLI currently has more documentation and is therefore recommended. This section assumes that you have installed forgy via pip as explained earlier.

  1. Check whether the command-line tool is properly installed on your computer. When you enter forgy at the command line without a sub-command, you should see the Namespace object from the parser.

    Namespace(subcommands=None)
    Please provide a valid subcommand
    

    If you see the above, the forgy CLI is accessible from the command prompt. If not, you may need to add the Python Scripts directory to your PATH to enable execution of the CLI.

  2. To view the help page and see all sub-commands available in forgy, pass the -h (help) argument to forgy.

    forgy -h

    Sample output:

    usage: forgy [-h] [--version]
              {get_metadata,get_isbns_from_texts,get_single_metadata,organize_extension,get_files_from_dir,copy_directory_contents,move_directories,delete_files_directories}
              ...
    
    A powerful file organizer, ebook manager, and book metadata extractor in python
    
    options:
      -h, --help            show this help message and exit
      --version             show program's version number and exit
    
    forgy Operations:
      Valid subcommands
    
      {get_metadata,get_isbns_from_texts,get_single_metadata,organize_extension,get_files_from_dir,copy_directory_contents,move_directories,delete_files_directories}
     get_metadata    retrieve PDF e-book metadata and rename several PDF e-books with it
     get_isbns_from_texts
                         extract isbns from several PDF e-books contained in source_directory
     get_single_metadata
                         get metatada for a single book using file path and title or isbn
     organize_extension  organize files by extension or format
     get_files_from_dir  aggregate pdf files from various directories/sources
     copy_directory_contents
                            copy contents of source directory into destination directory (files and directories included)
     move_directories    move directories to another destination
     delete_files_directories
                             delete files or directories in source directory. WARNING: permanent operation!
    
    

Welcome to forgy v0.1.0!
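If the forgy command is not found, the PATH check described in step 1 can be done from Python. This is a minimal sketch using the standard library only; the function names are ours. Note that `sysconfig.get_path("scripts")` reports the Scripts directory of the interpreter running the code, which may differ from where pip placed the command for a per-user install.

```python
import shutil
import sysconfig
from typing import Optional


def find_cli(name: str = "forgy") -> Optional[str]:
    """Return the full path of the executable if it is on PATH, else None."""
    return shutil.which(name)


def scripts_directory() -> str:
    """Return the directory where this interpreter installs console scripts."""
    return sysconfig.get_path("scripts")


# Example:
# if find_cli() is None:
#     print("forgy is not on PATH; consider adding:", scripts_directory())
```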


From the above, there are eight major sub-commands you can use to carry out various operations on your files and directories. These include:

  • get_metadata
  • get_isbns_from_texts
  • get_single_metadata
  • organize_extension
  • get_files_from_dir
  • copy_directory_contents
  • move_directories
  • delete_files_directories

The functions of the above sub-commands are as stated in the command-line help shown earlier. You can view the usage of any sub-command with: forgy sub-command --help.
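To make the idea behind a sub-command such as organize_extension concrete, here is a small sketch of grouping files by extension. It is not forgy's implementation; the function name and the "no_extension" bucket are assumptions made for illustration, and the sketch only builds a plan without moving any file.

```python
from collections import defaultdict
from pathlib import Path
from typing import Dict, Iterable, List


def group_by_extension(paths: Iterable[Path]) -> Dict[str, List[Path]]:
    """Map each lower-cased extension (without the dot) to the files that have it."""
    groups: Dict[str, List[Path]] = defaultdict(list)
    for path in paths:
        # Files without a suffix go into a hypothetical "no_extension" bucket.
        extension = path.suffix.lower().lstrip(".") or "no_extension"
        groups[extension].append(path)
    return dict(groups)
```

Each key of the returned dictionary could then serve as the name of a destination folder for the files listed under it.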

Note that the get_metadata sub-command accepts an optional Google Books API key. It is built on two freely available book APIs: Google Books and Open Library.

The Open Library API is free, with a per-IP limit on requests per minute to enforce responsible usage. The Google Books API, on the other hand, has a default quota of about 1000 free API calls per month per IP, which can theoretically be increased via the Google Cloud console.

To avoid overwhelming a single API and to gain access to more book metadata, providing a Google Books API key is recommended. With a key, forgy randomly selects between the two APIs for metadata retrieval.
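The random selection between the two services can be pictured with the sketch below. It is an illustration under stated assumptions, not forgy's code: the function name, the fallback to Open Library when no key is given, and the choice of endpoints (the public Google Books volumes search and the Open Library books endpoint) are ours. The function only builds a request URL and performs no network call.

```python
import random
from typing import Optional
from urllib.parse import urlencode

GOOGLE_BOOKS_URL = "https://www.googleapis.com/books/v1/volumes"
OPEN_LIBRARY_URL = "https://openlibrary.org/api/books"


def build_lookup_url(isbn: str,
                     google_api_key: Optional[str] = None,
                     rng: Optional[random.Random] = None) -> str:
    """Pick one of the two book APIs at random and return a lookup URL for the ISBN."""
    rng = rng or random.Random()
    # Assumption: without a key, only Open Library is a candidate.
    candidates = ["openlibrary"]
    if google_api_key:
        candidates.append("google")
    if rng.choice(candidates) == "google":
        query = urlencode({"q": f"isbn:{isbn}", "key": google_api_key})
        return f"{GOOGLE_BOOKS_URL}?{query}"
    query = urlencode({"bibkeys": f"ISBN:{isbn}", "format": "json", "jscmd": "data"})
    return f"{OPEN_LIBRARY_URL}?{query}"
```

Passing a seeded `random.Random` makes the choice reproducible, which is convenient when testing.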

A Google Books API key can be obtained via the Google Cloud Console.

On the home page:
Select an existing project or create a new one (right beside the Google Console logo) > New Project > Create > Left-hand menu > APIs and Services > Credentials > Create Credentials > API Key (the API key is created and displayed in a dialog box; copy it for use) > Close dialog > API key (optional) > API Restrictions > Restrict key > Google Cloud APIs > OK

🔝 Back to Table of Contents

Example

Task: Extract all valid ISBNs from all PDF books located in a directory

Using forgy CLI (recommended)

First, we view the command-line help to identify a sub-command for ISBN extraction. Looking at the sample output in the Usage section, get_isbns_from_texts is the sub-command that extracts ISBNs from several PDF e-books contained in source_directory. For the sake of simplicity, we keep all PDF e-books inside one folder and then view the help page for the get_isbns_from_texts sub-command to understand how to use it.

forgy get_isbns_from_texts -h

Sample output:

usage: forgy get_isbns_from_texts [-h] [--isbn_text_filename ISBN_TEXT_FILENAME] source_directory destination_directory

Extract valid ISBNs from PDF files as a dictionary with filenames as keys and valid ISBNs as a list of values

positional arguments:
  source_directory      provide source directory for input pdf files
  destination_directory
                        provide destination for text file containing book titles and extracted isbns
options:
  -h, --help            show this help message and exit
  --isbn_text_filename ISBN_TEXT_FILENAME
                        provide name of text file containing extracted e-book isbns

The usage of the sub-command is shown on the first line of the help screen above. Only the two positional arguments (source_directory and destination_directory) are mandatory, while the name of the text file to contain the extracted valid ISBNs is optional (the default name is extracted_isbns.txt).

The source_directory contains the PDF files to extract ISBNs from, and the destination_directory is the location on your computer where the file containing the extracted ISBNs is saved. The output is a text file containing file names as keys and extracted valid ISBNs as a list of values.
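What makes an extracted ISBN "valid" is its check digit. The sketch below shows the standard ISBN-10 and ISBN-13 checksum rules applied to candidates found in a piece of text. It is not forgy's implementation: the regular expression and function names are assumptions for illustration, and the pattern is deliberately simple (it can miss or merge candidates in unusual layouts).

```python
import re
from typing import List

# Hypothetical candidate pattern: a digit, then digits/spaces/hyphens, then a digit or X.
CANDIDATE = re.compile(r"\b\d[\d -]{8,15}[\dXx]\b")


def is_valid_isbn10(code: str) -> bool:
    """ISBN-10: weights 10..1, 'X' means 10 in the last position, sum divisible by 11."""
    if not re.fullmatch(r"\d{9}[\dX]", code):
        return False
    total = sum((10 - i) * (10 if ch == "X" else int(ch)) for i, ch in enumerate(code))
    return total % 11 == 0


def is_valid_isbn13(code: str) -> bool:
    """ISBN-13: alternating weights 1 and 3, sum divisible by 10."""
    if not re.fullmatch(r"\d{13}", code):
        return False
    total = sum(int(ch) * (3 if i % 2 else 1) for i, ch in enumerate(code))
    return total % 10 == 0


def extract_valid_isbns(text: str) -> List[str]:
    """Return the distinct valid ISBNs found in text, in order of appearance."""
    found: List[str] = []
    for match in CANDIDATE.finditer(text):
        code = re.sub(r"[ -]", "", match.group()).upper()
        if (is_valid_isbn10(code) or is_valid_isbn13(code)) and code not in found:
            found.append(code)
    return found
```

A dictionary of the kind described above could then be built by mapping each file name to the list returned for that file's text.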

The following command extracts ISBNs from the PDF files in source-directory into a text file in destination-directory, with both directories located on the user's desktop:

forgy get_isbns_from_texts C:\Users\User-name\Desktop\source-directory C:\Users\User-name\Desktop\destination-directory

Once you press the enter key, ISBN extraction from all PDF files in C:\Users\User-name\Desktop\source-directory takes place.
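The same command can be assembled from a Python script, which avoids retyping long Windows paths. The sketch below only builds the argument list, using the sub-command and option names shown in the help output above; the helper name `build_isbn_command` is ours. Running it is left to `subprocess.run`, shown in the trailing comment, and requires forgy to be on PATH.

```python
from pathlib import Path
from typing import List, Optional


def build_isbn_command(source: Path,
                       destination: Path,
                       isbn_text_filename: Optional[str] = None) -> List[str]:
    """Build the argument list for 'forgy get_isbns_from_texts'."""
    command = ["forgy", "get_isbns_from_texts"]
    if isbn_text_filename:
        # Optional flag, placed before the positionals as in the usage line.
        command += ["--isbn_text_filename", isbn_text_filename]
    command += [str(source), str(destination)]
    return command


# Example (needs forgy installed and on PATH):
# import subprocess
# desktop = Path.home() / "Desktop"
# subprocess.run(build_isbn_command(desktop / "source-directory",
#                                   desktop / "destination-directory"), check=True)
```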


🔝 Back to Table of Contents

Using forgy public APIs

  1. Import the get_isbns_from_texts function to execute the current task, and pathlib.Path from the Python standard library to properly handle the paths to the source and destination directories.
    >>> from pathlib import Path
    >>> from forgy.messyforg import get_isbns_from_texts
    
  2. Define the source and destination directories.
    >>> source_directory = Path(r'C:\Users\USER-NAME\Desktop\SOURCE-DIRECTORY')
    >>> txt_destination_dir = Path(r'C:\Users\USER-NAME\Desktop')
    
  3. Get ISBNs from all PDF e-books in the source directory by calling the imported function.
    >>> get_isbns_from_texts(source_directory, txt_destination_dir)
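Before calling the function on a large folder, it can help to preview which files would be considered. The sketch below lists the PDF files sitting directly in a directory; it is a stand-alone helper of ours, and whether forgy itself scans sub-directories or matches extensions case-insensitively is not stated here, so treat both as assumptions.

```python
from pathlib import Path
from typing import List


def list_pdfs(source_directory: Path) -> List[Path]:
    """Return PDF files directly inside source_directory, sorted by name (case-insensitive)."""
    pdfs = [
        entry for entry in source_directory.iterdir()
        if entry.is_file() and entry.suffix.lower() == ".pdf"
    ]
    return sorted(pdfs, key=lambda entry: entry.name.lower())
```

For example, `len(list_pdfs(source_directory))` gives a quick count of the files in scope before extraction starts.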
    

Note: API documentation for forgy is still in progress. The CLI is better documented at this point and is therefore recommended. Feel free to explore forgy internals. In the next section, you will learn how to set up forgy locally on your computer and explore the workings of its modules and the public APIs within them.

🔝 Back to Table of Contents


Setting up forgy locally

  1. Verify that you have Python installed on your computer.

    Open the Windows command prompt (Windows key, type cmd, press Enter) and check your Python version by typing python --version and pressing Enter. You should see your Python version, which in this case is 3.12.

    If you don't have Python installed, you can download it from python.org.

  2. Navigate to the directory where you want to keep the cloned forgy repository.

    To download into desktop directory, use the change directory command as shown below.

    cd desktop
    

    Alternatively, you can create a directory to contain cloned forgy using mkdir new_directory_name at the command prompt.

  3. Clone the repository.

    You need Git installed to clone a repository on Windows. If you don't already use Git for version control, you can download Git for Windows from git-scm.com and install it. Then open Git Bash, navigate to the destination directory for the cloned forgy repository (desktop in this case), and clone the repository using the git clone command as shown below.

    cd desktop
    
    git clone https://github.com/misterola/forgy.git
    

  4. Re-open the Windows command prompt and navigate to the project root directory (desktop/forgy). You may use the command prompt from this point on.

    cd forgy
    

  5. Create a virtual environment.

    python -m venv venv
    

  6. Activate the virtual environment.

    You should see '(venv)' in front of your current path in the command prompt after activating the virtual environment.

    venv\Scripts\activate
    

  7. Install dependencies.

    python -m pip install -r requirements.txt
    

  8. You can leave the virtual environment at any point by running deactivate at the command prompt.


  9. Navigate to the cli package in the src directory. The main.py module contains the CLI logic.

    cd src/cli
    

  10. View the help page to see all available sub-commands.

    python -m main -h
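If a step fails with a missing dependency, a common cause is that the virtual environment from step 6 is not active in the current prompt. The small check below, which uses only the standard library and a function name of our choosing, reports whether the running interpreter belongs to a virtual environment created with venv.

```python
import sys


def in_virtual_environment() -> bool:
    """True when the interpreter runs inside a venv (prefix differs from the base prefix)."""
    return sys.prefix != sys.base_prefix


# Example: print("venv active" if in_virtual_environment() else "venv not active")
```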

🔝 Back to Table of Contents

License

GNU Affero General Public License (AGPL-3.0)

🔝 Back to Table of Contents

Dependencies


🔝 Back to Table of Contents

TODO

  • More testing
  • Enable extraction of metadata without modifying e-book titles
  • More refactoring
  • Create a simple, minimal GUI

