Similar to locate, but looks at the content

smartlocate - Intelligent File Indexer

smartlocate is a tool for Linux that uses YOLO and other AI tools to detect objects in images and describe them. It builds a database of detected objects, image descriptions, text contents and so on, and makes them searchable. The data is stored locally in an SQLite database, which lets you search it efficiently, for example for specific objects in images.

If the parameter --ocr is set while indexing, all images are also run through OCR and the recognized text becomes searchable. You can set the OCR language with --lang_ocr, for example --lang_ocr tr. The default is ["de", "en"].

If the parameter --describe is set while indexing, the model Salesforce/blip-image-captioning-large is used to generate image descriptions automatically, which can then be searched as well.

Screenshots

Indexing

This shows the indexing process, with --face_recognition enabled. This means it asks for a name the first time a face is shown, but later on, it detects it automatically and can associate the face with a name, making it easily searchable.

[Screenshot: indexing]

Face recognition while indexing

While indexing, with --face_recognition, faces are recognized. If the face cannot be automatically determined, it will ask you for the name of the person. For later images, this person will (most probably) be automatically detected again without any intervention.

[Screenshot: face recognition]

If you don't want to sit through a long indexing run waiting for prompts, you can run smartlocate with --dont_ask_new_faces. This skips images in which a person is found but cannot be identified. That way, you can run it over a whole folder overnight without manual intervention, then run it again without that option once it's done, so that you are asked about all new faces in one go instead of waiting for long stretches between prompts.
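
The two-pass workflow described above can be sketched as a small Python helper that builds both command lines. This is only an illustration: the function names and the example directory are hypothetical, and combining --face_recognition with --dont_ask_new_faces in one call is an assumption based on the description above, not something this README spells out.

```python
import shlex


def build_index_command(directory, ask_new_faces=True):
    """Return the argument list for one face-recognition indexing pass."""
    cmd = ["smartlocate", "--index", "--dir", directory, "--face_recognition"]
    if not ask_new_faces:
        # Unattended pass: images with unknown faces are skipped for now.
        cmd.append("--dont_ask_new_faces")
    return cmd


def two_pass_plan(directory):
    """Unattended pass first, then an interactive pass for the remaining faces."""
    return [
        build_index_command(directory, ask_new_faces=False),
        build_index_command(directory, ask_new_faces=True),
    ]


# Hypothetical example directory.
plan = two_pass_plan("/home/user/images")
for cmd in plan:
    print(shlex.join(cmd))
```

To actually run the passes, each list could be handed to subprocess.run(cmd, check=True), with the first pass running overnight and the second one when you are at the keyboard.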

Searching

Images of cats and dogs

These images were not manually labelled. Those labels were found by AI!

[Screenshot: search results for "dog"]

[Screenshot: search results for "cat"]

Searching through Documents

This is a search on OCR'ed documents.

[Screenshot: OCR search]

Features

  • Easy to install and use.
  • Object detection in images using YOLO.
  • OCR is done via easyocr when --ocr is set during indexing. You can use % as a wildcard when searching.
  • QR code detection and indexing.
  • Documents are converted with pandoc. Allowed document types are: ['.doc', '.docx', '.pptx', '.ppt', '.odp', '.odt', '.md', '.txt', '.pdf']. Use --documents while indexing for finding documents.
  • Stores detected objects in a local SQLite database (~/.smartlocate_db).
  • Fast searching for specific objects in images.
  • Supports Sixel graphics for visualizing results.
  • Automatic face recognition (use --face_recognition while indexing). It will ask you (hopefully only once) per person what their name is, so it can recognize them later on automatically. You only have to label a person once (or a few times, when the images are VERY different), and after being labelled once, it will auto-detect them in other images as well.
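
The % wildcard mentioned in the features follows the convention of the SQL LIKE operator. The following is a minimal sketch of how such a wildcard behaves in SQLite, assuming the search maps onto a LIKE query. That mapping is not confirmed by this README, and the table and column names are hypothetical, not smartlocate's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; smartlocate's real schema may look different.
conn.execute("CREATE TABLE ocr_text (file_path TEXT, content TEXT)")
conn.executemany(
    "INSERT INTO ocr_text VALUES (?, ?)",
    [
        ("/scans/invoice_01.png", "Invoice 2024-001 for cat food"),
        ("/scans/letter.png", "Dear Sir or Madam"),
        ("/scans/invoice_02.png", "Invoice 2024-002 for dog toys"),
    ],
)


def search(pattern):
    """Return the paths whose text matches a LIKE pattern (% = any characters)."""
    rows = conn.execute(
        "SELECT file_path FROM ocr_text WHERE content LIKE ? ORDER BY file_path",
        (pattern,),
    )
    return [row[0] for row in rows]


print(search("Invoice%"))  # text starting with "Invoice"
print(search("%dog%"))     # text containing "dog" anywhere
```

Without a wildcard, LIKE has to match the whole stored text, which is why a pattern such as %dog% is needed to find a word in the middle of a document.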

Installation

Get latest official release

This will get the latest officially released version from PyPI.

python3 -m venv ~/smartlocate/
source ~/smartlocate/bin/activate
pip3 install smartlocate

Run latest version

  1. Clone the repository:
     git clone --depth 1 https://github.com/NormanTUD/smartlocate.git
  2. Navigate to the directory and run the following commands to install the tool:
     cd smartlocate
     ./smartlocate --index --dir ~/Pictures

smartlocate will automatically install all necessary dependencies, and YOLO is already included. This is done on first execution, which may take some time. But this only has to be done once!

Usage

Indexing Images

To index images in a specific directory, run the following command:

smartlocate --dir /path/to/images --index

YOLO is used to detect objects in images. With --describe, an image description AI is used as well, and with --documents, pandoc is used to index documents. All results are stored in the database.

You need to re-run the index every time images are added or changed.

Searching for Objects

To search for a specific object (e.g., "cat"), run the following command:

smartlocate cat

The tool will search the indexed images for the object and display the results.

Options

  • --index: Indexes images in the specified directory.
  • --size SIZE: Specifies the size to which images should be resized when indexing. Default is 400.
  • --dir DIR: Specifies the directory to search or index.
  • --debug: Enables debug mode to output detailed logs.
  • --no_sixel: Hide Sixel graphics.
  • --qrcodes: Enables indexing of QR codes; when searching, searches only QR codes.
  • --describe: Also saves AI-generated image descriptions and makes them searchable.
  • --exact: Searches for exactly what was entered, without splitting the query.
  • --ocr: Enables OCR.
  • --documents: Enables document indexing.
  • --lang_ocr: OCR languages, default: de, en. Accepts multiple languages.
  • --delete_non_existing_files: Deletes non-existing files from the database.
  • --shuffle_index: Shuffles the list of files before indexing.
  • --model MODEL: Specifies the YOLO model for object detection.
  • --threshold THRESHOLD: Sets the confidence threshold for object detection (0-1).
  • --dbfile DBFILE: Specifies the path to the SQLite database file.
  • --exclude PATH: Excludes a path from indexing/searching. Can be used multiple times.
  • --dont_ask_new_faces: Don't ask for new faces (useful for automatically tagging all photos that can be tagged automatically).
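
To make the semantics of a few of these options concrete, here is a hypothetical argparse sketch. It is not smartlocate's actual parser; it only models the documented behavior that --exclude can be given several times, that --lang_ocr takes multiple languages (assumed here to be space-separated), that --size defaults to 400 and that --threshold lies between 0 and 1.

```python
import argparse


def unit_interval(text):
    """Parse a float and reject values outside the range 0 to 1."""
    value = float(text)
    if not 0.0 <= value <= 1.0:
        raise argparse.ArgumentTypeError("threshold must be between 0 and 1")
    return value


def build_parser():
    # Hypothetical parser for illustration only.
    parser = argparse.ArgumentParser(prog="smartlocate-sketch")
    parser.add_argument("search", nargs="*", help="search terms")
    parser.add_argument("--index", action="store_true")
    parser.add_argument("--dir")
    parser.add_argument("--size", type=int, default=400)
    parser.add_argument("--lang_ocr", nargs="+", default=["de", "en"])
    # action="append" collects one entry per occurrence of --exclude.
    parser.add_argument("--exclude", action="append", default=[])
    parser.add_argument("--threshold", type=unit_interval)
    return parser


args = build_parser().parse_args([
    "--index", "--dir", "/home/user/images",
    "--lang_ocr", "tr", "en",
    "--exclude", "/home/user/images/tmp",
    "--exclude", "/home/user/images/cache",
])
print(args.size, args.lang_ocr, args.exclude)
```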

Example Commands

Indexing images in a directory:

smartlocate --dir /home/user/images --index

Search for images containing the object "cat":

smartlocate cat

Indexing with YOLO, description and OCR:

smartlocate --dir /home/user/images --index --describe --ocr

Database

The results of image indexing are stored in the SQLite database ~/.smartlocate_db. This database contains information about detected objects in the images. The index must be re-run whenever new images are added or changes are made.
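
Since the database is a regular SQLite file, it can be inspected with Python's standard sqlite3 module. The sketch below only lists the table names and opens the file read-only, so it makes no assumptions about smartlocate's schema. The helper name list_tables is hypothetical, and the path assumes the default location mentioned above.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def list_tables(db_path):
    """Return the names of all tables in an SQLite file, opened read-only."""
    uri = Path(db_path).expanduser().resolve().as_uri() + "?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]


# Default database location according to this README.
db_file = Path("~/.smartlocate_db").expanduser()
if db_file.is_file():
    print(list_tables(db_file))
```

Opening the file with mode=ro avoids accidental changes to the index while you explore it.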

Manage single images

Simply run smartlocate /path/to/an/image/file.jpg to see an overview of the image file's data and modify it.

Requirements

  • Python 3.x
  • All Python dependencies are installed automatically when the tool is first run.

Ideas

One future idea is to extend this kind of content search to formats other than images. Imagine you could say:

smartlocate "text about cats"

and get all .txt, .md, .docx, .tex and similar files that say something about cats. Currently, documents are only searchable via full-text search.

The same goes for video and audio files. If you want to work on this, feel free to contribute!

License

Licensed under GPL2.
