Lightweight code search for your own projects.

These details have been verified by PyPI

Maintainers

TNick

These details have not been verified by PyPI

Development Status
- 3 - Alpha
License
- Other/Proprietary License
Operating System
- OS Independent
Programming Language
- Python :: 3 :: Only
Typing
- Typed

Project description

find-stuff

Lightweight code search for your own projects.

The library scans your folders for Git repositories, indexes the files you choose (by extension), and lets you search quickly from the terminal or from Python. It uses a simple SQLite database with an inverted index built from tokens found in your files. The CLI is friendly, the internals are small and typed, and everything runs locally.

What it is good for

- Fast grep-like queries across many repos without opening an editor
- Exact or regex term matching, case-sensitive or not
- “All terms” vs “Any term” logic
- Limiting results and filtering by file extensions at search time

Install

The steps below are written for beginners. They show how to:

- Install Python
- Create a private “virtual environment”
- Get the project from GitHub
- Install it into your environment and run it

You only need to do this once on your computer. After that, you can just activate the environment and use the tool.

1) Install Python (version 3.11 or newer)

Windows:
- Go to the official Python website: https://www.python.org/downloads/
- Download “Python 3.x” for Windows and run the installer.
- Important: On the first screen, check the box “Add Python to PATH”, then click Install.
- After install, open PowerShell and type:
```
python --version
```
You should see something like Python 3.11.8 (any 3.11+ is fine).

macOS:
- Visit https://www.python.org/downloads/ and install the latest 3.x for macOS.
- Open Terminal and type python3 --version to confirm.

Linux (Ubuntu/Debian):

Open Terminal and run:

sudo apt update && sudo apt install -y python3 python3-venv python3-pip

Confirm with: python3 --version

2) Create a virtual environment (keeps things clean)

Pick a folder where you want to keep this project (for example, D:\tools\find-stuff on Windows or ~/tools/find-stuff on macOS/Linux). Then:

Windows PowerShell:

python -m venv .venv
. .venv\Scripts\Activate.ps1

macOS/Linux Terminal:

python3 -m venv .venv
source .venv/bin/activate

If activation worked, your prompt will show (.venv) at the start. While this is active, anything you install stays private to this folder.

3) Get the project from GitHub

If you have Git installed, you can clone the repository. If not, you can click the green “Code” button on GitHub and download the ZIP, then unzip it into your chosen folder.

Using Git (recommended):

git clone https://github.com/pyl1b/find-stuff.git
cd find-stuff

4) Install the tool into your environment

With the virtual environment still active and inside the find-stuff folder, run:

Windows PowerShell:

python -m pip install --upgrade pip
python -m pip install -e .

macOS/Linux:

python3 -m pip install --upgrade pip
python3 -m pip install -e .

This installs the library and the find-stuff command.

5) Try it out

Show the help to confirm it’s installed:

find-stuff --help

Later, when you come back to use the tool again, just re-activate the environment (step 2) and you’re ready.

TL;DR

Build the index under a root folder, choosing the extensions you care about:

find-stuff rebuild-index D:\code --db D:\code\.find_stuff\index.sqlite3 --ext py --ext md

Search it:

find-stuff search --db D:\code\.find_stuff\index.sqlite3 foo bar

CLI

All commands share logging flags: --debug/--no-debug, --trace/--no-trace, and --log-file to redirect logs. Version is available via --version.

rebuild-index

Recreate the database from scratch by scanning for Git repositories under a root and indexing tracked files of the given extensions.

# Index only Python files under the root directory
find-stuff rebuild-index D:\projects --ext py

# Index multiple extensions, writing DB to a custom path
find-stuff rebuild-index D:\work --db D:\work\.find_stuff\index.sqlite3 --ext py --ext md --ext txt
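
As a rough illustration of what "scanning for Git repositories and keeping tracked files of the given extensions" can look like, here is a minimal sketch. The helper names (`find_git_repos`, `tracked_files`, `filter_by_extension`) are hypothetical and are not part of find-stuff's API; skipping nested repositories and comparing extensions case-insensitively are assumptions, and the real implementation may behave differently.

```python
import os
import subprocess
from pathlib import Path
from typing import Iterable


def find_git_repos(root: Path) -> list[Path]:
    """Return directories under ``root`` that contain a ``.git`` entry.

    Hypothetical helper for illustration; not find-stuff's actual code.
    """
    repos: list[Path] = []
    for dirpath, dirnames, filenames in os.walk(root):
        # A ``.git`` directory (or file, as used by worktrees) marks a repo.
        if ".git" in dirnames or ".git" in filenames:
            repos.append(Path(dirpath))
            # Assumption: do not descend further once a repository is found.
            dirnames[:] = []
    return sorted(repos)


def tracked_files(repo: Path) -> list[str]:
    """Ask Git for the tracked files of ``repo`` (paths relative to it)."""
    out = subprocess.run(
        ["git", "-C", str(repo), "ls-files", "-z"],
        capture_output=True,
        check=True,
    ).stdout
    # ``-z`` separates entries with NUL bytes, so odd file names survive.
    return [p.decode("utf-8", errors="replace") for p in out.split(b"\0") if p]


def filter_by_extension(paths: Iterable[str], exts: Iterable[str]) -> list[str]:
    """Keep only paths whose extension is in ``exts`` (e.g. ``("py", "md")``)."""
    wanted = {e.lower().lstrip(".") for e in exts}
    return [p for p in paths if Path(p).suffix.lower().lstrip(".") in wanted]
```

With Git on PATH, `filter_by_extension(tracked_files(repo), ("py", "md"))` would give the candidate files for one repository.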

add-to-index

Append newly found repositories and files without wiping existing data.

# Add new repos under the same root into an existing DB
find-stuff add-to-index D:\work --db D:\work\.find_stuff\index.sqlite3 --ext py

search

Query the index for files containing terms. By default, a result must contain all terms. Use --any to match if any term is present. Use --regex to interpret terms as regular expressions. --case-sensitive controls case sensitivity. Use --limit to cap results and --ext to filter by extension at search time.

# Require all terms (default)
find-stuff search --db D:\work\.find_stuff\index.sqlite3 foo bar

# Match if any term is present
find-stuff search --db D:\work\.find_stuff\index.sqlite3 --any foo bar

# Regex search, case-insensitive
find-stuff search --db D:\work\.find_stuff\index.sqlite3 --regex --ignore-case "class(Name)?"

# Restrict results to Markdown files
find-stuff search --db D:\work\.find_stuff\index.sqlite3 --ext md token

# Limit to top 10 hits
find-stuff search --db D:\work\.find_stuff\index.sqlite3 --limit 10 http

Output format:

<score>\t<absolute-path>

Where score is the number of matched postings (occurrences) contributing to the match.
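
If you want to post-process the results in a script, the tab-separated format is easy to parse. The sketch below assumes the score is an integer count, as described above; `parse_search_output` is a hypothetical helper and the sample paths are made up.

```python
from pathlib import Path


def parse_search_output(text: str) -> list[tuple[int, Path]]:
    """Parse ``<score>\\t<absolute-path>`` lines into (score, path) pairs."""
    results: list[tuple[int, Path]] = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Split on the first tab only, so a path containing tabs survives.
        score, _, path = line.partition("\t")
        results.append((int(score), Path(path)))
    return results


# Made-up sample output, for illustration only.
sample = "7\t/home/me/code/app/main.py\n2\t/home/me/code/app/README.md\n"
for score, path in parse_search_output(sample):
    print(score, path.name)
```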

browse

Interactively navigate indexed repositories, their directories, and files. The browser is built on InquirerPy, with fuzzy filtering and colored prompts.

find-stuff browse --db D:\work\.find_stuff\index.sqlite3

Options:

--color/--no-color: enable/disable colored text in prompts and output

Controls:

- Up/Down: move selection
- Type to filter: fuzzy match across items
- Enter: select item
- Change repository: switch to a different repo
- Open this directory in VS Code: launches code in current folder
- Back from file view: return from file details
- Quit: exit the browser

When you select a file, the tool shows database metadata and whether the file has been modified (mtime and hash comparison). Times are printed in a human‑readable local format (YYYY‑MM‑DD HH:MM:SS).
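
A "modified" check based on mtime and hash could be sketched as follows. This is only an illustration: the README does not say which hash algorithm is used or how the two checks are combined, so SHA-256 and the "mtime first, hash second" order are assumptions, and all function names are hypothetical.

```python
import hashlib
from datetime import datetime
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex digest of the file's bytes (SHA-256 is an assumption)."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def is_modified(path: Path, stored_mtime: float, stored_hash: str) -> bool:
    """Compare a file on disk with the values recorded at index time."""
    if path.stat().st_mtime == stored_mtime:
        # Same modification time: treat as unchanged without hashing.
        return False
    # The mtime moved; only a different hash counts as a real change.
    return sha256_of(path) != stored_hash


def format_local(timestamp: float) -> str:
    """Render a POSIX timestamp as local time, YYYY-MM-DD HH:MM:SS."""
    return datetime.fromtimestamp(timestamp).strftime("%Y-%m-%d %H:%M:%S")
```

Checking the mtime first avoids reading the whole file in the common case where nothing has changed.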

Library usage

You can also use the Python API:

from pathlib import Path
from find_stuff.indexing import rebuild_index, add_to_index, search_files

# Raw strings keep backslashes literal, so write them once, not doubled.
root = Path(r"D:\work")
db = Path(r"D:\work\.find_stuff\index.sqlite3")

rebuild_index(root, db, file_types=("py", "md"))

results = search_files(
    db,
    ["alpha", "beta"],
    require_all_terms=True,
    regex=False,
    case_sensitive=False,
)

for path, score in results:
    print(score, path)

Database

This project keeps a compact SQLite database that behaves like a local card catalog for your code. Each table captures a different aspect of “where did we look” and “what did we find.” The shape is intentionally simple, so you can inspect it with any SQLite browser.

Table: repositories

Think of this as the shelf registry. Each row represents a Git repository discovered under your chosen root folder. It remembers only the absolute path of that repository’s root. When you scan again, the tool checks this registry to avoid duplicating shelves. There’s an internal numeric label for each shelf, used by other tables to say “this file came from that shelf.”

Table: files

This is the card catalog of individual files. For every file that Git tracks and that matches your chosen extensions, we remember a few simple things: which shelf it belongs to, the neat little path it has inside that shelf so you can find it again, and the full location on disk that points straight to the file. Together, these columns say “this precise file, from that repository, lives here.” The catalog ensures that the same file isn’t listed twice within the same repository.

Table: tokens

Imagine a dictionary of every distinct word-like fragment we encountered across all indexed files. Each entry keeps the exact spelling it had when we saw it, along with a quiet, lowercased twin that helps us match things without worrying about capitalization. The important part is that every different fragment appears only once in this dictionary and gets its own stable identifier for quick cross-referencing.

Table: postings

This is the map of where each fragment shows up. For a given file and a given fragment from the dictionary, we keep a precise spot where it appears: which line it’s on and where it starts on that line. If the same fragment appears multiple times in one file, each spot is recorded separately. By tying together file, fragment, and position, this table is what lets searches be fast and precise. Internally, the combination of file, fragment, line and column uniquely identifies each occurrence.

Table: metadata

This is a tiny drawer for housekeeping notes. It stores small labeled values that describe the index itself, such as configuration details or versioning information if needed in the future. It’s intentionally minimal and meant for tool-level notes rather than content.
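
To make the five tables more concrete, here is one possible shape for them, created in an in-memory SQLite database. The table names come from the descriptions above, but every column name and constraint below is a guess made for illustration; inspect a real index file to see the actual schema.

```python
import sqlite3

# Hypothetical schema: table names follow the README, column names are guesses.
SCHEMA = """
CREATE TABLE repositories (
    id   INTEGER PRIMARY KEY,
    root TEXT NOT NULL UNIQUE
);
CREATE TABLE files (
    id       INTEGER PRIMARY KEY,
    repo_id  INTEGER NOT NULL REFERENCES repositories(id),
    rel_path TEXT NOT NULL,
    abs_path TEXT NOT NULL,
    UNIQUE (repo_id, rel_path)
);
CREATE TABLE tokens (
    id          INTEGER PRIMARY KEY,
    token       TEXT NOT NULL UNIQUE,
    token_lower TEXT NOT NULL
);
CREATE TABLE postings (
    file_id  INTEGER NOT NULL REFERENCES files(id),
    token_id INTEGER NOT NULL REFERENCES tokens(id),
    line     INTEGER NOT NULL,
    col      INTEGER NOT NULL,
    PRIMARY KEY (file_id, token_id, line, col)
);
CREATE TABLE metadata (
    name  TEXT PRIMARY KEY,
    value TEXT
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Made-up sample data: one repo, one file, one token seen twice.
conn.execute("INSERT INTO repositories (root) VALUES ('/code/app')")
conn.execute(
    "INSERT INTO files (repo_id, rel_path, abs_path) "
    "VALUES (1, 'main.py', '/code/app/main.py')"
)
conn.execute("INSERT INTO tokens (token, token_lower) VALUES ('Foo', 'foo')")
conn.executemany("INSERT INTO postings VALUES (1, 1, ?, ?)", [(3, 4), (9, 0)])

# Case-insensitive lookup: count occurrences of 'foo' per file.
rows = conn.execute(
    "SELECT f.abs_path, COUNT(*) FROM postings p "
    "JOIN files f ON f.id = p.file_id "
    "JOIN tokens t ON t.id = p.token_id "
    "WHERE t.token_lower = 'foo' GROUP BY f.id"
).fetchall()
print(rows)
```

The composite primary key on postings mirrors the statement that file, fragment, line, and column together identify each occurrence.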

Indexing and re-scanning

Building the index is a walk through your chosen root, looking for Git repositories by spotting their .git markers. Each repository is treated as its own island. For each island, the tool asks Git which files are actually tracked, and then keeps only the ones whose extensions you selected. Every file is read as UTF‑8 with decoding errors ignored, then broken into simple word-like tokens: sequences that look like names in code or natural text. For every token we encounter, we jot down where it occurred in the file by line and column. The token dictionary is expanded as needed, and each occurrence is stored in the postings map.

When you run a full rebuild, the previous database is discarded and a fresh one is created, so the catalog reflects exactly what you scanned. When you add to the index, the process is gentler: already-known repositories are skipped, and only new ones are appended so your catalog grows without being wiped.
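
The tokenization step described above can be sketched in a few lines. The regular expression here is an assumption (an identifier-style pattern); the tool's actual definition of a "word-like" token may differ, as may its line and column numbering. `tokenize` is a hypothetical name.

```python
import re
from typing import Iterator

# Assumption: a "word-like" token is an identifier-style run of characters.
TOKEN_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def tokenize(text: str) -> Iterator[tuple[str, int, int]]:
    """Yield (token, line, column); lines are 1-based, columns 0-based."""
    for line_no, line in enumerate(text.splitlines(), start=1):
        for match in TOKEN_RE.finditer(line):
            yield match.group(), line_no, match.start()


source = "def add(a, b):\n    return a + b\n"
for token, line, col in tokenize(source):
    print(token, line, col)
```

For a file on disk, the text could be obtained with `path.read_text(encoding="utf-8", errors="ignore")`, which matches the "UTF‑8 with errors ignored" behavior described above.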

How search works

Searching starts by translating your terms into entries in the token dictionary. If you request exact matching, the translation is a straight look‑up, either in original form or in the lowercased variant when you prefer to ignore case. If you switch to regular expressions, the system takes a quick stroll through the dictionary and keeps the entries that satisfy your pattern, using your case preference. Once terms are resolved to token entries, the search narrows down files. If you asked for all terms, it first finds the set of files that contain the first term and keeps trimming that set by checking the others, ending with only the files that have every term. If you asked for any term, it simply collects all files that have at least one of them. Finally, it counts how many relevant occurrences contribute to each file and orders results from most to least evidence. Optional filters, such as limiting to specific extensions, are applied near the end so you can fine‑tune the list without rebuilding the index.
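
The flow above can be mimicked with a toy in-memory index. This is a sketch under stated assumptions, not the tool's code: the data is invented, the names `resolve` and `search` are hypothetical, regex terms are matched with `re.search` (the tool might anchor them differently), and ties are broken alphabetically.

```python
import re
from collections import Counter

# Toy postings: token -> list of (file, line, column). Invented data.
POSTINGS: dict[str, list[tuple[str, int, int]]] = {
    "Alpha": [("a.py", 1, 0), ("b.py", 4, 2)],
    "alpha": [("a.py", 7, 4)],
    "beta": [("a.py", 2, 0), ("c.py", 1, 0)],
}


def resolve(term: str, regex: bool, case_sensitive: bool) -> set[str]:
    """Map one search term to the matching dictionary tokens."""
    if regex:
        flags = 0 if case_sensitive else re.IGNORECASE
        pattern = re.compile(term, flags)
        return {tok for tok in POSTINGS if pattern.search(tok)}
    if case_sensitive:
        return {tok for tok in POSTINGS if tok == term}
    return {tok for tok in POSTINGS if tok.lower() == term.lower()}


def search(
    terms: list[str],
    require_all: bool = True,
    regex: bool = False,
    case_sensitive: bool = False,
) -> list[tuple[str, int]]:
    """Return (file, score) pairs, highest score first."""
    per_term_hits: list[Counter[str]] = []
    for term in terms:
        hits: Counter[str] = Counter()
        for tok in resolve(term, regex, case_sensitive):
            for file, _line, _col in POSTINGS[tok]:
                hits[file] += 1  # one point per occurrence
        per_term_hits.append(hits)
    if not per_term_hits:
        return []
    # Start from the first term's files, then intersect (all) or union (any).
    files = set(per_term_hits[0])
    for hits in per_term_hits[1:]:
        files = files & set(hits) if require_all else files | set(hits)
    scores = {f: sum(h[f] for h in per_term_hits) for f in files}
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


print(search(["alpha", "beta"]))
print(search(["alpha", "beta"], require_all=False))
```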

Developing

This project is small on purpose and aims for a pleasant contributor experience.

Requirements

- Python 3.11+
- Git available on PATH for real-world runs (tests mock it)

Setup

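python -m venv .venv
. .venv/Scripts/Activate.ps1   # on Windows PowerShell
source .venv/bin/activate      # on macOS/Linux
pip install -e ".[dev]"        # quotes keep shells such as zsh from expanding the brackets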

Common tasks

# Format
make format

# Lint
make lint

# Tests (type-check + pytest)
make test

# Fix simple lint issues automatically
make delint

The CLI entry point is find_stuff.__main__:cli and can be invoked as:

python -m find_stuff --help

Project conventions

- Typed code, small modules, clear names
- Prefer stdlib and a minimal set of dependencies
- Follow ruff formatting and linting configuration in pyproject.toml
- Keep public APIs stable; if you change them, update CHANGELOG.md

Release

On your local machine, build the package and check it:

pip install build twine
python -m build
twine check dist/*

In CHANGELOG.md, rename the ## [Unreleased] heading to the new version, commit the change, then create a tag and push it to GitHub:

git add .
git commit -m "Release version 0.1.0"

git tag -a v0.1.0 -m "Release version 0.1.0"

git push origin v0.1.0
# or
git push origin --tags

On the GitHub repository page, create a new Release. This triggers the workflow that publishes the package to PyPI.

License

BSD-3-Clause

Release history

This version

0.1.4

Sep 18, 2025

Download files

Source Distribution

find_stuff-0.1.4.tar.gz (41.2 kB view details)

Uploaded Sep 18, 2025 Source

Built Distribution

find_stuff-0.1.4-py3-none-any.whl (34.0 kB view details)

Uploaded Sep 18, 2025 Python 3

File details

Details for the file find_stuff-0.1.4.tar.gz.

File metadata

Download URL: find_stuff-0.1.4.tar.gz
Upload date: Sep 18, 2025
Size: 41.2 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for find_stuff-0.1.4.tar.gz
Algorithm	Hash digest
SHA256	`d9626ee80cb43a18868e388ddf07080dbd3e878c6fd8cfe5fb1fb41bc74a8589`
MD5	`b1c5633c6e53eb89afe531ed98f62357`
BLAKE2b-256	`cb0ef29068424bae269ff976da3d99411cd92d7f632045d3b2a48050151bf27e`

Provenance

The following attestation bundles were made for find_stuff-0.1.4.tar.gz:

Publisher: publish.yml on pyl1b/find-stuff

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: find_stuff-0.1.4.tar.gz
- Subject digest: d9626ee80cb43a18868e388ddf07080dbd3e878c6fd8cfe5fb1fb41bc74a8589
- Sigstore transparency entry: 533354210
- Sigstore integration time: Sep 18, 2025
Source repository:
- Permalink: pyl1b/find-stuff@9abdd5b48ea6a32372b775db28f651926c34c360
- Branch / Tag: refs/tags/v0.1.4
- Owner: https://github.com/pyl1b
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@9abdd5b48ea6a32372b775db28f651926c34c360
- Trigger Event: release

File details

Details for the file find_stuff-0.1.4-py3-none-any.whl.

File metadata

Download URL: find_stuff-0.1.4-py3-none-any.whl
Upload date: Sep 18, 2025
Size: 34.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for find_stuff-0.1.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`f38d1b49beff2a956e8f5395d82972eba8ab3266ff01c2f44ef1c3e7ad2588d6`
MD5	`5c413d5c273685d9fb9c3a61e957b9c3`
BLAKE2b-256	`5f23c961cb0e9a7ae35c960c6564484756137efa8e4380f4060f61ddd973c908`

Provenance

The following attestation bundles were made for find_stuff-0.1.4-py3-none-any.whl:

Publisher: publish.yml on pyl1b/find-stuff

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: find_stuff-0.1.4-py3-none-any.whl
- Subject digest: f38d1b49beff2a956e8f5395d82972eba8ab3266ff01c2f44ef1c3e7ad2588d6
- Sigstore transparency entry: 533354225
- Sigstore integration time: Sep 18, 2025
Source repository:
- Permalink: pyl1b/find-stuff@9abdd5b48ea6a32372b775db28f651926c34c360
- Branch / Tag: refs/tags/v0.1.4
- Owner: https://github.com/pyl1b
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@9abdd5b48ea6a32372b775db28f651926c34c360
- Trigger Event: release
