stitch-proj
Some tools for building a Translator BigKG.
This software project is experimental and under active development.
Installation
From PyPI
pip install stitch-proj
For development
pip install "stitch-proj[dev]"
From source
git clone https://github.com/Translator-CATRAX/stitch-proj.git
cd stitch-proj
pip install -e ".[dev]"
Overview
There are two primary intended users of stitch-proj:
- Ingester: A developer who wants to ingest the Babel concept identifier normalization database into a local SQLite database.
- Querier: A developer building an application (e.g., a BigKG build system) who wants to programmatically query a local Babel SQLite database.
Package Structure
This project uses a src/ layout:
src/
stitch_proj/
ingest_babel.py
local_babel.py
row_counts.py
stitchutils.py
Import the package as:
import stitch_proj
Tools
- stitch_proj.ingest_babel: Downloads and ingests the Babel database into a local SQLite database.
- stitch_proj.local_babel: Provides functions for querying a local Babel SQLite database.
- stitch_proj.row_counts: Prints table row counts for a local Babel SQLite database.
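To illustrate the kind of report a row-count tool produces, here is a minimal sketch that uses only the standard-library sqlite3 module. It is not the implementation of stitch_proj.row_counts; the helper names open_read_only and table_row_counts are hypothetical and exist only for this example.

```python
import sqlite3
from pathlib import Path


def open_read_only(path: str) -> sqlite3.Connection:
    """Open an existing SQLite file read-only so it cannot be modified by accident."""
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


def table_row_counts(conn: sqlite3.Connection) -> dict[str, int]:
    """Return {table_name: row_count} for every user table in the database."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
        # Skip SQLite's internal bookkeeping tables.
        if not row[0].startswith("sqlite_")
    ]
    counts = {}
    for name in names:
        # Table names cannot be bound as SQL parameters, so quote the identifier.
        quoted = '"' + name.replace('"', '""') + '"'
        counts[name] = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
    return counts
```

For example, table_row_counts(open_read_only("db/babel.sqlite")) returns a dictionary that can be printed one table per line. Counting rows scans each table, so on a very large database this may take a while.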
Running the Ingest
After installation, the console script is available:
ingest-babel --help
Or invoke via module:
python -m stitch_proj.ingest_babel --help
A full ingest requires:
- CPython 3.12
- At least 32 GiB RAM
- ~600 GiB temporary disk space
- ~200 GiB for the final SQLite database
A full ingest may take 30–40 hours depending on hardware.
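Before committing to a run of that length, it may be worth checking the interpreter and free disk space up front. The following is a sketch under stated assumptions: the function names are hypothetical, the 600 GiB and 200 GiB thresholds are taken from the list above, and RAM is not checked because the standard library has no portable call for it.

```python
import platform
import shutil
import sys

GIB = 1024 ** 3


def disk_shortfall_gib(path: str, required_gib: float) -> float:
    """Return how many GiB short the filesystem holding `path` is (0.0 if enough is free)."""
    free_gib = shutil.disk_usage(path).free / GIB
    return max(0.0, required_gib - free_gib)


def preflight_problems(temp_dir: str, output_dir: str) -> list[str]:
    """Collect human-readable problems; an empty list means these checks passed."""
    problems = []
    implementation = platform.python_implementation()
    if implementation != "CPython" or sys.version_info[:2] != (3, 12):
        problems.append(
            f"expected CPython 3.12, found {implementation} {platform.python_version()}"
        )
    # Thresholds assumed from the requirements list: ~600 GiB temporary, ~200 GiB final.
    for label, path, needed_gib in (
        ("temporary space", temp_dir, 600),
        ("final database", output_dir, 200),
    ):
        short = disk_shortfall_gib(path, needed_gib)
        if short > 0:
            problems.append(f"{label} at {path!r} is about {short:.0f} GiB short")
    return problems
```

Each directory is checked independently, so if the temporary and output directories share a filesystem, the free space is counted once for each requirement rather than against their sum.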
Downloading a Pre-Built Babel Database
A pre-built SQLite file is available from S3:
https://rtx-kg2-public.s3.us-west-2.amazonaws.com/babel-20250331-p1.sqlite
Save it to a local path such as:
db/babel.sqlite
You can then use stitch_proj.local_babel to query it.
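The download step can be scripted with the standard library alone. This is a minimal sketch, not part of stitch-proj: download_file is a hypothetical helper, and the destination db/babel.sqlite is just the example path from above. It streams the response in chunks, so the file is never held in memory, and writes to a .part file that is renamed only after the transfer completes.

```python
import shutil
import urllib.request
from pathlib import Path

BABEL_URL = "https://rtx-kg2-public.s3.us-west-2.amazonaws.com/babel-20250331-p1.sqlite"


def download_file(url: str, dest: str, chunk_bytes: int = 16 * 1024 * 1024) -> Path:
    """Stream `url` to `dest` in chunks, via a temporary .part file."""
    dest_path = Path(dest)
    dest_path.parent.mkdir(parents=True, exist_ok=True)
    part_path = dest_path.with_name(dest_path.name + ".part")
    with urllib.request.urlopen(url) as response, open(part_path, "wb") as out:
        shutil.copyfileobj(response, out, length=chunk_bytes)
    # Rename only after a complete transfer, so a partial file is never mistaken for the database.
    part_path.replace(dest_path)
    return dest_path
```

Calling download_file(BABEL_URL, "db/babel.sqlite") would fetch the pre-built database. The sketch does not resume interrupted transfers; since the file may be very large (the ingest requirements mention roughly 200 GiB for the final database), a download tool with resume support may be a better fit in practice.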
Running Tests
Ensure a valid babel.sqlite file exists locally, then run:
pytest -v
Some tests require internet connectivity.
Systems Tested
ingest_babel.py has been tested on:
- Ubuntu 24.04 (x86_64, Intel Xeon)
- Ubuntu 24.04 (ARM64, AWS Graviton3)
- macOS 14 (Apple Silicon)
The package is pure Python and platform-independent, but large ingests require substantial memory and storage.
Development Workflow
Run tests, linting, and type checking with:
pytest
ruff check .
mypy src
These commands require the development dependencies, which can be installed with:
pip install -e ".[dev]"
License
MIT License. See LICENSE.
Citation
Please see the Babel project's CITATION.cff:
https://github.com/TranslatorSRI/Babel/blob/master/CITATION.cff