stitch-proj

Some tools for building a Translator BigKG.
This software project is experimental and under active development.


Installation

From PyPI

pip install stitch-proj

For development

pip install "stitch-proj[dev]"

From source

git clone https://github.com/Translator-CATRAX/stitch-proj.git
cd stitch-proj
pip install -e ".[dev]"

Overview

There are two primary intended users of stitch-proj:

  1. Ingester: a developer who wants to ingest the Babel concept identifier normalization database into a local SQLite database.

  2. Querier: a developer building an application (e.g., a BigKG build system) who wants to programmatically query a local Babel SQLite database.


Package Structure

This project uses a src/ layout:

src/
  stitch_proj/
    ingest_babel.py
    local_babel.py
    row_counts.py
    stitchutils.py

Import the package as:

import stitch_proj

Tools

  • stitch_proj.ingest_babel: Downloads and ingests the Babel database into a local SQLite database.

  • stitch_proj.local_babel: Provides functions for querying a local Babel SQLite database.

  • stitch_proj.row_counts: Prints table row counts for a local Babel SQLite database.


Running the Ingest

After installation, the console script is available:

ingest-babel --help

Or invoke via module:

python -m stitch_proj.ingest_babel --help

A full ingest requires:

  • CPython 3.12
  • At least 32 GiB RAM
  • ~600 GiB temporary disk space
  • ~200 GiB for the final SQLite database

A full ingest may take 30–40 hours depending on hardware.
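
Before starting a run of that length, it may be worth confirming that the machine meets the requirements listed above. The following is a minimal sketch using only the standard library; it is not part of stitch-proj. The function name preflight and its parameters are hypothetical, and it makes the conservative assumption that the temporary files and the final database live on the same filesystem, so the two space figures are added together. It does not check RAM, since the standard library has no portable way to do so.

```python
import shutil
import sys

GIB = 1024 ** 3


def preflight(
    work_dir: str = ".",
    temp_gib: int = 600,
    final_gib: int = 200,
    required_python: tuple[int, int] = (3, 12),
) -> list[str]:
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []

    # The README lists CPython 3.12 as the interpreter for a full ingest.
    if sys.implementation.name != "cpython":
        problems.append(f"interpreter is {sys.implementation.name}, not cpython")
    if tuple(sys.version_info[:2]) != tuple(required_python):
        found = ".".join(str(part) for part in sys.version_info[:2])
        wanted = ".".join(str(part) for part in required_python)
        problems.append(f"python version is {found}, expected {wanted}")

    # Assumption: temporary space and the final database share one filesystem.
    needed_bytes = (temp_gib + final_gib) * GIB
    free_bytes = shutil.disk_usage(work_dir).free
    if free_bytes < needed_bytes:
        problems.append(
            f"free disk space is {free_bytes / GIB:.1f} GiB, "
            f"need about {needed_bytes / GIB:.0f} GiB"
        )
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("WARNING:", problem)
```

If the temporary directory and the output database are on different volumes, calling the function once per volume with the matching figure (and the other set to 0) gives a more accurate answer.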


Downloading a Pre-Built Babel Database

A pre-built SQLite file is available from S3:

https://rtx-kg2-public.s3.us-west-2.amazonaws.com/babel-20250331-p1.sqlite

Save it to a local path such as:

db/babel.sqlite
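
The download can also be scripted. The sketch below uses only the Python standard library and is not part of stitch-proj; the function name download and the ".part" temporary suffix are illustrative choices. It streams the response to disk in chunks rather than reading it into memory, on the assumption that the file is large (the ingest section cites roughly 200 GiB for a finished database), and it renames the file into place only after the transfer completes so that an interrupted download is not mistaken for a valid database.

```python
import shutil
import urllib.request
from pathlib import Path

# URL taken from the README above.
BABEL_URL = "https://rtx-kg2-public.s3.us-west-2.amazonaws.com/babel-20250331-p1.sqlite"


def download(url: str, dest: str, chunk_bytes: int = 16 * 1024 * 1024) -> Path:
    """Stream url to dest in chunks and return the destination path."""
    dest_path = Path(dest)
    dest_path.parent.mkdir(parents=True, exist_ok=True)

    # Write to a sibling ".part" file first (hypothetical naming convention).
    part_path = dest_path.with_name(dest_path.name + ".part")
    with urllib.request.urlopen(url) as response, open(part_path, "wb") as out:
        shutil.copyfileobj(response, out, length=chunk_bytes)

    # Rename only after the whole body has been written.
    part_path.replace(dest_path)
    return dest_path


# Example call (not executed here, since it starts a very large transfer):
#     download(BABEL_URL, "db/babel.sqlite")
```

This sketch does not resume partial transfers; for a file of this size, a tool with resume support may be a better fit.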

You can then use stitch_proj.local_babel to query it.
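
As a quick sanity check on a downloaded file, the database can also be inspected with Python's built-in sqlite3 module. The sketch below is not the stitch_proj.local_babel API and is not how stitch_proj.row_counts is implemented; the function names open_read_only and table_row_counts are hypothetical. It opens the file read-only so that the database cannot be modified by accident, and it discovers table names from sqlite_master rather than assuming any particular schema.

```python
import sqlite3
from pathlib import Path


def open_read_only(db_path: str) -> sqlite3.Connection:
    """Open an existing SQLite file in read-only mode via a file: URI."""
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


def table_row_counts(conn: sqlite3.Connection) -> dict[str, int]:
    """Return a mapping of table name to row count for every user table."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        )
    ]
    # Table names come from the database itself, so they are quoted, not bound.
    return {
        name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
        for name in names
    }


if __name__ == "__main__":
    db_file = Path("db/babel.sqlite")  # path suggested in the README
    if db_file.exists():
        connection = open_read_only(str(db_file))
        for table, count in table_row_counts(connection).items():
            print(f"{table}\t{count}")
        connection.close()
```

On a database of this size, COUNT(*) over every table may take a while, since SQLite has to scan each table or one of its indexes.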


Running Tests

Ensure a valid babel.sqlite file exists locally, then run:

pytest -v

Some tests require internet connectivity.


Systems Tested

ingest_babel.py has been tested on:

  • Ubuntu 24.04 (x86_64, Intel Xeon)
  • Ubuntu 24.04 (ARM64, AWS Graviton3)
  • macOS 14 (Apple Silicon)

The package is pure Python and platform-independent, but large ingests require substantial memory and storage.


Development Workflow

Install the development dependencies first:

pip install -e ".[dev]"

Then run the tests, linter, and type checker:

pytest
ruff check .
mypy src

License

MIT License. See LICENSE.


Citation

Please see the Babel project's CITATION.cff:

https://github.com/TranslatorSRI/Babel/blob/master/CITATION.cff
