Michael Bommarito

Username: mjbommar

23 projects

charstreamer

Fast Rust/PyO3 semantic text segmentation

shuflr-client

Python client for shuflr — HTTP NDJSON and shuflr-wire/1 binary transports

kernel-lore-mcp

MCP server exposing fast, structured search over lore.kernel.org to LLM developer tools.

folio-python

Python library for FOLIO, the Federated Open Legal Information Ontology

folio-mcp

MCP server for FOLIO, the Federated Open Legal Information Ontology

alea-llm-client

ALEA LLM client abstraction library for Python

alea-markdown

Convert HTML files to Markdown with configurable options (pure Python)

llm-detector

Transparent, probabilistic classification of text as human-generated or LLM-generated

npm-vuln-scanner

Detect compromised npm packages from the September 2025 supply chain attack

logillm

A generic, high-performance, low-dependency LLM programming framework inspired by dspy

pyenvsearch

Python library navigation and AI-powered analysis tool for developers and AI agents. Combines traditional code search with LLM-powered package insights.

nupunkt-rs

High-performance Rust implementation of nupunkt sentence/paragraph tokenization

nupunkt

Next-generation Punkt sentence and paragraph boundary detection with zero dependencies

cheesecloth

High-performance text metrics and filtering for large-scale corpora and pretrain curation

charboundary

Fast character-based boundary detection for sentences and paragraphs

kl3m-data-client

Client for interacting with KL3M data stored in S3 with JSON output support

alea-data-generator

ALEA low-level data generation techniques (procedural, KL3M)

soli-python

Python library for SOLI, the Standard for Open Legal Information

alea-preprocess

Efficient, accessible preprocessing routines for pretrain, SFT, and DPO training data preparation from the ALEA Institute.

alea-dublincore

ALEA Dublin Core Metadata library with zero dependencies

alea-data-resources

ALEA data resources library

soli-data-generator

Python library for SOLI data generation

rfcorr

Random Forest-inspired correlation/dependence methods
