Functions for prototyping, quality of life (QOL), and sanity checking
Grimmerie
A spellbook for Python.
Grimmerie is a collection of high-level utilities (“spells”) designed for rapid prototyping, sanity checking, and reducing friction in experimentation.
Each spell performs a non-trivial amount of work under the hood.
Spells intentionally trade fine-grained control for speed, clarity, and momentum.
Use them when you want to move fast.
Understand them before you rely on them.
Installation
pip install grimmerie
The Idea
Instead of wiring together pipelines every time, Grimmerie gives you:
- One function call
- Sensible defaults
- Heavy lifting handled internally
The philosophy in one line:
embeddings = specterize(papers)
Behind this single call:
- Model loading
- Tokenization
- Batching
- Device handling
- Adapter loading
- Output formatting
All handled for you.
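To make one of the hidden steps concrete, here is a minimal sketch of batching using only the standard library. The helper name `batched` and the batch size are illustrative assumptions; this is not Grimmerie's internal code, only the general shape of the step.

```python
from collections.abc import Iterable, Iterator
from itertools import islice


def batched(items: Iterable[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive chunks of at most `size` items (hypothetical helper)."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(items)
    # islice pulls up to `size` items per pass; an empty list ends the loop.
    while batch := list(islice(it, size)):
        yield batch


batches = list(batched(["a", "b", "c", "d", "e"], 2))
```

A model would then be run once per batch instead of once per document, which keeps memory use bounded for large inputs.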
Spells
specterize
Generate SPECTER2 embeddings from text or paper-like inputs.
from grimmerie import specterize
papers = [
    {'abstract': 'We introduce a new language representation model called BERT'},
    {'abstract': 'The dominant sequence transduction models are based on neural networks'},
]
embeddings = specterize(papers, return_type='numpy')
tfidfize
Generate TF-IDF representations from text with optional preprocessing.
from grimmerie import tfidfize
docs = [
    {'abstract': 'We introduce a new language representation model called BERT'},
    {'abstract': 'The dominant sequence transduction models are based on neural networks'},
]
X = tfidfize(docs, return_type='array')
tfidfize(
    input_data,
    lemmatize: bool = False,
    spacy_model: str = 'en_core_web_sm',
    batch_size: int = 2000,
    n_process: int = 1,
    progress_interval: int | None = None,
    min_df: int | float = 1,
    max_df: int | float = 1.0,
    stop_words: str | list[str] | None = 'english',
    ngram_range: tuple[int, int] = (1, 1),
    lowercase: bool = True,
    max_features: int | None = None,
    norm: Literal['l1', 'l2'] | None = 'l2',
    use_idf: bool = True,
    smooth_idf: bool = True,
    sublinear_tf: bool = False,
    return_type: Literal['sparse', 'array', 'list', 'frame'] = 'sparse',
    return_vectorizer: bool = False,
    vectorizer: TfidfVectorizer | None = None,
)
Parameters:
- lemmatize: Apply lemmatization (default False)
- spacy_model: spaCy model for lemmatization (default 'en_core_web_sm')
- batch_size: Processing batch size (default 2000)
- n_process: Number of processes (default 1)
- progress_interval: Progress reporting interval
- min_df: Minimum document frequency (default 1)
- max_df: Maximum document frequency (default 1.0)
- stop_words: Stop words to filter (default 'english')
- ngram_range: N-gram range (default (1, 1))
- lowercase: Convert to lowercase (default True)
- max_features: Maximum vocabulary size
- norm: Normalization method (default 'l2')
- use_idf: Enable IDF weighting (default True)
- smooth_idf: Smooth IDF values (default True)
- sublinear_tf: Apply sublinear TF scaling (default False)
- return_type: Output format (default 'sparse')
- return_vectorizer: Return fitted vectorizer (default False)
- vectorizer: Pre-fitted TfidfVectorizer instance
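To show what the default weighting options mean numerically, here is a rough pure-Python sketch of smoothed, L2-normalized TF-IDF. It assumes the parameters follow scikit-learn's TfidfVectorizer conventions (suggested by the `vectorizer` argument, not confirmed here), and it skips tokenization, stop-word filtering, and lemmatization. The function name `tfidf_sketch` is hypothetical.

```python
import math
from collections import Counter


def tfidf_sketch(docs: list[list[str]]) -> tuple[list[str], list[list[float]]]:
    """TF-IDF over pre-tokenized docs with use_idf, smooth_idf and norm='l2'."""
    n_docs = len(docs)
    vocab = sorted({tok for doc in docs for tok in doc})
    # Document frequency: how many documents contain each term.
    df = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed IDF (scikit-learn's formula when smooth_idf=True).
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}
    rows = []
    for doc in docs:
        tf = Counter(doc)  # raw term counts, i.e. sublinear_tf=False
        row = [tf[t] * idf[t] for t in vocab]
        length = math.sqrt(sum(v * v for v in row))
        # L2 normalization: scale each row to unit length.
        rows.append([v / length if length else 0.0 for v in row])
    return vocab, rows


vocab, rows = tfidf_sketch([
    ["bert", "language", "model"],
    ["neural", "sequence", "model"],
])
```

A term shared by both documents ("model") ends up with a lower weight than a term unique to one of them, which is the effect `use_idf=True` is meant to produce.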
API
specterize(input_data, return_type='list', max_length=512)
- input_data: str, dict, list, or iterable
- return_type: "list", "numpy", "tensor"
- max_length: tokenizer truncation length (default 512)
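Accepting a str, dict, list, or iterable implies some normalization before tokenization. The sketch below shows one way that could look; it is not Grimmerie's implementation. The helper names, the `title` key, and the `" [SEP] "` stand-in for the tokenizer's separator token are all assumptions made for illustration.

```python
from collections.abc import Iterable, Mapping

SEP = " [SEP] "  # assumption: placeholder for the tokenizer's separator token


def _paper_to_text(paper: Mapping) -> str:
    """Join the title and abstract of a paper-like dict (hypothetical helper)."""
    title = (paper.get("title") or "").strip()
    abstract = (paper.get("abstract") or "").strip()
    return SEP.join(part for part in (title, abstract) if part)


def to_texts(input_data) -> list[str]:
    """Coerce str / dict / iterable input into a flat list of strings."""
    # str and Mapping are themselves iterable, so check them first.
    if isinstance(input_data, str):
        return [input_data]
    if isinstance(input_data, Mapping):
        return [_paper_to_text(input_data)]
    if isinstance(input_data, Iterable):
        return [text for item in input_data for text in to_texts(item)]
    raise TypeError(f"unsupported input type: {type(input_data).__name__}")


texts = to_texts([{"title": "BERT", "abstract": "We introduce BERT"}, "plain text"])
```

Checking the narrow types before the generic iterable case avoids splitting a single string into characters.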
Design Principles
1. Abstraction over configuration
You should not need to think about setup for common workflows.
2. Strong defaults
Spells are opinionated. They are built to “just work” for most cases.
3. Hidden complexity
A spell may do significantly more than it appears.
4. Use with awareness
Because complexity is hidden, you should understand what a spell does before using it in critical systems.
When to Use Grimmerie
- Rapid experimentation
- Prototyping ML/NLP pipelines
- Sanity checking ideas
- Building quick demos
When Not to Use It
- When you need full control over every step
- When reproducibility requires explicit pipelines
- When debugging low-level behavior
Notes
- First call may be slower due to model downloads
- Models are cached locally after first use
- Subsequent calls reuse loaded resources within the same process
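Reuse within a process is typically achieved by memoizing the loader. A minimal sketch with `functools.lru_cache` follows; `load_model` and the returned dict are hypothetical stand-ins for an expensive model load, not Grimmerie's actual mechanism.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def load_model(name: str) -> dict:
    """Pretend to load a model; the body runs once per distinct name."""
    print(f"loading {name} ...")  # printed only on the first call for a name
    return {"name": name}


first = load_model("demo-model")
second = load_model("demo-model")  # served from the in-process cache
```

The cache lives in memory, so it is lost when the process exits; only downloaded files on disk persist across runs.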
Direction
Grimmerie will expand into a broader system of spells for:
- Vectorization
- Dimensionality reduction
- Visualization
- Data inspection
- ML prototyping utilities
Each is designed to compress a multi-step workflow into a single, intentional call.