Functions for prototyping, quality of life (QOL), and sanity checking
Grimmerie
A spellbook for Python.
Grimmerie is a collection of high-level utilities (“spells”) for rapid prototyping, sanity checking, and removing friction from common ML and NLP workflows.
Each spell compresses a multi-step pipeline into a single call.
Installation
pip install grimmerie
Core Idea
Instead of wiring pipelines manually:
# lots of setup...
You do:
embeddings = specterize(data)
Behind that one call:
- Input normalization (strings, dicts, lists, pandas Series)
- Model loading and caching
- Tokenization
- Adapter activation
- Output formatting
You get vectors. Immediately.
Input Philosophy
All spells accept:
- str
- dict
- list
- any iterable
- pandas Series (first-class supported)
Example:
df["title"] + " " + df["abstract"]
goes straight in.
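How the normalization works internally is not documented here. The following is a minimal sketch, assuming that a dict is flattened by joining its values with a space and that Series-like objects expose `tolist()`. The name `normalize_texts` is hypothetical and not part of Grimmerie's API.

```python
from typing import Any, List


def normalize_texts(data: Any) -> List[str]:
    """Coerce a str, a dict, or an iterable of either into a list of strings.

    Hypothetical helper: it illustrates the idea, not Grimmerie's internals.
    """
    # A single string is one document, not an iterable of characters.
    if isinstance(data, str):
        return [data]
    # A single dict is one record; join its values in insertion order.
    if isinstance(data, dict):
        return [" ".join(str(v) for v in data.values())]
    # pandas Series (and NumPy arrays) expose tolist(); use it when present.
    if hasattr(data, "tolist"):
        data = data.tolist()
    # Anything else is treated as an iterable of records, one string per record.
    out: List[str] = []
    for item in data:
        if isinstance(item, dict):
            out.append(" ".join(str(v) for v in item.values()))
        else:
            out.append(str(item))
    return out
```

Each input record yields exactly one output string, which is what keeps row `i` of the result tied to record `i` of the input.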
Spells
specterize
Generate SPECTER2 embeddings using Hugging Face Transformers + adapters.
from grimmerie import specterize
texts = [
{"title": "BERT", "abstract": "We introduce a new model"},
{"title": "Attention", "abstract": "Transformers dominate NLP"},
]
emb = specterize(texts, return_type="numpy")
Return Types
return_type = ["list", "numpy", "tensor"]
- "list" → list[list[float]]
- "numpy" → np.ndarray (n, 768)
- "tensor" → torch.Tensor
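Because the "list" form is plain Python, a quick shape check needs no extra dependencies. The sketch below assumes the 768-value rows stated for the "numpy" form also apply to the "list" form. `check_embeddings` is a hypothetical helper, not something Grimmerie ships.

```python
import math
from typing import Sequence


def check_embeddings(
    rows: Sequence[Sequence[float]], n_expected: int, dim: int = 768
) -> None:
    """Raise ValueError if the rows do not form an (n_expected, dim) matrix."""
    if len(rows) != n_expected:
        raise ValueError(f"expected {n_expected} rows, got {len(rows)}")
    for i, row in enumerate(rows):
        if len(row) != dim:
            raise ValueError(f"row {i} has {len(row)} values, expected {dim}")
        # NaN or infinity usually signals a problem upstream.
        if not all(math.isfinite(x) for x in row):
            raise ValueError(f"row {i} contains NaN or infinity")
```

A call would look like `check_embeddings(specterize(texts, return_type="list"), n_expected=len(texts))`.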
tfidfize
Generate TF-IDF vectors using scikit-learn.
from grimmerie import tfidfize
# df is a pandas DataFrame with "title" and "abstract" string columns
X = tfidfize(df["title"] + " " + df["abstract"], return_type="array")
Return Types
return_type = ["sparse", "array", "list", "frame"]
- "sparse" → scipy sparse matrix (default, best for large data)
- "array" → np.ndarray (n, d)
- "list" → list[list[float]]
- "frame" → pandas DataFrame (columns = vocab)
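To see what the numbers in a TF-IDF row mean, here is a dependency-free sketch of the weighting. It follows scikit-learn's documented `TfidfVectorizer` defaults: lowercasing, tokens of two or more word characters, raw term counts, smoothed IDF `ln((1 + n) / (1 + df)) + 1`, and L2-normalized rows. Whether `tfidfize` keeps those defaults is an assumption, and `tiny_tfidf` is a hypothetical name.

```python
import math
import re
from collections import Counter
from typing import List, Tuple

TOKEN = re.compile(r"\b\w\w+\b")  # tokens of at least two word characters


def tiny_tfidf(texts: List[str]) -> Tuple[List[str], List[List[float]]]:
    """Return (vocabulary, dense rows) using smoothed IDF and L2-normalized rows."""
    # Term counts per document.
    counts = [Counter(TOKEN.findall(t.lower())) for t in texts]
    vocab = sorted(set().union(*counts))
    n = len(texts)
    # Document frequency: number of texts that contain each term.
    df = Counter(term for c in counts for term in c)
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}
    rows = []
    for c in counts:
        row = [c[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(x * x for x in row))
        # Leave all-zero rows untouched to avoid dividing by zero.
        rows.append([x / norm for x in row] if norm else row)
    return vocab, rows


vocab, rows = tiny_tfidf(["cat sat", "cat ran"])
# "cat" appears in both texts, so it gets a lower weight than "sat" or "ran".
```

The columns follow the sorted vocabulary, which corresponds to the `columns = vocab` layout of the "frame" return type.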
Common Pattern
Both spells preserve row alignment:
X = tfidfize(df["title"], return_type="array")
E = specterize(df["title"], return_type="numpy")
# row i ↔ df.iloc[i]
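Row alignment is what makes the vectors usable downstream: an index into the matrix is also an index into the frame. A minimal sketch with plain lists (the shape returned by `return_type="list"`) follows. `most_similar` is a hypothetical helper, and the vectors are made up for illustration.

```python
import math
from typing import Sequence


def most_similar(vectors: Sequence[Sequence[float]], query: int) -> int:
    """Return the index of the row closest to row `query` by cosine similarity."""

    def cosine(a: Sequence[float], b: Sequence[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb) if na and nb else 0.0

    candidates = [i for i in range(len(vectors)) if i != query]
    if not candidates:
        raise ValueError("need at least two rows to compare")
    return max(candidates, key=lambda i: cosine(vectors[query], vectors[i]))


titles = ["BERT", "Attention", "RoBERTa"]
vectors = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]  # made-up vectors, one per title
nearest = most_similar(vectors, query=0)
print(titles[nearest])  # prints "RoBERTa"
```

With a DataFrame, the same index would be used as `df.iloc[nearest]`.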
Saving Outputs
Dense:
import numpy as np
X = tfidfize(df["title"], return_type="array")
np.save("tfidf.npy", X)
Sparse (recommended for TF-IDF):
from scipy import sparse
X = tfidfize(df["title"], return_type="sparse")
sparse.save_npz("tfidf.npz", X)
Design Principles
1. One-call workflows
You should not need to think about setup.
2. Strong defaults
Everything is preconfigured to “just work”.
3. Hidden complexity
Spells handle the annoying parts so you can focus on ideas.
4. Consistent interfaces
Same input patterns across spells.
When to Use Grimmerie
- Rapid experimentation
- Prototyping NLP pipelines
- Testing ideas quickly
- Building demos
When Not to Use It
- You need full control over every step
- You care about exact pipeline reproducibility
- You are debugging low-level model behavior
Notes
- First call may download models
- Models are cached automatically
- Inputs are normalized internally
- TF-IDF output is a scipy sparse matrix by default; request a dense return type only when the full matrix fits in memory
Direction
Grimmerie is evolving toward a unified set of spells for:
- Vectorization
- Dimensionality reduction
- Visualization
- Data inspection
All following the same idea:
result = spell(data)
Minimal Mental Model
data → normalize → compute → return vectors
That’s it.
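As a rough illustration of that model, a spell can be thought of as a compute function wrapped with a normalization step. This is a sketch under assumptions, not Grimmerie's code: `make_spell`, `char_stats`, and `charstatize` are hypothetical names, and the toy "vectors" are just character and word counts.

```python
from typing import Callable, Iterable, List, Union

Vector = List[float]


def make_spell(
    compute: Callable[[List[str]], List[Vector]]
) -> Callable[[Union[str, Iterable[str]]], List[Vector]]:
    """Wrap a compute function as: data -> normalize -> compute -> vectors."""

    def spell(data: Union[str, Iterable[str]]) -> List[Vector]:
        # normalize: a bare string becomes a one-element list
        texts = [data] if isinstance(data, str) else [str(x) for x in data]
        # compute: delegate to the wrapped function
        vectors = compute(texts)
        # return vectors: one per input, so rows stay aligned
        if len(vectors) != len(texts):
            raise ValueError("compute must return one vector per input text")
        return vectors

    return spell


def char_stats(texts: List[str]) -> List[Vector]:
    """Toy compute step: [character count, word count] for each text."""
    return [[float(len(t)), float(len(t.split()))] for t in texts]


charstatize = make_spell(char_stats)
result = charstatize(["a", "b c"])
```

The call site keeps the `result = spell(data)` shape regardless of what the compute step does.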
File details
Details for the file grimmerie-0.1.7.tar.gz.
File metadata
- Download URL: grimmerie-0.1.7.tar.gz
- Upload date:
- Size: 5.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.15
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 40a5b63cde132ddeef6600e230b9c4d97b9c7e75011b02dfb6044009db58a7ca |
| MD5 | 59ae1debbca366c8ed8d852fb409e175 |
| BLAKE2b-256 | 661ff1c30cca393e1bd62dce16a21342e2c5b4beed869c9fdbd4662992f3f52e |
File details
Details for the file grimmerie-0.1.7-py3-none-any.whl.
File metadata
- Download URL: grimmerie-0.1.7-py3-none-any.whl
- Upload date:
- Size: 6.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.15
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d676f355c73a7adf8755434485f4dd3efe73c160810f73b2de516d6203bf20ef |
| MD5 | 1937a3ab8de14cba1485759fd5311685 |
| BLAKE2b-256 | b9df67624752c42644d597c9a205d4134eb68cde5daa2bef02d9e92345c39a00 |