
Functions for Prototyping, QOL and Sanity checking


Grimmerie

A spellbook for Python.

Grimmerie is a collection of high-level utilities (“spells”) for rapid prototyping, sanity checking, and removing friction from common ML and NLP workflows.

Each spell compresses a multi-step pipeline into a single call.


Installation

pip install grimmerie

Core Idea

Instead of wiring pipelines manually:

# lots of setup...

You do:

embeddings = specterize(data)

Behind that one call:

  • Input normalization (strings, dicts, lists, pandas Series)
  • Model loading and caching
  • Tokenization
  • Adapter activation
  • Output formatting

You get vectors. Immediately.


Input Philosophy

All spells accept:

  • str
  • dict
  • list
  • any iterable
  • pandas Series (first-class supported)

Example:

df["title"] + " " + df["abstract"]

goes straight in.
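As a rough illustration of what "inputs are normalized" could mean, here is a minimal sketch that coerces each supported input shape into a flat list of strings. The names `normalize_inputs` and `_record_to_text` are hypothetical, not part of Grimmerie's API, and joining `title` and `abstract` with a single space is an assumption.

```python
from collections.abc import Iterable, Mapping
from typing import Any, List


def normalize_inputs(data: Any) -> List[str]:
    """Coerce a str, dict, list, or other iterable to a list of strings.

    Hypothetical helper for illustration; not Grimmerie's actual code.
    """
    if isinstance(data, str):
        # A lone string is treated as a batch of one.
        return [data]
    if isinstance(data, Mapping):
        # A single record such as {"title": ..., "abstract": ...}.
        return [_record_to_text(data)]
    if isinstance(data, Iterable):
        # Lists, generators, and pandas Series (which iterate over values).
        return [
            _record_to_text(item) if isinstance(item, Mapping) else str(item)
            for item in data
        ]
    raise TypeError(f"unsupported input type: {type(data).__name__}")


def _record_to_text(record: Mapping) -> str:
    # Assumption: title and abstract are joined with one space;
    # missing or empty fields are skipped.
    parts = [record.get("title", ""), record.get("abstract", "")]
    return " ".join(str(p) for p in parts if p)
```

Under these assumptions, `normalize_inputs({"title": "BERT", "abstract": "We introduce a new model"})` yields a one-element list, and the output always has one string per input row, in input order.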


Spells

specterize

Generate SPECTER2 embeddings using Hugging Face Transformers + adapters.

from grimmerie import specterize

texts = [
    {"title": "BERT", "abstract": "We introduce a new model"},
    {"title": "Attention", "abstract": "Transformers dominate NLP"},
]

emb = specterize(texts, return_type="numpy")

Return Types

return_type = ["list", "numpy", "tensor"]
  • "list" → list[list[float]]
  • "numpy" → np.ndarray (n, 768)
  • "tensor" → torch.Tensor

tfidfize

Generate TF-IDF vectors using scikit-learn.

from grimmerie import tfidfize

X = tfidfize(df["title"] + " " + df["abstract"], return_type="array")

Return Types

return_type = ["sparse", "array", "list", "frame"]
  • "sparse" → scipy sparse matrix (default, best for large data)
  • "array" → np.ndarray (n, d)
  • "list" → list[list[float]]
  • "frame" → pandas DataFrame (columns = vocab)
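To make the (n, d) layout concrete, the following pure-Python sketch builds a tiny TF-IDF matrix with one row per document and one column per vocabulary term. It uses the smoothed IDF, ln((1 + n) / (1 + df)) + 1, and L2 row normalization, which are scikit-learn's `TfidfVectorizer` defaults. The whitespace tokenizer is a simplification, and `tiny_tfidf` is a hypothetical name, not something Grimmerie exposes.

```python
import math
from collections import Counter
from typing import List, Tuple


def tiny_tfidf(texts: List[str]) -> Tuple[List[str], List[List[float]]]:
    """Return (vocab, rows) where rows[i][j] is the weight of vocab[j] in texts[i]."""
    # Simplified tokenization: lowercase and split on whitespace.
    docs = [Counter(t.lower().split()) for t in texts]
    vocab = sorted(set().union(*docs))
    n = len(docs)

    # Smoothed inverse document frequency per term.
    idf = {
        term: math.log((1 + n) / (1 + sum(term in d for d in docs))) + 1.0
        for term in vocab
    }

    rows = []
    for d in docs:
        row = [d[term] * idf[term] for term in vocab]
        # L2-normalize each row; guard against all-zero rows.
        norm = math.sqrt(sum(v * v for v in row)) or 1.0
        rows.append([v / norm for v in row])
    return vocab, rows


vocab, X = tiny_tfidf(["bert model", "attention model"])
# vocab is the column order; X[i] lines up with the i-th input text.
```

Most entries in such a matrix are zero once the vocabulary grows, which is why the sparse return type is the default.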

Common Pattern

Both spells preserve row alignment:

X = tfidfize(df["title"], return_type="array")
E = specterize(df["title"], return_type="numpy")

# row i ↔ df.iloc[i]

Saving Outputs

Dense:

import numpy as np

X = tfidfize(df["title"], return_type="array")
np.save("tfidf.npy", X)

Sparse (recommended for TF-IDF):

from scipy import sparse

X = tfidfize(df["title"], return_type="sparse")
sparse.save_npz("tfidf.npz", X)

Design Principles

1. One-call workflows

You should not need to think about setup.

2. Strong defaults

Everything is preconfigured to “just work”.

3. Hidden complexity

Spells handle the annoying parts so you can focus on ideas.

4. Consistent interfaces

Same input patterns across spells.


When to Use Grimmerie

  • Rapid experimentation
  • Prototyping NLP pipelines
  • Testing ideas quickly
  • Building demos

When Not to Use It

  • You need full control over every step
  • You care about exact pipeline reproducibility
  • You are debugging low-level model behavior

Notes

  • First call may download models
  • Models are cached automatically
  • Inputs are normalized internally
  • tfidfize returns a scipy sparse matrix by default; request a dense return type only when the full (n, d) matrix fits in memory
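The source does not say how the model cache is implemented. One common in-process approach is to memoize the loader so repeated calls reuse the same object. The sketch below uses `functools.lru_cache` with a stand-in `load_model` function (a hypothetical name) that returns a placeholder dict rather than real weights.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def load_model(name: str) -> dict:
    """Stand-in loader: a real one would download and build the model here."""
    print(f"loading {name} ...")  # runs only on a cache miss
    return {"name": name}


def embed_with(name: str, texts: list) -> list:
    # Every call goes through load_model, but only the first call
    # for a given name pays the loading cost.
    model = load_model(name)
    return [(model["name"], t) for t in texts]
```

With this pattern, calling `embed_with` twice with the same model name prints the loading message once. Downloaded weights would normally also be cached on disk by the model hub client, which is separate from this in-memory cache.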

Direction

Grimmerie is evolving toward a unified set of spells for:

  • Vectorization
  • Dimensionality reduction
  • Visualization
  • Data inspection

All following the same idea:

result = spell(data)

Minimal Mental Model

data → normalize → compute → return vectors

That’s it.
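The flow above can be sketched as a small wrapper that turns any batch function into a spell-shaped callable. This is a toy under stated assumptions: `make_spell`, `char_stats`, and `statize` are hypothetical names, and the "vectors" here are simple length statistics rather than embeddings.

```python
from typing import Callable, Iterable, List, Union

Vector = List[float]


def make_spell(
    compute: Callable[[List[str]], List[Vector]],
) -> Callable[[Union[str, Iterable[str]]], List[Vector]]:
    """Wrap a batch function in the normalize -> compute -> return flow."""

    def spell(data: Union[str, Iterable[str]]) -> List[Vector]:
        # normalize: a lone string becomes a batch of one
        texts = [data] if isinstance(data, str) else [str(t) for t in data]
        # compute: one vector per text
        vectors = compute(texts)
        # return: enforce row alignment with the input
        if len(vectors) != len(texts):
            raise ValueError("compute must return one vector per input")
        return vectors

    return spell


def char_stats(texts: List[str]) -> List[Vector]:
    # Toy feature: [character count, word count] for each text.
    return [[float(len(t)), float(len(t.split()))] for t in texts]


statize = make_spell(char_stats)
```

Here `statize("hello world")` and `statize(["a", "b c"])` both follow the `result = spell(data)` shape, and the length check makes the row-alignment guarantee explicit.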
