Add your description here
Project description
ylem
Minimal package with a small, friendly Cleaner API for turning text into Markdown using Hugging Face models.
Quick Start
import ylem
# Use an alias ("nano"/"large") or a full HF repo id
cleaner = ylem.Cleaner("nano", system_prompt="Strength: Medium")
text = "some content from the internet"
markdown = cleaner(text) # returns generated text
Options (Cleaner)
- model: alias ("nano"/"large") or HF repo id (e.g.
google/gemma-3-270m-it). - aliases: optional dict to override aliases in code, e.g.
{ "nano": "your/nano" }. - system_prompt: default system prompt; can be overridden per call.
- suppress_warnings: hides most Transformers/HF Hub warnings (default: True).
- repetition_penalty: slight penalty to reduce repeats (default: 1.05). Set higher (e.g. 1.1) for stronger effect.
Per-call Overrides
out = cleaner(
text,
max_new_tokens=128,
repetition_penalty=1.1, # override instance default
do_sample=True,
temperature=0.7, # any transformers generation kwargs
system_prompt="Strength: High; Output Markdown only.",
)
Model Selection
- Built-in aliases map to Hugging Face IDs:
- "nano" ->
google/gemma-3-270m-it - "large" ->
google/gemma-3-270m-it(same by default)
- "nano" ->
- Override via env: set
YLEM_MODEL_NANO/YLEM_MODEL_LARGEto a different repo id. - Override in code:
ylem.Cleaner("nano", aliases={"nano": "your-org/your-model"}).
Logging & Stats
- Uses
logurufor pretty debug logs. - Logs include: prompt tokens, output tokens, prep time (template+encode), generation time, total, and tokens/sec, plus the effective repetition penalty.
- To reduce verbosity in your app:
from loguru import logger import sys logger.remove() logger.add(sys.stderr, level="INFO") # or "WARNING"
Warnings
- By default, most Transformers/HF Hub warnings are suppressed inside
Cleaner. - Disable suppression if you want full logs:
ylem.Cleaner(..., suppress_warnings=False).
Notes
- Backend: Uses
transformerstext-generation pipeline. Install a backend like PyTorch. - Chat template: Renders a simple system+user chat prompt using the tokenizer’s template.
- Defaults:
max_new_tokens=2048, return only generated continuation (not prompt). - Python: Requires Python 3.12+.
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file ylem-0.2.2.tar.gz.
File metadata
- Download URL: ylem-0.2.2.tar.gz
- Upload date:
- Size: 42.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
86c7b3b2c00b14757f0be9b377b3f47fd34fb971304eb0862913273562824253
|
|
| MD5 |
c919af986058f2202713448a4a695049
|
|
| BLAKE2b-256 |
da7d83f75fac703191cd0fb8ae4cf8fec235eb36ff787059e5afb869c4db21bb
|
Provenance
The following attestation bundles were made for ylem-0.2.2.tar.gz:
Publisher:
python-publish.yaml on sumukshashidhar/ylem
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
ylem-0.2.2.tar.gz -
Subject digest:
86c7b3b2c00b14757f0be9b377b3f47fd34fb971304eb0862913273562824253 - Sigstore transparency entry: 475443164
- Sigstore integration time:
-
Permalink:
sumukshashidhar/ylem@6d510c21d51ac85dd3b8f16f86ed8fccb2ec1b2c -
Branch / Tag:
refs/tags/v0.2.2 - Owner: https://github.com/sumukshashidhar
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
python-publish.yaml@6d510c21d51ac85dd3b8f16f86ed8fccb2ec1b2c -
Trigger Event:
release
-
Statement type:
File details
Details for the file ylem-0.2.2-py3-none-any.whl.
File metadata
- Download URL: ylem-0.2.2-py3-none-any.whl
- Upload date:
- Size: 9.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
153ced225bea28662e749ef23a838cb82b084320e9a7f4a3bece4099d8e46c49
|
|
| MD5 |
5541b33efd6199c7dfad305f4a2f1345
|
|
| BLAKE2b-256 |
108d13af5eb77983605b47fcc850ed5ba55fc16bee83432fb81f4fdc2fb1f7a9
|
Provenance
The following attestation bundles were made for ylem-0.2.2-py3-none-any.whl:
Publisher:
python-publish.yaml on sumukshashidhar/ylem
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
ylem-0.2.2-py3-none-any.whl -
Subject digest:
153ced225bea28662e749ef23a838cb82b084320e9a7f4a3bece4099d8e46c49 - Sigstore transparency entry: 475443194
- Sigstore integration time:
-
Permalink:
sumukshashidhar/ylem@6d510c21d51ac85dd3b8f16f86ed8fccb2ec1b2c -
Branch / Tag:
refs/tags/v0.2.2 - Owner: https://github.com/sumukshashidhar
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
python-publish.yaml@6d510c21d51ac85dd3b8f16f86ed8fccb2ec1b2c -
Trigger Event:
release
-
Statement type: