A toolkit for LLM data augmentation
🚀 HypeLLM: Hypothetical LLM Data Augmentation
🌟 Features
- 🎮 Recipe-Based Augmentation: Pre-built recipes for common augmentation patterns
- 🔄 Multiple Strategies: Infer patterns, add reasoning, generate questions, and more
- 🎯 Async & Sync Support: Choose between async or sync APIs based on your needs
- ⚡ Flexible Implementation: Swap between different LLM backends (instructor, dspy, etc.)
🛠️ Installation
pip install hypellm
rye add hypellm
poetry add hypellm
Note: instructor is the default implementation, but hypellm treats it as a peer dependency, so you need to install it yourself alongside hypellm.
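Because the peer dependency is not installed for you, a quick import check can catch a missing package before the first recipe call. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical and not part of hypellm.

```python
from importlib.util import find_spec


def missing_packages(names: list[str]) -> list[str]:
    """Return the top-level package names that cannot be found on the import path."""
    # find_spec returns None for a top-level module that is not installed
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(["hypellm", "instructor"])
if missing:
    print("Install these first: " + ", ".join(missing))
```

The check only confirms that the packages are importable; it says nothing about versions.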
🚀 Quick Start
Take a look at recipes.py to see which recipes are available and how they work.
# env vars:
# HYPELLM_MODEL=gpt-4o # LiteLLM model name
# HYPELLM_API_KEY=your_api_key
import asyncio

import hypellm

# Your training examples, simple strings or structured data
data = [
    # Example(inputs: dict | str, reasoning: list[str], outputs: dict | str)
    hypellm.Example(
        inputs="The patient presents with elevated troponin levels (0.8 ng/mL) and ST-segment depression, but no chest pain or dyspnea.",
        outputs="unstable_angina",
    ),
    hypellm.Example(
        inputs="Labs show WBC 15k/μL with 80% neutrophils, fever 39.2°C, and consolidation in right lower lobe on chest X-ray.",
        outputs="bacterial_pneumonia",
    ),
]

# Choose your implementation
hypellm.settings.impl_name = "instructor"  # or your custom impl


# Use different recipes
async def augment_examples():
    # Infer a prompt from examples
    prompt = await hypellm.recipes.inferred(data)
    print(f"Intent: {prompt.intent}")
    print(f"Do's: {prompt.dos}")
    print(f"Don'ts: {prompt.donts}")
    print(f"Reasoning Steps: {prompt.reasoning_steps}")
    print(f"Examples: {prompt.examples}")

    # Add reasoning steps to examples
    prompt, results = await hypellm.recipes.reasoned(data, prompt=prompt)
    for result in results:
        print(f"Q: {result.inputs}")
        print(f"Reasoning: {result.reasoning}")
        print(f"A: {result.outputs}")

    # Generate questions from different angles
    questions = await hypellm.recipes.questions(data)
    for question, data_that_answers_question in questions.items():
        print(f"{question}: {data_that_answers_question}")

    # Invert input/output pairs
    prompt, inverted = await hypellm.recipes.inverted(data)
    print(f"Inverted prompt: {prompt}")
    for result in inverted:
        print(f"Original: {result.outputs} -> [{result.reasoning}] -> {result.inputs}")
        print(f"Inverted: {result.inputs} -> [{result.reasoning}] -> {result.outputs}")


# Run the coroutine from synchronous code
asyncio.run(augment_examples())
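The recipes in the Quick Start are awaited one after another. Recipes that do not depend on each other's results (for instance `questions` and `inverted`, which both take only `data`) could in principle be awaited concurrently with `asyncio.gather`. The sketch below uses stand-in coroutines instead of real hypellm calls so it runs without an API key; `fake_recipe` and `run_independent` are hypothetical names, and whether your backend tolerates concurrent requests is an assumption to verify.

```python
import asyncio


async def fake_recipe(name: str, delay: float) -> str:
    """Stand-in for an awaitable recipe such as hypellm.recipes.questions(data)."""
    await asyncio.sleep(delay)  # simulates waiting on the LLM backend
    return f"{name} done"


async def run_independent() -> list[str]:
    # gather schedules both coroutines at once and returns results in argument order
    return await asyncio.gather(
        fake_recipe("questions", 0.02),
        fake_recipe("inverted", 0.01),
    )


results = asyncio.run(run_independent())
print(results)
```

Results come back in the order the coroutines were passed to `gather`, regardless of which one finishes first.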
🤝 Contributing
We welcome contributions! Here's how you can help:
- Fork the repository
- Create a feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
📝 License
MIT License - see the LICENSE file for details.
Made with 🔥 by Zenbase AI - Empowering the next generation of LLM applications.
Remember: Better data means better models! 🎯