A smart data augmentation tool for AI developers.
📌 databoostr
databoostr is a Python package that analyzes datasets and augments them automatically when data is scarce. It supports both image and text augmentation, which makes it useful for machine learning projects that need to balance class distributions.
🚀 Features
- Automatic Data Analysis: Checks dataset distribution and identifies class imbalances.
- Image Augmentation:
  - Rotation (90°, 180°, 270°)
  - Horizontal & Vertical Flipping
  - Brightness Adjustment
- Text Augmentation:
  - Synonym Replacement
  - Random Word Deletion
- Easy Integration: Simple API for applying augmentations automatically.
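The image operations listed above can be illustrated without any imaging library. The following is a minimal sketch, not databoostr's own implementation: it assumes an image is a nested list of 0-255 grayscale values, and the function names (`rotate`, `flip_horizontal`, `flip_vertical`, `adjust_brightness`) are hypothetical.

```python
# Illustrative sketch only. Images are nested lists of 0-255 grayscale
# values; this representation and all function names are assumptions,
# not part of the databoostr API.

def rotate90(img):
    """Rotate the image clockwise by 90 degrees."""
    return [list(row) for row in zip(*img[::-1])]

def rotate(img, degrees):
    """Rotate clockwise by 90, 180, or 270 degrees."""
    if degrees not in (90, 180, 270):
        raise ValueError("degrees must be 90, 180, or 270")
    for _ in range(degrees // 90):
        img = rotate90(img)
    return img

def flip_horizontal(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]

def flip_vertical(img):
    """Mirror the image top to bottom."""
    return [list(row) for row in img[::-1]]

def adjust_brightness(img, factor):
    """Scale every pixel by factor and clamp the result to 0-255."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in img]

image = [[10, 20, 30],
         [40, 50, 60]]

print(rotate(image, 90))            # [[40, 10], [50, 20], [60, 30]]
print(flip_horizontal(image))       # [[30, 20, 10], [60, 50, 40]]
print(adjust_brightness(image, 5))  # [[50, 100, 150], [200, 250, 255]]
```

Each function returns a new image and leaves its input untouched, so one source image can yield several augmented variants.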
📦 Installation
pip install databoostr # (Future release)
For now, clone the repository:
git clone https://github.com/your-username/databoostr.git
cd databoostr
🛠 Usage
1. Import and Initialize
from databoostr import DataBoostr
# Create an instance for image augmentation
augmentor = DataBoostr(dataset_path="path/to/images", mode="image")
2. Check Dataset Balance
balance = augmentor.check_data_balance()
print(balance) # Output: {'class1': 120, 'class2': 80, 'class3': 50}
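The source does not describe how check_data_balance() derives its counts. One plausible approach, sketched here under the assumption that each class lives in its own subdirectory of dataset_path, is to count the files per subdirectory. The helper name `count_per_class` is hypothetical.

```python
import tempfile
from pathlib import Path

def count_per_class(dataset_path):
    """Count files in each immediate subdirectory of dataset_path.

    Assumes a one-folder-per-class layout; this is a guess, not
    databoostr's documented behaviour.
    """
    counts = {}
    for class_dir in sorted(Path(dataset_path).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1 for f in class_dir.iterdir() if f.is_file()
            )
    return counts

# Build a throwaway dataset so the example runs anywhere.
with tempfile.TemporaryDirectory() as root:
    for name, n in {"cats": 3, "dogs": 1}.items():
        class_dir = Path(root, name)
        class_dir.mkdir()
        for i in range(n):
            (class_dir / f"{i}.jpg").touch()
    print(count_per_class(root))  # {'cats': 3, 'dogs': 1}
```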
3. Apply Augmentation
For Images
augmentor.auto_augment() # Applies augmentation and saves images in the same directory
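How many augmented images each class receives is not stated in the source. A common balancing strategy, shown here as an assumption rather than as databoostr's behaviour, is to bring every class up to the size of the largest one. The helper name `augmentation_plan` is hypothetical; the counts are the example output from step 2.

```python
def augmentation_plan(counts):
    """Return how many new samples each class needs to match the largest class."""
    if not counts:
        return {}
    target = max(counts.values())
    return {label: target - n for label, n in counts.items()}

counts = {"class1": 120, "class2": 80, "class3": 50}
print(augmentation_plan(counts))  # {'class1': 0, 'class2': 40, 'class3': 70}
```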
For Text
augmentor_text = DataBoostr(dataset_path="path/to/text", mode="text")
augmentor_text.auto_augment()
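The two text techniques named under Features can be sketched with the standard library alone. This is an illustration, not the package's code: the `SYNONYMS` table, the function names, and the default deletion probability `p=0.2` are all assumptions.

```python
import random

# Hypothetical synonym table; a real tool would likely use a thesaurus resource.
SYNONYMS = {"quick": ["fast", "rapid"], "happy": ["glad", "cheerful"]}

def synonym_replacement(text, synonyms, rng, n=1):
    """Replace up to n words that have an entry in the synonym table."""
    words = text.split()
    candidates = [i for i, w in enumerate(words) if w.lower() in synonyms]
    for i in rng.sample(candidates, min(n, len(candidates))):
        words[i] = rng.choice(synonyms[words[i].lower()])
    return " ".join(words)

def random_deletion(text, rng, p=0.2):
    """Drop each word with probability p, always keeping at least one word."""
    words = text.split()
    if len(words) <= 1:
        return text
    kept = [w for w in words if rng.random() >= p]
    return " ".join(kept) if kept else rng.choice(words)

rng = random.Random(0)  # seeded so repeated runs give the same variants
sentence = "the quick dog looks happy today"
print(synonym_replacement(sentence, SYNONYMS, rng, n=2))
print(random_deletion(sentence, rng, p=0.3))
```

Passing an explicit `random.Random` instance keeps the augmentation reproducible, which helps when comparing training runs.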
📁 Project Structure
databoostr/
│── databoostr.py # Main package module
│── utils.py # Data analysis utilities
│── image_augment.py # Image augmentation methods
│── text_augment.py # Text augmentation methods
│── __init__.py
🎯 Roadmap
- Add advanced augmentation techniques
- Implement custom augmentation strategies
- Publish on PyPI
- Integrate with TensorFlow & PyTorch
🤝 Contributing
We welcome contributions! Feel free to fork the repository and submit a pull request.
📜 License
MIT License © 2025 Your Name
🌟 Show Your Support
If you like databoostr, consider starring ⭐ the repository!
File details
Details for the file databooster-0.1.0.tar.gz.
File metadata
- Download URL: databooster-0.1.0.tar.gz
- Upload date:
- Size: 3.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.8
File hashes
Algorithm | Hash digest
---|---
SHA256 | 5e46442a8982a11319e5793a7822bc1fb4d430bc17d14446a81948d03b54f6d4
MD5 | eab48848ecc13dd0b33427c12ff7b1a6
BLAKE2b-256 | b7502eac61872f26fb62f12fc89cb7d84e47591d8c636827c712059f5674b308
File details
Details for the file databooster-0.1.0-py3-none-any.whl.
File metadata
- Download URL: databooster-0.1.0-py3-none-any.whl
- Upload date:
- Size: 3.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.8
File hashes
Algorithm | Hash digest
---|---
SHA256 | 40e0d68bae408e43cde199b30239b14ca991090be10dd38f33c56ffa66203e9f
MD5 | 187edeaf55e519c78eb0da13b83ee969
BLAKE2b-256 | fde5cbc9bf8630eeb9f38935ada96e9eb7b0d7c7ec2b0e60b9ca25c3a84eee6b