Unibox provides a unified interface for common file operations.
unibox
Quick Start
# pip install unibox
import unibox as ub
Some common use cases of unibox include:
loading various file types in the same way:
- supports json, txt, images, parquet, csv, feather, ...
- uses appropriate best practices (such as the orjson package for json) for speedups
some_dict = ub.loads("some_file.json") # json → dict
some_list = ub.loads("some_file.txt") # txt → list[str]
some_img = ub.loads("some_image.jpg") # webp/jpg/png/..etc → PIL.Image
some_df = ub.loads("some_data.parquet") # parquet/csv/feather → pd.DataFrame
# .... for more: see uni_loader.py#L40
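The idea of loading different file types through one call can be sketched as a lookup from file suffix to a reader function. This is only an illustration under stated assumptions: `load_by_extension` and `LOADERS` are hypothetical names, only json and txt are covered, and the standard-library `json` module stands in for whatever unibox uses internally.

```python
import json
from pathlib import Path


def _load_json(path: Path):
    # Parse a JSON document into Python objects (dict, list, ...).
    with path.open("r", encoding="utf-8") as f:
        return json.load(f)


def _load_txt(path: Path):
    # One list entry per line, without trailing newlines.
    return path.read_text(encoding="utf-8").splitlines()


# Hypothetical dispatch table: file suffix -> reader function.
LOADERS = {".json": _load_json, ".txt": _load_txt}


def load_by_extension(path):
    """Pick a reader from the file suffix (illustrative, not unibox's code)."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix not in LOADERS:
        raise ValueError(f"unsupported file type: {suffix!r}")
    return LOADERS[suffix](path)
```

Supporting another format in this sketch means adding one entry to `LOADERS`, which is the main appeal of dispatching on the suffix.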
saving various python data structures in the same way:
- similar to ub.loads, but for saving files
ub.saves(some_dict, "some_file.json") # dict → json, mirroring ub.loads
ub.saves(some_df, "some_df.parquet")
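Saving can follow the same pattern in reverse: the target file's suffix decides how the data is written. The sketch below is an assumption-laden illustration, not unibox's implementation; `save_by_extension` is a hypothetical name and only json and txt are handled.

```python
import json
from pathlib import Path


def save_by_extension(data, path):
    """Choose the output format from the file suffix (illustrative only)."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".json":
        # Any JSON-serializable object, typically a dict or list.
        with path.open("w", encoding="utf-8") as f:
            json.dump(data, f, ensure_ascii=False, indent=2)
    elif suffix == ".txt":
        # Expects an iterable of strings; writes one per line.
        text = "".join(f"{line}\n" for line in data)
        path.write_text(text, encoding="utf-8")
    else:
        raise ValueError(f"unsupported file type: {suffix!r}")
    return path
```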
list s3 or local directories in the same way:
- default optional params: relative_unix=True, debug_print=True
- optimized s3 ls speed compared to boto3
files_under_dir = ub.traverses("/home/ubuntu/data") # list local files
# requires credentials set up beforehand with `aws configure`
files_under_s3 = ub.traverses("s3://dataset-pixiv/resized_1572864") # list s3 files
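For the local case, a directory listing with extension filters can be sketched with the standard library alone. `list_files` is a hypothetical helper, and the reading of `relative_unix` used here (paths relative to the root, with forward slashes) is an assumption based on the parameter name, not confirmed behaviour of `ub.traverses`. S3 listing is not covered.

```python
import os
from pathlib import Path


def list_files(root, include=None, exclude=None, relative_unix=True):
    """Recursively list files under root, filtered by suffix (sketch only)."""
    root = Path(root)
    include = {e.lower() for e in include} if include else None
    exclude = {e.lower() for e in exclude} if exclude else set()

    results = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = Path(dirpath) / name
            suffix = full.suffix.lower()
            if include is not None and suffix not in include:
                continue  # not in the allow-list
            if suffix in exclude:
                continue  # explicitly excluded
            if relative_unix:
                # Assumed meaning: path relative to root, "/" separators.
                results.append(full.relative_to(root).as_posix())
            else:
                results.append(str(full))
    return sorted(results)
```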
simplified logger class for easier debugging:
- a logger with common functionality pre-configured
- includes caller frame info, emoji warnings, datetime, and more
import unibox as ub
logger = ub.UniLogger()
def some_function():
    logger.info("some info")
    # logger.warning("....")
    # logger.error("....")
some_function()
# 2024-05-08 17:57:23,149 [INFO] UniLogger: some_function: some info
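A line shaped like the sample output above can be reproduced with the standard `logging` module, which may help when reading or filtering such logs. This is a rough sketch, not how UniLogger is built: `make_logger` and `LOG_FORMAT` are hypothetical names, and the emoji warnings are left out.

```python
import logging

# Timestamp, level, logger name, calling function, then the message.
LOG_FORMAT = "%(asctime)s [%(levelname)s] %(name)s: %(funcName)s: %(message)s"


def make_logger(name="UniLogger", stream=None):
    """Return a logger that writes LOG_FORMAT lines (stderr if stream is None)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False   # avoid duplicate lines via the root logger
    logger.handlers.clear()    # keep repeated calls from stacking handlers
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(LOG_FORMAT))
    logger.addHandler(handler)
    return logger
```

The `%(funcName)s` field is what supplies the caller's function name, so a call made inside `some_function` is labelled with that name.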
resize millions of images efficiently:
- (pre-configuration omitted here for simplicity; saves as WebP at 98% quality by default)
- can also resize by the minimum or maximum side length
# root_dir: where the images to be resized are
# dst_dir: presumably where the resized images are written
target_pixels = int(1024 * 1024 * 1.5)
resizer = ub.UniResizer(root_dir, dst_dir, target_pixels)
# get resize jobs
images_to_resize = resizer.get_resize_jobs()
# execute resize jobs
resizer.execute_resize_jobs(images_to_resize)
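Resizing to a pixel budget such as `target_pixels` comes down to scaling both sides by the square root of the area ratio, which keeps the aspect ratio. The helper below is a hypothetical sketch of that arithmetic only. Whether UniResizer rounds this way or skips images already under the budget is an assumption.

```python
import math


def fit_to_pixel_budget(width, height, target_pixels):
    """Return (new_width, new_height) with area close to target_pixels."""
    if width <= 0 or height <= 0 or target_pixels <= 0:
        raise ValueError("width, height and target_pixels must be positive")
    area = width * height
    if area <= target_pixels:
        # Assumption: images already within the budget are left untouched.
        return width, height
    # Scaling both sides by s scales the area by s**2.
    scale = math.sqrt(target_pixels / area)
    return max(1, round(width * scale)), max(1, round(height * scale))


target_pixels = int(1024 * 1024 * 1.5)
print(fit_to_pixel_budget(4000, 3000, target_pixels))  # (1448, 1086)
```

Because each side is rounded independently, the resulting area can differ from the budget by a few hundred pixels in either direction.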
Install
install from PyPI:
pip install unibox
build from source:
git clone https://github.com/trojblue/unibox
cd unibox
# pip install poetry
poetry install
poetry build
pip install dist/unibox-<version number>.whl
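After either install route, the installed version can be confirmed from Python using the standard library. `installed_version` is a hypothetical helper. It relies on `importlib.metadata`, which is available from Python 3.8, and on the distribution name `unibox` used in the pip command above.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


print(installed_version("unibox"))
```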
[OLD DOC] Features
The package is designed to run on Python 3.10, but targets 3.8+ for compatibility:
CLI:
- unibox resize <dir>: resizes a directory of images using either pillow or libvips
  - customizable size / quality / encoding (png / webp / jpeg)
- unibox copy <dir>: an awscli-like tool for copying files with a certain suffix to a new dir, keeping the same directory structure
  - bypasses Windows Explorer, so it's much faster
- unibox move <dir>: like copy, but moves instead
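The copy and move behaviour described above (select files by suffix, keep the directory layout) can be sketched with `pathlib` and `shutil`. This is not the CLI's actual code: `copy_with_suffix` and its `move` flag are hypothetical, and the sketch assumes the destination lies outside the source tree.

```python
import shutil
from pathlib import Path


def copy_with_suffix(src_dir, dst_dir, suffixes, move=False):
    """Copy (or move) files with the given suffixes, mirroring subfolders."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    wanted = {s.lower() for s in suffixes}
    transferred = []
    # sorted() snapshots the listing before any file is touched.
    for src in sorted(src_dir.rglob("*")):
        if not src.is_file() or src.suffix.lower() not in wanted:
            continue
        dst = dst_dir / src.relative_to(src_dir)
        dst.parent.mkdir(parents=True, exist_ok=True)
        if move:
            shutil.move(str(src), str(dst))
        else:
            shutil.copy2(src, dst)  # copy2 also preserves timestamps
        transferred.append(dst)
    return transferred
```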
utils:
- UniLogger: unified logger class (logger = unibox.UniLogger(), then use logger.info(...))
- UniLoader: unified data loader class (unibox.loads(<filename>))
- UniSaver: unified data saver class (unibox.saves(<data>, <filename>))
- UniTraverser: unified directory traverser class, with callbacks in multiple stages
- UniResizer: unified image resizer class, with callbacks in multiple stages
callables:
- unibox.traverses(dir, include, exclude, relative_unix): traverses a directory using the specified include / exclude extensions and returns a list of files
- unibox.loads(filepath): loads arbitrary data from a file into a suitable format, with automatic detection of file type
  - supported formats: see the UniLoader class implementation
- unibox.saves(data, filepath): saves arbitrary data to a file, with automatic detection of file type
Source Distribution
unibox-0.4.6.tar.gz (24.3 kB)
Built Distribution
unibox-0.4.6-py3-none-any.whl (28.9 kB)
File details
Details for the file unibox-0.4.6.tar.gz.
File metadata
- Download URL: unibox-0.4.6.tar.gz
- Upload date:
- Size: 24.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.9.19
File hashes
Algorithm | Hash digest
---|---
SHA256 | 6974debc73a0ca6715f64b192c62bc538a546e96127895629ef975b86b6f8b89
MD5 | 6bd75986ed170119ed97df01171c2f77
BLAKE2b-256 | 920ba795d42e49e623485fecc777117f4b6d6e054403247cc99522e89ba8b248
File details
Details for the file unibox-0.4.6-py3-none-any.whl.
File metadata
- Download URL: unibox-0.4.6-py3-none-any.whl
- Upload date:
- Size: 28.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.9.19
File hashes
Algorithm | Hash digest
---|---
SHA256 | 1a14842df470b88d5d6063e9c6dd942ede05b8b2181fcc45339ee8639b1ddb57
MD5 | f6ca37c23b6be88982cda09d675dac07
BLAKE2b-256 | 81e332398eb9b540db21f6f01374a326e0ebf47b7855971bbc950e5c743d5362
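A downloaded file can be checked against the SHA256 digests listed above with the standard `hashlib` module. The helper names below are hypothetical, and the file is read in chunks so large archives need not fit in memory.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare a file's digest with a published one, ignoring case and spaces."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_published_digest("unibox-0.4.6.tar.gz", "<SHA256 from the table>")` should return True for an intact download of that file.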