Unibox provides a unified interface for common file operations.
unibox
Quick Start
# pip install unibox
import unibox as ub
Some common use cases of unibox include:
loading various file types in the same way:
- supports json, txt, images, parquet, csv, feather, ...
- uses appropriate best practices (such as the orjson package for json) for speedups
some_dict = ub.loads("some_file.json") # json → dict
some_list = ub.loads("some_file.txt") # txt → list[str]
some_img = ub.loads("some_image.jpg") # webp/jpg/png/etc. → PIL.Image
some_df = ub.loads("some_data.parquet") # parquet/csv/feather → pd.DataFrame
# .... for more: see uni_loader.py#L40
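The idea behind a single loading entry point can be illustrated with a small extension-based dispatcher. This is only a sketch using the standard library, not unibox's actual implementation: the names `LOADERS` and `load_any` are hypothetical, and csv rows are returned as plain dicts instead of a DataFrame.

```python
import csv
import json
from pathlib import Path
from typing import Any, Callable, Dict, List


def _load_json(path: Path) -> Any:
    with path.open("r", encoding="utf-8") as f:
        return json.load(f)


def _load_txt(path: Path) -> List[str]:
    # One list entry per line, without trailing newlines.
    return path.read_text(encoding="utf-8").splitlines()


def _load_csv(path: Path) -> List[Dict[str, str]]:
    # Rows as dicts keyed by the header row (a stand-in for a DataFrame).
    with path.open("r", encoding="utf-8", newline="") as f:
        return list(csv.DictReader(f))


# Hypothetical registry: file suffix -> loader function.
LOADERS: Dict[str, Callable[[Path], Any]] = {
    ".json": _load_json,
    ".txt": _load_txt,
    ".csv": _load_csv,
}


def load_any(path: str) -> Any:
    """Pick a loader from the file suffix and return the parsed content."""
    p = Path(path)
    try:
        loader = LOADERS[p.suffix.lower()]
    except KeyError:
        raise ValueError(f"unsupported file type: {p.suffix!r}") from None
    return loader(p)
```

Supporting a new format in this sketch only requires registering one more suffix in the table; the caller keeps using the same function.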
saving various python data structures in the same way:
- similar to ub.loads, but for saving files
# mostly similar to the above
ub.saves(some_dict, "some_file.json")
ub.saves(some_df, "some_df.parquet")
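A matching save function can choose the output format from the target file name. The following is a minimal standard-library sketch of that idea, not unibox's code: `save_any` is a hypothetical name, and only json and txt are handled.

```python
import json
from pathlib import Path
from typing import Any


def save_any(data: Any, path: str) -> None:
    """Write `data` in the format implied by the file suffix (hypothetical helper)."""
    p = Path(path)
    suffix = p.suffix.lower()
    if suffix == ".json":
        with p.open("w", encoding="utf-8") as f:
            json.dump(data, f, ensure_ascii=False, indent=2)
    elif suffix == ".txt":
        if isinstance(data, str):
            # A bare string would otherwise be written one character per line.
            raise TypeError("expected an iterable of lines, not a single str")
        # One item per line, each terminated by a newline.
        p.write_text("".join(f"{line}\n" for line in data), encoding="utf-8")
    else:
        raise ValueError(f"unsupported file type: {suffix!r}")
```

Rejecting unknown suffixes explicitly keeps a typo in the file name from silently producing a file in the wrong format.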
listing s3 or local directories in the same way:
- default optional params: relative_unix=True, debug_print=True
- optimized s3 ls speed compared to boto3
files_under_dir = ub.traverses("/home/ubuntu/data") # list local files
# requires credentials set up beforehand via `aws configure`
files_under_s3 = ub.traverses("s3://dataset-pixiv/resized_1572864") # list s3 files
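For the local case, the listing can be pictured as a recursive walk with optional extension filters. The sketch below uses only the standard library and is not how unibox implements it. `list_files` is a hypothetical name, and it assumes that `relative_unix=True` means "paths relative to the root, with forward slashes", which the source does not spell out.

```python
import os
from pathlib import Path
from typing import Iterable, List, Optional


def list_files(
    root: str,
    include: Optional[Iterable[str]] = None,
    exclude: Optional[Iterable[str]] = None,
    relative_unix: bool = True,
) -> List[str]:
    """Recursively list files under `root`, filtered by suffix (hypothetical helper)."""
    include_set = {e.lower() for e in include} if include else None
    exclude_set = {e.lower() for e in exclude} if exclude else set()
    root_path = Path(root)
    found = []
    for dirpath, _dirnames, filenames in os.walk(root_path):
        for name in filenames:
            suffix = Path(name).suffix.lower()
            if include_set is not None and suffix not in include_set:
                continue
            if suffix in exclude_set:
                continue
            full = Path(dirpath) / name
            if relative_unix:
                # Assumed meaning: relative to root, forward slashes on every OS.
                found.append(full.relative_to(root_path).as_posix())
            else:
                found.append(str(full))
    return sorted(found)
```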
simplified logger class for easier debugging:
- a logger with functionalities pre-configured
- includes caller frame info, emoji warnings, datetime, and more
logger = ub.UniLogger()
logger.info("....")
logger.warn("....")
logger.error("....")
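To show what "pre-configured" can mean in practice, here is a rough equivalent built on the standard `logging` module. It is a sketch under assumptions, not UniLogger's implementation: `make_logger` is a hypothetical name, and the format string is just one way to include the timestamp and caller location.

```python
import logging


def make_logger(name: str = "demo", stream=None) -> logging.Logger:
    """Return a logger whose records carry a timestamp and the caller's location."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.propagate = False  # keep records from also reaching the root logger
    if not logger.handlers:  # avoid stacking handlers on repeated calls
        handler = logging.StreamHandler(stream)  # defaults to stderr when stream is None
        handler.setFormatter(
            logging.Formatter(
                "%(asctime)s [%(levelname)s] %(filename)s:%(lineno)d "
                "%(funcName)s: %(message)s",
                datefmt="%Y-%m-%d %H:%M:%S",
            )
        )
        logger.addHandler(handler)
    return logger
```

The `%(filename)s`, `%(lineno)d` and `%(funcName)s` fields are filled in by `logging` from the calling frame, so every message shows where it was emitted without extra work at the call site.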
resizing millions of images efficiently:
- (pre-configuration omitted here for simplicity)
- can also resize by the minimum or maximum side length
# Initialize resizer
resizer = ub.UniResizer(
    root_dir,
    dst_dir,
    target_pixels=int(1024 * 1024 * 1.5),
)
# Resize the images
images_to_resize = resizer.get_resize_jobs()
resizer.execute_resize_jobs(images_to_resize)
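The arithmetic behind a pixel-count target such as `target_pixels` can be sketched as follows: scale both sides by the square root of the area ratio so the aspect ratio is preserved. This illustrates the general technique only. `fit_to_pixel_budget` is a hypothetical helper, and the choice never to upscale smaller images is an assumption, not documented UniResizer behaviour.

```python
import math
from typing import Tuple


def fit_to_pixel_budget(width: int, height: int, target_pixels: int) -> Tuple[int, int]:
    """Return a size with the same aspect ratio and roughly `target_pixels` area."""
    if width <= 0 or height <= 0 or target_pixels <= 0:
        raise ValueError("width, height and target_pixels must be positive")
    current = width * height
    if current <= target_pixels:
        return width, height  # assumption: images already small enough are left alone
    # Area scales with the square of the side scale, hence the square root.
    scale = math.sqrt(target_pixels / current)
    # Round down so the result stays within the budget.
    return max(1, int(width * scale)), max(1, int(height * scale))


new_size = fit_to_pixel_budget(4000, 3000, int(1024 * 1024 * 1.5))
```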
Install
install from pypi:
pip install unibox
build from source:
git clone https://github.com/trojblue/unibox
# pip install poetry
poetry install
poetry build
pip install dist/unibox-<version number>.whl
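After either install route, one way to confirm which version ended up in the environment is to query the package metadata. The helper below is a small sketch using the standard `importlib.metadata` module (Python 3.8+); `installed_version` is a hypothetical name, not part of unibox.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the version string if unibox is installed, otherwise None.
print(installed_version("unibox"))
```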
[OLD DOC] Features
The package is designed to run on Python 3.10, but targets 3.8+ for compatibility:
CLI:
- unibox resize <dir>: resizes a directory of images using either pillow or libvips
  - customizable size / quality / encoding (png / webp / jpeg)
- unibox copy <dir>: an awscli-like tool for copying files with a certain suffix to a new dir, keeping the same directory structure
  - bypasses Windows Explorer, so it's much faster
- unibox move <dir>: like copy, but moves instead
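The behaviour described for the copy command (select files by suffix, keep the directory layout) can be sketched with the standard library. This is an illustration of the idea rather than the CLI's implementation; `copy_with_suffix` and its parameters are hypothetical.

```python
import shutil
from pathlib import Path
from typing import Iterable, List


def copy_with_suffix(src_dir: str, dst_dir: str, suffixes: Iterable[str]) -> List[str]:
    """Copy files whose suffix matches into dst_dir, mirroring the source layout.

    Assumes dst_dir is not located inside src_dir.
    """
    src = Path(src_dir)
    dst = Path(dst_dir)
    wanted = {s.lower() for s in suffixes}
    copied = []
    for path in sorted(src.rglob("*")):
        if path.is_file() and path.suffix.lower() in wanted:
            rel = path.relative_to(src)
            target = dst / rel
            # Recreate the intermediate directories before copying.
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)  # copy2 also preserves timestamps
            copied.append(rel.as_posix())
    return copied
```

A move variant would follow the same structure with `shutil.move` in place of `shutil.copy2`.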
utils:
- UniLogger: unified logger class (logger = unibox.UniLogger(), then use logger.info(...))
- UniLoader: unified data loader class (unibox.loads(<filename>))
- UniSaver: unified data saver class (unibox.saves(<data>, <filename>))
- UniTraverser: unified directory traverser class, with callbacks in multiple stages
- UniResizer: unified image resizer class, with callbacks in multiple stages
callables:
- unibox.traverses(dir, include, exclude, relative_unix): traverses a directory using the specified include / exclude extensions and returns a list of files
- unibox.loads(filepath): loads arbitrary data from a file into a suitable format, with automatic detection of file type
  - supported formats: see the UniLoader class implementation
- unibox.saves(data, filepath): saves arbitrary data to a file, with automatic detection of file type