Unibox provides unified interface for common file operations.
Project description
unibox
unibox provides unified interface for common file operations.
Quick Start
# pip install unibox
import unibox as ub
some common use cases of unibox includes:
loading various file types in the same way:
- supports json, txt, images, parquet, csv, feather, ....
- uses appropriate best practices (such as
orjson
package for json) for speed ups
some_dict = ub.loads("some_file.json") # json → dict
some_list = ub.loads("some_file.txt") # txt → list[str]
some_img = ub.loads("some_image.jpg") # webp/jpg/png/..etc → PIL.Image
some_df = ub.loads("some_data.parquet") # parquet/csv/feather → pd.Dataframe
# .... for more: see uni_loader.py#L40
saving various python data structure in the same way:
- similar as
ub.loads
but also for saving files
ub.saves(some_dict, "some_file.json") # similar as above
ub.saves(some_df, "some_df.parquet")
list s3 or local directories in the same way:
- default optional params:
relative_unix=True, debug_print=True
- optimized
s3 ls
speed compared to boto3
files_under_dir = ub.traverses("/home/ubuntu/data") # list local file
# needs to have `aws configure` pre-configured
files_under_s3 = ub.traverses("s3://dataset-pixiv/resized_1572864") # list s3 files
simplified logger class for easier debug:
- a logger with functionalities pre-configured
- includes caller frame info, emoji warnings, datetime, and more
import unibox as ub
logger = ub.UniLogger()
def some_function():
logger.info("some info")
# logger.warning("....")
# logger.error("....")
some_function()
# 2024-05-08 17:57:23,149 [INFO] UniLogger: some_function: some info
resize millions of images efficiently:
- (pre-configured omitted here for simplicity; saves to 98% quality WEBP by default)
- also able to resize by minimum or maximum of side lengths,
# root_dir: where the images to be resized are
target_pixels = int(1024 * 1024 * 1.5)
resizer = ub.UniResizer(root_dir, dst_dir, target_pixels)
# get resize jobs
images_to_resize = resizer.get_resize_jobs()
# execute resize jobs
resizer.execute_resize_jobs(images_to_resize)
Install
install from pypi:
pip install unibox
build from source:
git clone https://github.com/trojblue/unibox
# pip install poetry
poetry install
poetry build
pip install dist/unibox-<version number>.whl
[OLD DOC] Features
The package is designed to be running with python 3.10, but targets 3.8+ for compatibility:
CLI:
unibox resize <dir>
: resizes a directory of images using eitherpillow
orlibvips
- customizable size / quality / encoding (png / webp / jpeg)
unibox copy <dir>
: an awscli-like tool for copying files with certain suffix to a new dir, keeping the same directory structure.- bypasses windows explorer so it's much faster.
unibox move <dir>
: likecopy
, but moves instead
utils:
UniLogger
: uniformed logger class (logger = unibox.UniLogger()
, and uselogger.info(...)
)UniLoader
: uniformed data loader class (unibox.loads(<filename>)
)UniSaver
: uniformed data saver class (unibox.saves(<data>, <filename>)
)UniTraverser
: uniformed directory traverser class, with callbacks in multiple stagesUniResizer
: uniformed image resizer class, with callbacks in multiple stages
callables:
unibox.traverses(dir, include, exclude, relative_unix)
: traverse a directory using specified exclude / include extensions, and return a list of filesunibox.loads(filepath)
: load arbitrary data from a file into suitable formats, with automatic detection of file type- supported formats: see UniLoader class implementation
unibox.saves(data, filepath)
: saves arbitrary data to a file, with automatic detection of file type
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
unibox-0.4.8.tar.gz
(26.0 kB
view details)
Built Distribution
unibox-0.4.8-py3-none-any.whl
(31.1 kB
view details)
File details
Details for the file unibox-0.4.8.tar.gz
.
File metadata
- Download URL: unibox-0.4.8.tar.gz
- Upload date:
- Size: 26.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.9.19
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 919cbaaf67a4630492eae5dfe7fa710093fc025a8308dfe0c77ac968aab9e9c0 |
|
MD5 | 00235b7ea374b805529b16f1f4456116 |
|
BLAKE2b-256 | e1b13d261761827f8ceeb161eca8e7602b1acf6423c3ca355825978cb96f3474 |
File details
Details for the file unibox-0.4.8-py3-none-any.whl
.
File metadata
- Download URL: unibox-0.4.8-py3-none-any.whl
- Upload date:
- Size: 31.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.9.19
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 5adabeca05ac2dd1f46da9e2af77dbc352aeb768cec3d0c4ccbf310dae1975fb |
|
MD5 | 271299a22ec2cd415559fc5ce7f5f2d7 |
|
BLAKE2b-256 | 04ab24867d3716d170d6f93e78b4915044a274d3d3bb2980e0a41701c6c6a762 |