Towhee is a framework that helps you encode your unstructured data into embeddings.
x2vec, Towhee is all you need!
Towhee makes it easy to build neural data processing pipelines for AI applications. We provide hundreds of models, algorithms, and transformations that can be used as standard pipeline building blocks. You can use Towhee's Pythonic API to build a prototype of your pipeline and automatically optimize it for production-ready environments.
:art: Various Modalities: Towhee supports data processing on a variety of modalities, including images, videos, text, audio, molecular structures, etc.
:mortar_board: SOTA Models: Towhee provides SOTA models across 5 fields (CV, NLP, Multimodal, Audio, Medical), 15 tasks, and 140+ model architectures. These include BERT, CLIP, ViT, SwinTransformer, MAE, and data2vec, all pretrained and ready to use.
:package: Data Processing: Towhee also provides traditional methods alongside neural network models to help you build practical data processing pipelines. We have a rich pool of operators available, such as video decoding, audio slicing, frame sampling, feature vector dimension reduction, ensembling, and database operations.
:snake: Pythonic API: Towhee includes a Pythonic method-chaining API for describing custom data processing pipelines. We also support schemas, which makes processing unstructured data as easy as handling tabular data.
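As a quick illustration of the method-chaining style, here is a minimal sketch (not taken from the official docs) that uses only a plain Python lambda as its processing step, so it runs without downloading any models:

from towhee import pipe, DataCollection

# A toy pipeline: read a sentence and count its words with a plain Python lambda.
word_stats = (
    pipe.input('text')
        .map('text', 'n_words', lambda s: len(s.split()))
        .output('text', 'n_words')
)

DataCollection(word_stats('towhee makes pipelines easy')).show()

The column names passed to input, map, and output act as the schema that each step reads from and writes to.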
What's New
v1.0.0rc1 May 4, 2023
- Added trainers to operators: timm, isc, transformers, clip
- Added a GPU video decoder: VPF
- All Towhee pipelines can now be converted into Nvidia Triton services.
v0.9.0 Dec. 2, 2022
- Added one video classification model: Vis4mer
- Added three visual backbones: MCProp, RepLKNet, Shunted Transformer
- Added two code search operators: code_search.codebert, code_search.unixcoder
- Added five image captioning operators: image_captioning.expansionnet-v2, image_captioning.magic, image_captioning.clip_caption_reward, image_captioning.blip, image_captioning.clipcap
- Added five image-text embedding operators: image_text_embedding.albef, image_text_embedding.ru_clip, image_text_embedding.japanese_clip, image_text_embedding.taiyi, image_text_embedding.slip
- Added one machine-translation operator: machine_translation.opus_mt
- Added one filter-tiny-segments operator: video-copy-detection.filter-tiny-segments
- Added an advanced tutorial for audio fingerprinting: Audio Fingerprint II: Music Detection with Temporal Localization (accuracy increased from 84% to 90%)
v0.8.1 Sep. 30, 2022
- Added four visual backbones: ISC, MetaFormer, ConvNext, HorNet
- Added two video de-copy operators: select-video, temporal-network
- Added one image embedding operator designed specifically for image retrieval and video de-copy, with SOTA performance on the VCSL dataset: isc
- Added one audio embedding operator specialized for audio fingerprinting: audio_embedding.nnfp (with pretrained weights)
- Added a tutorial for video de-copy: How to Build a Video Segment Copy Detection System
- Added a beginner tutorial for audio fingerprinting: Audio Fingerprint I: Build a Demo with Towhee & Milvus
v0.8.0 Aug. 16, 2022
- Towhee now supports generating an Nvidia Triton Server from a Towhee pipeline, with additional support for GPU image decoding.
- Added one audio fingerprinting model: nnfp
- Added two image embedding models: RepMLP, WaveViT
v0.7.3 Jul. 27, 2022
- Added one multimodal (text/image) model: CoCa.
- Added two video models for grounded situation recognition & repetitive action counting: CoFormer, TransRAC.
- Added two SOTA models for image tasks (image retrieval, image classification, etc.): CVNet, MaxViT
v0.7.1 Jul. 1, 2022
- Added one image embedding model: MPViT.
- Added two video retrieval models: BridgeFormer, collaborative-experts.
- Added FAISS-based ANNSearch operators: to_faiss, faiss_search.
v0.7.0 Jun. 24, 2022
- Added six video understanding/classification models: Video Swin Transformer, TSM, Uniformer, OMNIVORE, TimeSformer, MoViNets.
- Added four video retrieval models: CLIP4Clip, DRL, Frozen in Time, MDMMT.
v0.6.1 May 13, 2022
- Added three text-image retrieval models: CLIP, BLIP, LightningDOT.
- Added six video understanding/classification models from PyTorchVideo: I3D, C2D, Slow, SlowFast, X3D, MViT.
Getting started
Towhee requires Python 3.6+. You can install Towhee via pip:

pip install towhee towhee.models

If you run into any pip-related install problems, please try to upgrade pip with pip install -U pip.
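To verify the installation and check which version was installed, you can inspect the package with pip:

pip show towhee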
Let's try your first Towhee pipeline. Below is an example of how to create a CLIP-based cross-modal retrieval pipeline. The example requires towhee 1.0.0, which can be installed with pip install towhee==1.0.0. See the latest usage documentation for more details.
from glob import glob

from towhee import ops, pipe, DataCollection

# create image embeddings and build index
p = (
    pipe.input('file_name')
        .map('file_name', 'img', ops.image_decode.cv2())
        .map('img', 'vec', ops.image_text_embedding.clip(model_name='clip_vit_base_patch32', modality='image'))
        .map('vec', 'vec', ops.towhee.np_normalize())
        .map(('vec', 'file_name'), (), ops.ann_insert.faiss_index('./faiss', 512))
        .output()
)

for f_name in ['https://raw.githubusercontent.com/towhee-io/towhee/main/assets/dog1.png',
               'https://raw.githubusercontent.com/towhee-io/towhee/main/assets/dog2.png',
               'https://raw.githubusercontent.com/towhee-io/towhee/main/assets/dog3.png']:
    p(f_name)

# Delete the pipeline object to make sure the faiss data is written to disk.
del p

# search images by text
decode = ops.image_decode.cv2('rgb')
p = (
    pipe.input('text')
        .map('text', 'vec', ops.image_text_embedding.clip(model_name='clip_vit_base_patch32', modality='text'))
        .map('vec', 'vec', ops.towhee.np_normalize())
        # faiss op result format: [[id, score, [file_name]], ...]
        .map('vec', 'row', ops.ann_search.faiss_index('./faiss', 3))
        .map('row', 'images', lambda x: [decode(item[2][0]) for item in x])
        .output('text', 'images')
)

DataCollection(p('a cat')).show()
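The same insert pipeline works on local files as well. Below is a minimal sketch that assumes a hypothetical './images' folder of JPEG files; it reuses the glob import from the snippet above and is otherwise identical to the indexing pipeline already shown:

from glob import glob

from towhee import ops, pipe

# Rebuild the insert pipeline (identical to the one above) and index a local folder.
p_insert = (
    pipe.input('file_name')
        .map('file_name', 'img', ops.image_decode.cv2())
        .map('img', 'vec', ops.image_text_embedding.clip(model_name='clip_vit_base_patch32', modality='image'))
        .map('vec', 'vec', ops.towhee.np_normalize())
        .map(('vec', 'file_name'), (), ops.ann_insert.faiss_index('./faiss', 512))
        .output()
)

for f_name in glob('./images/*.jpg'):
    p_insert(f_name)

# As above, delete the pipeline so the faiss index is flushed to disk.
del p_insert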
More examples are available in the Towhee Examples repository.
Core Concepts
Towhee is composed of four main building blocks: Operators, Pipelines, the DataCollection API, and the Engine.
- Operators: An operator is a single building block of a neural data processing pipeline. Different implementations of operators are categorized by task, with each task having a standard interface. An operator can be a deep learning model, a data processing method, or a plain Python function (see the sketch after this list).
- Pipelines: A pipeline is composed of several operators interconnected in the form of a DAG (directed acyclic graph). The DAG can express complex functionality such as embedding feature extraction, data tagging, and cross-modal data analysis.
- DataCollection API: A Pythonic, method-chaining API for building custom pipelines. A pipeline defined with the DataCollection API can be run locally on a laptop for fast prototyping and then converted to a Docker image, with end-to-end optimizations, for production-ready environments.
- Engine: The engine sits at Towhee's core. Given a pipeline, the engine drives dataflow among individual operators, schedules tasks, and monitors compute resource usage (CPU/GPU/etc.). Towhee provides a basic engine to run pipelines on a single-instance machine and a Triton-based engine for Docker containers.
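To make the operator concept concrete, here is a minimal sketch (not taken from the official docs) that chains a hub operator with a plain Python function in a single pipeline. It reuses the image decoder and one of the sample images from the getting-started example; the decoded image is assumed to behave like a NumPy array, so it exposes a shape attribute:

from towhee import ops, pipe, DataCollection

p = (
    pipe.input('url')
        # hub operator: decode an image from a URL
        .map('url', 'img', ops.image_decode.cv2('rgb'))
        # plain Python function used as an operator: report the decoded image's shape
        .map('img', 'shape', lambda img: img.shape)
        .output('url', 'shape')
)

DataCollection(p('https://raw.githubusercontent.com/towhee-io/towhee/main/assets/dog1.png')).show()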
Contributing
Writing code is not the only way to contribute! Submitting issues, answering questions, and improving documentation are just some of the many ways you can help our growing community. Check out our contributing page for more information.
Special thanks go to these folks for contributing to Towhee, whether on GitHub, the Towhee Hub, or elsewhere.
Looking for a database to store and index your embedding vectors? Check out Milvus.