Finetuner allows one to tune the weights of any deep neural network for better embedding on search tasks.

Project description

Finetuning any deep neural network for better embedding on neural search tasks

Finetuner allows one to tune the weights of any deep neural network for better embedding on search tasks. It accompanies Jina to deliver the last mile of performance-tuning for neural search applications.

🎛 Designed for finetuning: a machine learning-powered human-in-the-loop tool for leveling up your pretrained models in neural search applications.

🔱 Powerful yet intuitive: all you need is finetuner.fit(), a one-liner that unlocks rich features such as siamese/triplet networks, interactive labeling, layer trimming, weight freezing, and dimensionality reduction.

⚛️ Framework-agnostic: promises an identical API experience on PyTorch, Keras, or PaddlePaddle deep learning backends.

🧈 Jina integration: buttery smooth integration with Jina, reducing the cost of context switching between experimentation and production.
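
The siamese/triplet networks mentioned in the feature list train an embedding so that matching items end up closer together than non-matching ones. The sketch below shows the standard triplet margin loss in plain Python to illustrate the idea. It is not Finetuner's own implementation; the function names and the default margin are assumptions made for this example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss on three embedding vectors.

    The loss is zero once the positive is closer to the anchor than the
    negative by at least `margin`; otherwise it grows with the violation.
    `margin=1.0` is an illustrative default, not a Finetuner setting.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)
```

A well-separated triplet gives zero loss, while a triplet whose negative sits closer to the anchor than the positive gives a positive loss that training would try to reduce.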

How does it work

Install

Make sure you have Python 3.7+ and one of PyTorch (>=1.9), TensorFlow (>=2.5), or PaddlePaddle installed on Linux/macOS.

pip install finetuner
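
Before installing, it can help to confirm that the environment meets the requirements above. The following is a minimal sketch using only the standard library. The helper names are hypothetical, the import names torch, tensorflow, and paddle are assumed to correspond to the three backends, and the minimum backend versions are not checked.

```python
import importlib.util
import sys

# Display name -> import name. The import names are an assumption of this
# sketch; they are the usual top-level modules of the three backends.
BACKENDS = {
    "PyTorch": "torch",
    "TensorFlow": "tensorflow",
    "PaddlePaddle": "paddle",
}


def available_backends():
    """Return the display names of backends importable in this environment."""
    return [
        name
        for name, module in BACKENDS.items()
        if importlib.util.find_spec(module) is not None
    ]


def check_environment():
    """Return a list of problems; an empty list means the basics are in place.

    Only the Python version and the presence of a backend are checked,
    not the backend version numbers.
    """
    problems = []
    if sys.version_info < (3, 7):
        problems.append("Python 3.7+ is required")
    if not available_backends():
        problems.append("no supported deep learning backend found")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("Problem:", problem)
```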

Documentation

Usage

🪄 Usage                         Do you have an embedding model?
                                Yes     No
Do you have labeled data?  Yes  1️⃣      3️⃣
                           No   2️⃣      4️⃣

1️⃣ Have embedding model and labeled data

Perfect! Since you already have both embed_model and labeled_data, simply run:

import finetuner

finetuner.fit(
    embed_model,
    train_data=labeled_data
)

2️⃣ Have embedding model and unlabeled data

You have an embed_model to use, but no labeled data for finetuning it. No worries, that is enough to get started! You can use Finetuner to interactively label data and train embed_model as follows:

import finetuner

finetuner.fit(
    embed_model,
    train_data=unlabeled_data,
    interactive=True
)

3️⃣ Have general model and labeled data

You have a general_model that does not output embeddings, but you do have some labeled_data for training. No worries: Finetuner can convert your model into an embedding model and train it via:

import finetuner

finetuner.fit(
    general_model,
    train_data=labeled_data,
    to_embedding_model=True,
    output_dim=100
)

4️⃣ Have general model and unlabeled data

You have a general_model that does not output embeddings, and you have no labeled data for training. No worries: Finetuner can convert the model into an embedding model and train it with interactive labeling on the fly:

import finetuner

finetuner.fit(
    general_model,
    train_data=unlabeled_data,
    interactive=True,
    to_embedding_model=True,
    output_dim=100
)
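
The four scenarios above differ only in the keyword arguments passed to finetuner.fit(). The sketch below collects that mapping in one place. The helper fit_kwargs is hypothetical and not part of Finetuner; it only reproduces the arguments shown in the four examples, with output_dim=100 taken from those examples as a default.

```python
def fit_kwargs(has_embedding_model, has_labeled_data, output_dim=100):
    """Build the extra keyword arguments used in scenarios 1-4.

    The model and train_data are passed to finetuner.fit() separately.
    """
    kwargs = {}
    if not has_labeled_data:
        # Scenarios 2 and 4: label the data interactively.
        kwargs["interactive"] = True
    if not has_embedding_model:
        # Scenarios 3 and 4: convert a general model into an embedding model.
        kwargs["to_embedding_model"] = True
        kwargs["output_dim"] = output_dim
    return kwargs


# Intended use, assuming `model` and `data` are defined by you:
#   import finetuner
#   finetuner.fit(model, train_data=data, **fit_kwargs(False, True))
```

Keeping the decision in one function makes it clear that having labeled data controls interactive, while having an embedding model controls to_embedding_model and output_dim.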

Support

Join Us

Finetuner is backed by Jina AI and licensed under Apache-2.0. We are actively hiring AI engineers and solution engineers to build the next neural search ecosystem in open source.

Download files

Source Distribution

finetuner-0.0.3.dev64.tar.gz (42.2 kB)

Uploaded Source

File details

Details for the file finetuner-0.0.3.dev64.tar.gz.

File metadata

  • Download URL: finetuner-0.0.3.dev64.tar.gz
  • Upload date:
  • Size: 42.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.8.1 pkginfo/1.7.1 requests/2.22.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.10

File hashes

Hashes for finetuner-0.0.3.dev64.tar.gz

Algorithm    Hash digest
SHA256       5d394d1ef3d9d1efbcdea0de62cbe79317f69169b67449d19a50337792a2a0f7
MD5          de5ead9a82b647ac7b13738b40da1d1d
BLAKE2b-256  251bb9944142a3a3d3fabd652149b7a8dad0c0f51b8fcef2784d09153fbfbb94
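
A downloaded archive can be checked against the published SHA256 digest. The following is a minimal sketch using Python's hashlib. The helper names and the local file path are assumptions for illustration; only the digest itself comes from the listing above.

```python
import hashlib

# SHA256 digest published for finetuner-0.0.3.dev64.tar.gz.
EXPECTED_SHA256 = "5d394d1ef3d9d1efbcdea0de62cbe79317f69169b67449d19a50337792a2a0f7"


def sha256_of(path, chunk_size=1 << 16):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """True if the file at `path` has the expected SHA256 digest."""
    return sha256_of(path) == expected.lower()


# Intended use, assuming the archive was saved to the current directory:
#   matches_published_hash("finetuner-0.0.3.dev64.tar.gz")
```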
