Task-oriented finetuning for better embeddings on neural search.
Fine-tuning is an effective way to improve performance on neural search tasks. However, setting it up is non-trivial for many deep learning engineers.
Finetuner makes fine-tuning easier, faster and more performant by streamlining the workflow and handling all the complexity and infrastructure in the cloud. With Finetuner, you can easily uplift pre-trained models to be more performant and production-ready.
📈 Performance promise: uplift pretrained models and deliver SOTA performance on domain-specific neural search applications.
🔱 Simple yet powerful: easy access to 40+ mainstream losses, 10+ optimizers, layer pruning, weight freezing, dimensionality reduction, hard-negative mining, cross-modal models, and distributed training.
☁ All-in-cloud: instant training with our free GPU; manage runs, experiments and artifacts on Jina AI Cloud without worrying about provisioning resources, integration complexity and infrastructure.
Documentation
Benchmark
Model | Task | Metric | Pretrained | Finetuned | Delta
---|---|---|---|---|---
BERT | Quora Question Answering | mRR | 0.835 | 0.967 | 15.8%
BERT | Quora Question Answering | Recall | 0.915 | 0.963 | 5.3%
ResNet | Visual similarity search on TLL | mAP | 0.102 | 0.166 | 62.7%
ResNet | Visual similarity search on TLL | Recall | 0.235 | 0.372 | 58.3%
CLIP | Deep Fashion text-to-image search | mRR | 0.575 | 0.676 | 17.4%
CLIP | Deep Fashion text-to-image search | Recall | 0.473 | 0.564 | 19.2%
All metrics are evaluated on k@20 after training for 5 epochs using Adam optimizer with learning rates of 1e-7 for CLIP and 1e-5 for the other models.
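The Delta column is consistent with the relative improvement of the finetuned metric over the pretrained baseline, i.e. (finetuned − pretrained) / pretrained. A quick sanity check against two rows of the table (small discrepancies in other rows are likely rounding in the reported metrics):

```python
def relative_delta(pretrained: float, finetuned: float) -> float:
    # Relative improvement over the pretrained baseline, in percent.
    return round((finetuned - pretrained) / pretrained * 100, 1)

print(relative_delta(0.835, 0.967))  # BERT mRR on Quora -> 15.8
print(relative_delta(0.102, 0.166))  # ResNet mAP on TLL -> 62.7
```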
Install
Make sure you have Python 3.7+ installed. Finetuner can be installed via pip by executing:
pip install -U finetuner
If you want to encode docarray.DocumentArray objects with the finetuner.encode function, you need to install "finetuner[full]". In this case, some extra dependencies are installed which are necessary for inference, e.g., torch, torchvision, and open-clip:
pip install "finetuner[full]"
Since 0.5.0, Finetuner computing is hosted on Jina AI Cloud. The last local version is 0.4.1; you can install it via pip or check out git tags/releases here.
Get Started
The following code snippet describes how to fine-tune ResNet50 on the Totally Looks Like dataset. It can be run as-is:
import finetuner
from finetuner.callback import EvaluationCallback
finetuner.login() # use finetuner.notebook_login() in Jupyter notebook/Google Colab
run = finetuner.fit(
model='resnet50',
run_name='resnet50-tll-run',
train_data='tll-train-da',
callbacks=[
EvaluationCallback(
query_data='tll-test-query-da',
index_data='tll-test-index-da',
)
],
)
Fine-tuning might take 5 minutes to finish. You can later reconnect to your run with:
import finetuner
finetuner.login() # use finetuner.notebook_login() in Jupyter notebook or Google Colab
run = finetuner.get_run('resnet50-tll-run')
for msg in run.stream_logs():
print(msg)
run.save_artifact('resnet-tll')
Specifically, the code snippet describes the following steps:
- Login to Jina AI Cloud.
- Select the backbone model, the training data, and the query/index data for the evaluation callback.
- Start the cloud run.
- Monitor the status: check the status and logs of the run.
- Save model for further use and integration.
Finally, you can use the model to encode images:
import finetuner
from docarray import Document, DocumentArray
da = DocumentArray([Document(uri='~/Pictures/your_img.png')])
model = finetuner.get_model('resnet-tll')
finetuner.encode(model=model, data=da)
da.summary()
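After encoding, each document carries an embedding vector, and search reduces to comparing those vectors. As a minimal illustration of the idea (plain Python, not part of the Finetuner API; the toy 3-dimensional vectors stand in for real model outputs):

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k(query, index, k=2):
    # Rank indexed embeddings by similarity to the query embedding.
    scored = sorted(
        enumerate(index),
        key=lambda iv: cosine_similarity(query, iv[1]),
        reverse=True,
    )
    return [i for i, _ in scored[:k]]

# Toy "embeddings" standing in for encoded documents.
index = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]
query = [1.0, 0.0, 0.0]
print(top_k(query, index, k=2))  # indices of the two closest vectors
```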
Next steps
- Take the walkthrough and submit your first fine-tuning job.
- Try out different search tasks.
Intrigued? That's only scratching the surface of what Finetuner is capable of. Read our docs to learn more.
Support
- Use Discussions to talk about your use cases, questions, and support queries.
- Join our Slack community and chat with other Jina AI community members about ideas.
- Join our Engineering All Hands meet-up to discuss your use case and learn about new Jina AI features.
- When? The second Tuesday of every month
- Where? Zoom (see our public events calendar/.ical) and live stream on YouTube
- Subscribe to the latest video tutorials on our YouTube channel
Join Us
Finetuner is backed by Jina AI and licensed under Apache-2.0. We are actively hiring AI engineers and solution engineers to build the next neural search ecosystem in open source.