Task-oriented finetuning for better embeddings on neural search.
Fine-tuning is an effective way to improve performance on neural search tasks. However, it is non-trivial for many deep learning engineers.
Finetuner makes fine-tuning easier, faster and more performant by streamlining the workflow and handling all the complexity and infrastructure in the cloud. With Finetuner, you can easily uplift pre-trained models to be more performant and production-ready.
📈 Performance promise: uplift pretrained model and deliver SOTA performance on domain-specific neural search applications.
🔱 Simple yet powerful: easy access to 40+ mainstream losses, 10+ optimisers, layer pruning, weight freezing, dimensionality reduction, hard-negative mining, cross-modal models, and distributed training.
☁ All-in-cloud: instant training with our free GPU (Apply here for free!); manage runs, experiments and artifacts on Jina Cloud without worrying about provisioning resources, integration complexity and infrastructure.
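To give a feel for what one of the mainstream losses does, here is an illustrative pure-Python sketch of triplet margin loss (this is not Finetuner's implementation; it only shows the idea): an anchor embedding is pulled toward a positive example and pushed away from a negative one, up to a margin.

```python
def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """max(0, d(anchor, positive) - d(anchor, negative) + margin)."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

# A well-separated triplet incurs zero loss...
print(triplet_margin_loss([0.0, 0.0], [0.1, 0.0], [5.0, 0.0]))  # 0.0
# ...while a hard negative as close as the positive is penalised by the margin.
print(triplet_margin_loss([0.0, 0.0], [1.0, 0.0], [1.0, 0.0]))  # 1.0
```

Hard-negative mining, listed above, is about finding triplets of the second kind, since they contribute the most useful gradient signal.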
Documentation
Benchmark
| Model | Task | Metric | Pretrained | Finetuned | Delta |
|---|---|---|---|---|---|
| BERT | Quora Question Answering | mRR | 0.835 | 0.967 | :arrow_up_small: 15.8% |
| | | Recall | 0.915 | 0.963 | :arrow_up_small: 5.3% |
| ResNet | Visual similarity search on TLL | mAP | 0.102 | 0.166 | :arrow_up_small: 62.7% |
| | | Recall | 0.235 | 0.372 | :arrow_up_small: 58.3% |
| CLIP | Deep Fashion text-to-image search | mRR | 0.289 | 0.488 | :arrow_up_small: 69.9% |
| | | Recall | 0.109 | 0.346 | :arrow_up_small: 217.0% |
[*] All metrics are evaluated at k@20, after training for 5 epochs with the Adam optimizer at a learning rate of 1e-5.
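The Delta column is the relative improvement of the finetuned score over the pretrained baseline; the arithmetic behind it is simply:

```python
def relative_delta(pretrained, finetuned):
    """Relative improvement over the pretrained baseline, in percent."""
    return (finetuned - pretrained) / pretrained * 100

# BERT mRR on Quora Question Answering: 0.835 -> 0.967
print(round(relative_delta(0.835, 0.967), 1))  # 15.8
# ResNet mAP on TLL: 0.102 -> 0.166
print(round(relative_delta(0.102, 0.166), 1))  # 62.7
```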
Install
Make sure you have Python 3.7+ installed. Finetuner can be installed via pip by executing:
pip install -U finetuner
If you want to encode docarray.DocumentArray objects with the finetuner.encode function, you need to install "finetuner[full]". In this case, some extra dependencies are installed that are necessary for inference, e.g., torch, torchvision, and open clip:
pip install "finetuner[full]"
Since 0.5.0, Finetuner computing is hosted on Jina Cloud. The last local version is 0.4.1; you can install it via pip or check out git tags/releases here.
Get Started
The following code snippet describes how to fine-tune ResNet50 on the Totally Looks Like dataset. It can be run as-is:
import finetuner
from finetuner.callback import EvaluationCallback

finetuner.login()

run = finetuner.fit(
    model='resnet50',
    run_name='resnet50-tll-run',
    train_data='tll-train-da',
    callbacks=[
        EvaluationCallback(
            query_data='tll-test-query-da',
            index_data='tll-test-index-da',
        )
    ],
)
Fine-tuning might take 5 minutes to finish. You can later re-connect to your run with:
import finetuner

finetuner.login()

run = finetuner.get_run('resnet50-tll-run')
for msg in run.stream_logs():
    print(msg)

run.save_artifact('resnet-tll')
Specifically, the code snippet describes the following steps:
- Login to Finetuner (Get free access here!)
- Select the backbone model, the training data, and the evaluation data for your evaluation callback.
- Start the cloud run.
- Monitor the status: check the status and logs of the run.
- Save model for further use and integration.
Finally, you can use the model to encode images:
import finetuner
from docarray import Document, DocumentArray
da = DocumentArray([Document(uri='~/Pictures/your_img.png')])
model = finetuner.get_model('resnet-tll')
finetuner.encode(model=model, data=da)
da.summary()
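Once images are encoded, neural search boils down to comparing embeddings. As an illustration only (in practice you would use docarray's matching utilities or a vector index rather than this sketch), ranking an index by cosine similarity to a query embedding looks like:

```python
def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b)

def rank(query, index):
    """Return index ids sorted by descending cosine similarity to the query."""
    return sorted(index, key=lambda k: cosine(query, index[k]), reverse=True)

# Hypothetical embeddings, truncated to 3 dimensions for readability.
query_emb = [1.0, 0.0, 1.0]
index_embs = {
    'img_a': [0.9, 0.1, 1.1],  # near-duplicate of the query
    'img_b': [0.0, 1.0, 0.0],  # orthogonal, unrelated image
    'img_c': [1.0, 1.0, 1.0],  # partially similar
}
print(rank(query_emb, index_embs))  # ['img_a', 'img_c', 'img_b']
```

Fine-tuning shifts the embeddings so that such rankings better reflect task-specific similarity.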
Next steps
- Take the walkthrough and submit your first fine-tuning job.
- Try it on different search tasks.
Intrigued? That's only scratching the surface of what Finetuner is capable of. Read our docs to learn more.
Support
- Use Discussions to talk about your use cases, questions, and support queries.
- Join our Slack community and chat with other Jina AI community members about ideas.
- Join our Engineering All Hands meet-up to discuss your use case and learn about new Jina AI features.
- When? The second Tuesday of every month
- Where? Zoom (see our public events calendar/.ical) and live stream on YouTube
- Subscribe to the latest video tutorials on our YouTube channel
Join Us
Finetuner is backed by Jina AI and licensed under Apache-2.0. We are actively hiring AI engineers and solution engineers to build the next neural search ecosystem in open source.