Task-oriented finetuning for better embeddings on neural search
Fine-tuning is an effective way to improve performance on neural search tasks. However, it is non-trivial for many deep learning engineers.

Finetuner makes fine-tuning easier, faster, and more performant by streamlining the workflow and handling all complexity and infrastructure in the cloud. With Finetuner, you can easily uplift pre-trained models to be more performant and production-ready.
📈 Performance promise: uplift pretrained model and deliver SOTA performance on domain-specific neural search applications.
🔱 Simple yet powerful: easy access to 40+ mainstream losses, 10+ optimisers, layer pruning, weights freezing, dimensionality reduction, hard-negative mining, cross-modal model, distributed training.
☁ All-in-cloud: instant training with our free GPU (Apply here for free!); manage runs, experiments and artifacts on Jina Cloud without worrying about provisioning resources, integration complexity and infrastructure.
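To illustrate one of the losses mentioned above: a triplet margin loss pulls an anchor example towards a positive (similar) example and pushes it away from a negative (dissimilar) one. The following is a framework-agnostic sketch of the formula only, not Finetuner's actual implementation; the function names are hypothetical:

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """max(0, d(anchor, positive) - d(anchor, negative) + margin)."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

# A well-separated triplet incurs zero loss:
print(triplet_margin_loss([0, 0], [0.1, 0], [5, 5]))  # 0.0
```

In practice a framework like PyTorch computes this loss over batches of embeddings and backpropagates through the model; the sketch only shows the per-triplet arithmetic.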
Documentation
Benchmark
| Model | Task | Metric | Pretrained | Finetuned | Delta |
|---|---|---|---|---|---|
| BERT | Quora Question Answering | mRR | 0.835 | 0.967 | :arrow_up_small: 15.8% |
| | | Recall | 0.915 | 0.963 | :arrow_up_small: 5.3% |
| ResNet | Visual similarity search on TLL | mAP | 0.102 | 0.166 | :arrow_up_small: 62.7% |
| | | Recall | 0.235 | 0.372 | :arrow_up_small: 58.3% |
| CLIP | Deep Fashion text-to-image search | mRR | 0.575 | 0.676 | :arrow_up_small: 17.4% |
| | | Recall | 0.473 | 0.564 | :arrow_up_small: 19.2% |
All metrics are evaluated at k=20 after training for 5 epochs with the Adam optimizer, using a learning rate of 1e-7 for CLIP and 1e-5 for the other models.
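mRR (mean reciprocal rank) and Recall as reported above are standard ranking metrics evaluated at a cutoff k. A minimal pure-Python sketch of how such per-query metrics are computed (illustrative only, not Finetuner's evaluation code; function names are hypothetical):

```python
def reciprocal_rank(ranked_ids, relevant_ids, k=20):
    """1 / rank of the first relevant hit within the top k, else 0."""
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0

def recall_at_k(ranked_ids, relevant_ids, k=20):
    """Fraction of the relevant documents retrieved in the top k."""
    hits = len(set(ranked_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)

ranked = ['b', 'a', 'c', 'd']   # results, best match first
relevant = {'a', 'd'}            # ground-truth matches for this query
print(reciprocal_rank(ranked, relevant))    # 0.5 (first hit at rank 2)
print(recall_at_k(ranked, relevant, k=3))   # 0.5 (only 'a' in top 3)
```

The table values are these per-query scores averaged over the whole test query set.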
Install
Make sure you have Python 3.7+ installed. Finetuner can be installed via pip by executing:

```shell
pip install -U finetuner
```
If you want to encode `docarray.DocumentArray` objects with the `finetuner.encode` function, you need to install `finetuner[full]`. This installs some extra dependencies that are necessary for inference, e.g. torch, torchvision, and open-clip:

```shell
pip install "finetuner[full]"
```
From 0.5.0, Finetuner computing is hosted on Jina Cloud. The last local version is 0.4.1; you can install it via pip or check out git tags/releases here.
Get Started
The following code snippet describes how to fine-tune ResNet50 on the Totally Looks Like dataset; it can be run as-is:
```python
import finetuner
from finetuner.callback import EvaluationCallback

finetuner.login()

run = finetuner.fit(
    model='resnet50',
    run_name='resnet50-tll-run',
    train_data='tll-train-da',
    callbacks=[
        EvaluationCallback(
            query_data='tll-test-query-da',
            index_data='tll-test-index-da',
        )
    ],
)
```
Fine-tuning might take 5 minutes to finish. You can later reconnect to your run with:
```python
import finetuner

finetuner.login()

run = finetuner.get_run('resnet50-tll-run')
for msg in run.stream_logs():
    print(msg)
run.save_artifact('resnet-tll')
```
Specifically, the code snippet describes the following steps:
- Login to Finetuner (Get free access here!)
- Select the backbone model, the training data, and the evaluation data for your evaluation callback.
- Start the cloud run.
- Monitor the status: check the status and logs of the run.
- Save model for further use and integration.
Finally, you can use the model to encode images:
```python
import finetuner
from docarray import Document, DocumentArray

da = DocumentArray([Document(uri='~/Pictures/your_img.png')])

model = finetuner.get_model('resnet-tll')
finetuner.encode(model=model, data=da)
da.summary()
```
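After encoding, each document carries an embedding vector, and neural search reduces to comparing those vectors. A framework-agnostic sketch of cosine-similarity ranking over plain NumPy arrays (the `cosine_rank` helper is hypothetical and not part of the Finetuner API):

```python
import numpy as np

def cosine_rank(query_emb, index_embs):
    """Rank index vectors by cosine similarity to the query, best first."""
    q = query_emb / np.linalg.norm(query_emb)
    X = index_embs / np.linalg.norm(index_embs, axis=1, keepdims=True)
    sims = X @ q                      # cosine similarity per index vector
    return np.argsort(-sims), sims    # indices sorted by descending similarity

# Toy 2-D "embeddings" standing in for real model outputs:
index = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
order, sims = cosine_rank(np.array([1.0, 0.1]), index)
print(order)  # [0 2 1] — best match first
```

In practice you would run `finetuner.encode` over both the query and index DocumentArrays and compare the resulting embeddings the same way, typically via an approximate nearest-neighbor index for large collections.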
Next steps
- Take the walkthrough and submit your first fine-tuning job.
- Try it on different search tasks, such as text-to-text, image-to-image, or text-to-image search.
Intrigued? That's only scratching the surface of what Finetuner is capable of. Read our docs to learn more.
Support
- Use Discussions to talk about your use cases, questions, and support queries.
- Join our Slack community and chat with other Jina AI community members about ideas.
- Join our Engineering All Hands meet-up to discuss your use case and learn about Jina AI's new features.
- When? The second Tuesday of every month
- Where? Zoom (see our public events calendar/.ical) and live stream on YouTube
- Subscribe to the latest video tutorials on our YouTube channel
Join Us
Finetuner is backed by Jina AI and licensed under Apache-2.0. We are actively hiring AI engineers and solution engineers to build the next neural search ecosystem in open source.