
voyage plugin for embcli


embcli-voyage


voyage plugin for embcli, a command-line interface for embeddings.

Reference

Installation

pip install embcli-voyage

Quick Start

You need a Voyage AI API key to use this plugin. Set the VOYAGE_API_KEY environment variable in a .env file in the current directory, or pass the path to an env file with the -e option.

cat .env
VOYAGE_API_KEY=<YOUR_VOYAGE_KEY>
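Before running any emb command, it can help to confirm that the .env file really contains a usable key. The following is a minimal sketch in plain Python; `read_env_file` and `has_voyage_key` are hypothetical helper names, and the parsing rules (KEY=VALUE lines, `#` comments) are an assumption about the file format, not a description of how embcli itself loads the file.

```python
from pathlib import Path


def read_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def has_voyage_key(path=".env"):
    """Return True if the env file defines a non-placeholder VOYAGE_API_KEY."""
    env_path = Path(path)
    if not env_path.is_file():
        return False
    value = read_env_file(env_path).get("VOYAGE_API_KEY", "")
    # The README placeholder "<YOUR_VOYAGE_KEY>" does not count as a real key.
    return bool(value) and not value.startswith("<")


if __name__ == "__main__":
    print("VOYAGE_API_KEY found" if has_voyage_key() else "VOYAGE_API_KEY missing")
```

Running the script in the directory that holds the .env file prints whether a key is present, without ever printing the key itself.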

Try out the Embedding Models

# show general usage of emb command.
emb --help

# list all available models.
emb models
VoyageEmbeddingModel
    Vendor: voyage
    Models:
    * voyage-3-large (aliases: )
    * voyage-3 (aliases: )
    * voyage-3-lite (aliases: )
    * voyage-code-3 (aliases: )
    * voyage-finance-2 (aliases: )
    * voyage-law-2 (aliases: )
    * voyage-code-2 (aliases: )
    Model Options:
    * input_type (str) - Type of the input text. Options: 'None', 'query', 'document' Defaults to 'None'.
    * truncation (bool) - Whether to truncate the input texts to fit within the context length. Defaults to True.
    * output_dimension (int) - The number of dimensions for resulting output embeddings.
VoyageMultimodalEmbeddingModel
    Vendor: voyage
    Models:
    * voyage-multimodal-3 (aliases: voyage-mm-3)
    Model Options:
    * input_type (str) - Type of the input text. Options: 'None', 'query', 'document' Defaults to 'None'.
    * truncation (bool) - Whether to truncate the input texts to fit within the context length. Defaults to True.

# get an embedding for an input text with the voyage-3 model.
emb embed -m voyage-3 "Embeddings are essential for semantic search and RAG apps."

# get an embedding for an input text with the voyage-3 model, setting input_type=query.
emb embed -m voyage-3 "Embeddings are essential for semantic search and RAG apps." -o input_type query

# get an embedding for an image with the voyage-multimodal-3 model (alias: voyage-mm-3).
# this assumes an image file named `gingercat.jpeg` exists in the current directory.
emb embed -m voyage-mm-3 --image gingercat.jpeg

# calculate the similarity score between two texts with the voyage-3 model. the default metric is cosine similarity.
emb simscore -m voyage-3 "The cat drifts toward sleep." "Sleep dances in the cat's eyes."
0.7499687978192594
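The score above is a cosine similarity between the two embedding vectors. As a rough illustration of the metric itself, here is a minimal sketch in plain Python; `cosine_similarity` is a hypothetical helper written for this example, not a function exported by embcli, and the short vectors stand in for real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Toy 3-dimensional vectors standing in for real embeddings.
    print(cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.5]))
```

Values close to 1 mean the vectors point in nearly the same direction, and values near 0 mean they are close to orthogonal.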

Document Indexing and Search

You can use the emb command to index documents and run semantic search over them. emb uses Chroma as the default vector database.

# index example documents in the current directory.
emb ingest-sample -m voyage-3 -c catcafe --corpus cat-names-en

# or, give the path to your own documents.
# the documents should be in a comma-separated CSV file with two columns: id and text.
emb ingest -m voyage-3 -c catcafe -f <path-to-your-documents>

# search for a query in the indexed documents.
emb search -m voyage-3 -c catcafe -q "Who's the naughtiest one?"
Found 5 results:
Score: 0.41797321996324216, Document ID: 28, Text: Loki: Loki is a mischievous and clever cat, always finding new ways to entertain himself, sometimes at his humans' expense. He is a master of stealth and surprise attacks on toys. Despite his playful trickery, Loki is incredibly charming and affectionate, easily winning hearts with his roguish appeal.
Score: 0.41704222853043194, Document ID: 46, Text: Bandit: Bandit is a mischievous cat, often with mask-like markings, always on the lookout for his next playful heist of a toy or treat. He is clever and energetic, loving to chase and pounce. Despite his roguish name, Bandit is a loving companion who enjoys a good cuddle after his adventures.
Score: 0.4138587234705962, Document ID: 3, Text: Pippin (Pip): Pippin, or Pip, is a compact dynamo, brimming with mischievous charm and boundless curiosity. He’s an intrepid explorer, always finding new hideouts or investigating forbidden territories with a twinkle in his eye. Quite vocal, Pip will happily chat about his day, his playful antics making him an endearing little rascal.
Score: 0.4102669442076908, Document ID: 66, Text: Vinnie: Vinnie is a cool and confident cat, often a street-smart tabby with a lot of personality. He is resourceful and independent but also enjoys affection from his trusted humans. Vinnie is a survivor with a soft side, offering gruff purrs and head-butts, a charming rogue with a heart of gold.
Score: 0.407675485063674, Document ID: 94, Text: Xena: Xena is a warrior princess of a cat, bold, adventurous, and fiercely protective of her territory and toys. She is highly energetic and loves vigorous play, often surprising with her agility. Despite her tough exterior, Xena is deeply loyal and affectionate to her trusted human companions.

# multilingual search
emb search -m voyage-3 -c catcafe -q "一番のいたずら者は誰?"
Found 5 results:
Score: 0.3996864870685445, Document ID: 5, Text: Cosmo: Cosmo, with his wide, knowing eyes, seems to ponder the universe's mysteries. He’s an endearingly quirky character, often found investigating unusual objects or engaging in peculiar solo games. Highly intelligent and observant, Cosmo loves exploring new spaces, and his quiet, thoughtful nature makes him a fascinating and unique companion.
Score: 0.39843294400750984, Document ID: 83, Text: Monty: Monty is a charming and slightly eccentric cat, full of character and amusing quirks. He might have a favorite unusual napping spot or a peculiar way of playing. Monty is very entertaining and loves attention, often performing his unique antics for his amused human audience, a delightful and unique friend.
Score: 0.39798067438127693, Document ID: 75, Text: Elwood: Elwood is an endearingly quirky and laid-back cat, often found in amusing sleeping positions. He is friendly and easygoing, enjoying simple pleasures like a good meal and a sunny spot. Elwood is a comforting presence, always ready with a soft purr and a gentle nuzzle, a truly chill companion.
Score: 0.39575622315523173, Document ID: 24, Text: Gizmo: Gizmo is an endearingly quirky cat, full of curious habits and playful antics. He might bat at imaginary foes or carry his favorite small toy everywhere. Gizmo is incredibly entertaining and loves attention, often performing his unique tricks for his amused human audience, always bringing a smile.
Score: 0.3932879962838197, Document ID: 50, Text: Dexter: Dexter is a clever and sometimes quirky cat, always up to something interesting. He might have a fascination with running water or a particular toy he carries everywhere. Dexter is highly intelligent and enjoys interactive play, keeping his humans entertained with his unique personality and amusing antics, a truly engaging companion.
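The ingest command above expects a comma-separated file with id and text columns. One way to produce such a file is Python's csv module, which quotes any text that itself contains commas. This is a sketch under assumptions: `write_documents_csv` is a hypothetical helper, the file name `my-cats.csv` is made up, and writing a header row named `id,text` is an assumption, since the README does not say whether a header is required.

```python
import csv


def write_documents_csv(path, documents):
    """Write (id, text) pairs to a comma-separated file with an id,text header."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["id", "text"])
        writer.writeheader()  # assumption: the ingester accepts a header row
        for doc_id, text in documents:
            writer.writerow({"id": doc_id, "text": text})


if __name__ == "__main__":
    docs = [
        (1, "Loki: a mischievous and clever cat, always finding new games."),
        (2, "Elwood: a laid-back cat who enjoys a sunny spot."),
    ]
    write_documents_csv("my-cats.csv", docs)
```

The resulting file can then be passed to `emb ingest` with the -f option shown earlier.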

Development

See the main README for general development instructions.

Run Tests

You need a Voyage API key to run the tests for the embcli-voyage package. Pass it, together with RUN_VOYAGE_TESTS=1, as environment variables when invoking pytest:

VOYAGE_API_KEY=<YOUR_VOYAGE_KEY> RUN_VOYAGE_TESTS=1 uv run --package embcli-voyage pytest packages/embcli-voyage/tests/
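The RUN_VOYAGE_TESTS=1 flag suggests that tests hitting the real API are opt-in. How the package's own tests check the flag is not shown here, so the following is only a generic sketch of that pattern using the standard library's unittest; `voyage_tests_enabled` and `ExampleVoyageTest` are hypothetical names, and the exact conditions are assumptions.

```python
import os
import unittest


def voyage_tests_enabled(env=None):
    """True only when the opt-in flag is "1" and an API key is present."""
    env = os.environ if env is None else env
    return env.get("RUN_VOYAGE_TESTS") == "1" and bool(env.get("VOYAGE_API_KEY"))


@unittest.skipUnless(
    voyage_tests_enabled(),
    "set RUN_VOYAGE_TESTS=1 and VOYAGE_API_KEY to run tests that call the API",
)
class ExampleVoyageTest(unittest.TestCase):
    def test_environment_is_configured(self):
        # Placeholder body: a real test would call the API here.
        self.assertTrue(voyage_tests_enabled())
```

Gating this way keeps a plain test run from failing, or spending API credits, on machines without a key.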

Run Linter and Formatter

uv run ruff check --fix packages/embcli-voyage
uv run ruff format packages/embcli-voyage

Run Type Checker

uv run --package embcli-voyage pyright packages/embcli-voyage

Build

uv build --package embcli-voyage

License

Apache License 2.0
