emb3d.co command-line interface to work with embeddings.

Project description

emb3d

emb3d is a command-line utility that lets you generate embeddings using models from OpenAI, Cohere and HuggingFace.

Installation

pip install --upgrade emb3d

Quick Start ⚡️

Install the library

pip install -U emb3d

Prepare your input file

emb3d expects a JSONL file as input. Each line of the file should be a JSON object with a text key. Example input file:

{"text": "I love my dog"}
{"text": "I love my cat"}
{"text": "I love my rabbit"}

Your files can optionally include other fields, such as IDs or categorical labels; they are saved as-is in the final output file.
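If your texts live in Python rather than in a file already, the input format described above can be produced with the standard library. This is a minimal sketch, not part of emb3d: the helper names `write_jsonl` and `read_jsonl` and the extra `id` and `label` fields are made up for illustration. The only requirement taken from the text above is one JSON object per line with a `text` key.

```python
import json
import tempfile
from pathlib import Path


def write_jsonl(records, path):
    """Write one JSON object per line; every record must carry a "text" key."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            if "text" not in record:
                raise ValueError(f"record is missing the 'text' key: {record!r}")
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Read the file back as a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# "id" and "label" are hypothetical extra fields; only "text" is required.
records = [
    {"id": 1, "label": "pets", "text": "I love my dog"},
    {"id": 2, "label": "pets", "text": "I love my cat"},
    {"id": 3, "label": "pets", "text": "I love my rabbit"},
]

# A temporary directory keeps the sketch side-effect free; in practice you
# would write inputs.jsonl into your working directory.
input_path = Path(tempfile.mkdtemp()) / "inputs.jsonl"
write_jsonl(records, input_path)
print(f"wrote {len(read_jsonl(input_path))} records to {input_path}")
```

Rejecting records without a `text` key up front is a design choice of this sketch, so that a malformed record is caught before any embedding request is made.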

Compute embeddings

The default model is OpenAI's text-embedding-ada-002. You can change the model by passing the --model-id flag.

emb3d compute inputs.jsonl

You will need to have OPENAI_API_KEY set in your environment. You can also pass it as a flag (--api-key) or set it in a config file.

emb3d config set openai_token YOUR-OPENAI-API-KEY
emb3d compute inputs.jsonl
emb3d compute inputs.jsonl --model-id embed-english-v2.0 --output-file cohere-embeddings.jsonl

For Cohere models, you will need to have COHERE_API_KEY set in your environment. You can also pass it as a flag (--api-key) or set it in a config file with: emb3d config set cohere_token YOUR-COHERE-API-KEY.
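When driving emb3d from a script, it can help to assemble the command and check for the API key before launching it. The sketch below is an assumption-laden illustration, not part of emb3d: `build_compute_command` and `missing_key` are hypothetical helpers, and the only CLI details used are the `compute` subcommand and the `--model-id` and `--output-file` flags shown above.

```python
import os
import shlex


def build_compute_command(input_file, model_id=None, output_file=None):
    """Assemble an `emb3d compute` invocation as an argument list."""
    cmd = ["emb3d", "compute", input_file]
    if model_id is not None:
        cmd += ["--model-id", model_id]
    if output_file is not None:
        cmd += ["--output-file", output_file]
    return cmd


def missing_key(env_var, environ=os.environ):
    """Return True when the environment variable is unset or empty."""
    return not environ.get(env_var)


cmd = build_compute_command(
    "inputs.jsonl",
    model_id="embed-english-v2.0",
    output_file="cohere-embeddings.jsonl",
)
print(shlex.join(cmd))

if missing_key("COHERE_API_KEY"):
    print("COHERE_API_KEY is not set in the environment.")

# To actually run the command (requires emb3d to be installed):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the command as a list rather than a single string avoids shell quoting issues if a file name contains spaces.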

Visualize your embeddings 💥

The last step is to visualize your computed embeddings. This will open a browser window with a visualization of your last computed embeddings.

emb3d visualize run-2020-embeddings.jsonl

Profit 💰

Usage

 Usage: emb3d [OPTIONS] INPUT_FILE COMMAND [ARGS]...

 Generate embeddings for fun and profit.

╭─ Arguments ───────────────────────────────────────────────────────────────────────────────╮
│ *    input_file      PATH  Path to the input file. [default: None] [required]             │
╰───────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Options ─────────────────────────────────────────────────────────────────────────────────╮
│ --model-id                                     TEXT     ID of the embedding model.        │
│                                                         Default is                        │
│                                                         `text-embedding-ada-002`.         │
│                                                         [default: None]                   │
│ --output-file              -out,-o             PATH     Path to the output file. If not   │
│                                                         provided, a default path will be  │
│                                                         suggested.                        │
│                                                         [default: None]                   │
│ --api-key                                      TEXT     API key for the backend. If not   │
│                                                         provided, it will be prompted or  │
│                                                         fetched from environment          │
│                                                         variables.                        │
│                                                         [default: None]                   │
│ --remote                            --local             Choose whether to do inference    │
│                                                         locally or with an API token.     │
│                                                         This choice is available for      │
│                                                         sentence transformer and hugging  │
│                                                         face models. If a model cannot be │
│                                                         run locally (ex: OpenAI models),  │
│                                                         this flag is ignored.             │
│                                                         [default: remote]                 │
│ --max-concurrent-requests                      INTEGER  (Remote Execution) Maximum number │
│                                                         of concurrent requests for the    │
│                                                         embedding task. Default is 1000.  │
│                                                         [default: 1000]                   │
│ --help                                                  Show this message and exit.       │
╰───────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ────────────────────────────────────────────────────────────────────────────────╮
│ config           Get or set a configuration value.                                        │
╰───────────────────────────────────────────────────────────────────────────────────────────╯
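To illustrate what a cap such as --max-concurrent-requests means in practice, here is a generic sketch of bounded concurrency using `asyncio.Semaphore`. It is not emb3d's implementation, which is not described here: `embed_all` and `fake_embed` are hypothetical names, and `fake_embed` stands in for a real API call.

```python
import asyncio


async def embed_all(texts, embed_one, max_concurrent_requests=1000):
    """Run embed_one over all texts with at most N calls in flight at once."""
    semaphore = asyncio.Semaphore(max_concurrent_requests)

    async def limited(text):
        # Each task waits here until one of the N slots is free.
        async with semaphore:
            return await embed_one(text)

    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(limited(t) for t in texts))


async def fake_embed(text):
    """Hypothetical stand-in for a remote embedding request."""
    await asyncio.sleep(0.01)
    return [float(len(text))]


results = asyncio.run(
    embed_all(["I love my dog", "I love my cat"], fake_embed, max_concurrent_requests=1)
)
print(results)
```

A lower cap trades throughput for a smaller chance of hitting a provider's rate limit; with a cap of 1 the requests run one after another.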

Need help? 🙋

Join our Discord server and let's talk!

Download files

Source Distribution

emb3d-0.1.106.tar.gz (15.6 kB)

Uploaded Source

Built Distribution

emb3d-0.1.106-py3-none-any.whl (18.7 kB)

Uploaded Python 3
