
Pinecone client and SDK


Pinecone Python Client

The official Pinecone Python client.

For more information, see the docs at https://www.pinecone.io/docs/

Documentation

Example code

Many of the brief examples in this README use very small vectors to keep the documentation concise, but most real-world usage involves much larger embedding vectors. For more realistic examples of how this client can be used, explore the Jupyter notebooks in the examples repository.

Prerequisites

The Pinecone Python client is compatible with Python 3.8 and greater.

Installation

There are two flavors of the Pinecone Python client. The default client, installed from PyPI as pinecone-client, has a minimal set of dependencies and interacts with Pinecone via HTTP requests.

If you are aiming to maximize performance, you can install additional gRPC dependencies to access an alternate client implementation that relies on gRPC for data operations. See the guide on tuning performance.

Installing with pip

# Install the latest version
pip3 install pinecone-client

# Install the latest version, with extra grpc dependencies
pip3 install "pinecone-client[grpc]"

# Install a specific version
pip3 install pinecone-client==3.0.0

# Install a specific version, with grpc extras
pip3 install "pinecone-client[grpc]==3.0.0"

Installing with poetry

# Install the latest version
poetry add pinecone-client

# Install the latest version, with grpc extras
poetry add pinecone-client --extras grpc

# Install a specific version
poetry add pinecone-client==3.0.0

# Install a specific version, with grpc extras
poetry add pinecone-client==3.0.0 --extras grpc
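
After installing, it can be useful to confirm which version was actually installed, especially when pinning a specific release. The following is a minimal sketch using only the standard library; installed_version is a hypothetical helper name, and the distribution name pinecone-client is taken from the install commands above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str = "pinecone-client") -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string, or None if the package is not installed.
print(installed_version())
```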

Usage

Initializing the client

Before you can use the Pinecone SDK, you must sign up for an account and find your API key in the Pinecone console dashboard at https://app.pinecone.io.

Using environment variables

The Pinecone class is your main entry point into the Pinecone Python SDK. If you have set your API key in the PINECONE_API_KEY environment variable, you can instantiate the client with no other arguments.

from pinecone import Pinecone

pc = Pinecone() # This reads the PINECONE_API_KEY env var

Using configuration keyword params

If you prefer to pass configuration in code, for example if you have a complex application that needs to interact with multiple different Pinecone projects, the constructor accepts a keyword argument for api_key.

If you pass configuration in this way, you can have full control over what name to use for the environment variable, sidestepping any issues that would result from two different client instances both needing to read the same PINECONE_API_KEY variable that the client implicitly checks for.

Configuration passed with keyword arguments takes precedence over environment variables.

import os
from pinecone import Pinecone

pc = Pinecone(api_key=os.environ.get('CUSTOM_VAR'))
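
For the multi-project case described above, one possible approach is to read each project's key from its own environment variable and fail early if any is missing. This is only a sketch: load_api_keys is a hypothetical helper, and the variable names SEARCH_PINECONE_KEY and RECS_PINECONE_KEY are made up for illustration.

```python
import os
from typing import Dict, Mapping


def load_api_keys(var_names: Mapping[str, str]) -> Dict[str, str]:
    """Map each project label to the API key stored in its environment variable."""
    keys = {}
    for label, var_name in var_names.items():
        value = os.environ.get(var_name)
        if not value:
            # Fail early with a clear message instead of passing None to the client.
            raise RuntimeError(f"Environment variable {var_name} is not set")
        keys[label] = value
    return keys


# Hypothetical variable names; use whatever names your deployment defines.
# keys = load_api_keys({"search": "SEARCH_PINECONE_KEY", "recs": "RECS_PINECONE_KEY"})
# search_pc = Pinecone(api_key=keys["search"])
# recs_pc = Pinecone(api_key=keys["recs"])
```

Each client instance then carries its own key, so neither depends on the shared PINECONE_API_KEY variable.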

Proxy configuration

If your network setup requires you to interact with Pinecone via a proxy, you will need to pass additional configuration using optional keyword parameters. These optional parameters are forwarded to urllib3, which is the underlying library currently used by the Pinecone client to make HTTP requests. You may find it helpful to refer to the urllib3 documentation on working with proxies while troubleshooting these settings.

Here is a basic example:

from pinecone import Pinecone

pc = Pinecone(
    api_key='YOUR_API_KEY',
    proxy_url='https://your-proxy.com'
)

pc.list_indexes()

If your proxy requires authentication, you can pass those values in a header dictionary using the proxy_headers parameter.

from pinecone import Pinecone
from urllib3 import make_headers

pc = Pinecone(
    api_key='YOUR_API_KEY',
    proxy_url='https://your-proxy.com',
    proxy_headers=make_headers(proxy_basic_auth='username:password')
)

pc.list_indexes()
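
To make it clearer what kind of value proxy_headers carries, the sketch below builds a Basic proxy authentication header using only the standard library. The helper basic_proxy_auth_header is hypothetical; the assumption here is that make_headers(proxy_basic_auth=...) yields an equivalent dictionary, so prefer make_headers in real code and treat this as an illustration of the header's shape.

```python
import base64


def basic_proxy_auth_header(credentials: str) -> dict:
    """Build a Proxy-Authorization header from a 'username:password' string."""
    token = base64.b64encode(credentials.encode("latin-1")).decode("ascii")
    return {"proxy-authorization": f"Basic {token}"}


print(basic_proxy_auth_header("username:password"))
# {'proxy-authorization': 'Basic dXNlcm5hbWU6cGFzc3dvcmQ='}
```

Note that Basic credentials are only base64-encoded, not encrypted, so they should be sent over a connection you trust.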

Using proxies with self-signed certificates

By default the Pinecone Python client will perform SSL certificate verification using the CA bundle maintained by Mozilla in the certifi package.

If your proxy server is using a self-signed certificate, you will need to pass the path to the certificate in PEM format using the ssl_ca_certs parameter.

from pinecone import Pinecone
from urllib3 import make_headers

pc = Pinecone(
    api_key="YOUR_API_KEY",
    proxy_url='https://your-proxy.com',
    proxy_headers=make_headers(proxy_basic_auth='username:password'),
    ssl_ca_certs='path/to/cert-bundle.pem'
)

pc.list_indexes()

Disabling SSL verification

If you would like to disable SSL verification, you can pass the ssl_verify parameter with a value of False. We do not recommend going to production with SSL verification disabled.

from pinecone import Pinecone
from urllib3 import make_headers

pc = Pinecone(
    api_key='YOUR_API_KEY',
    proxy_url='https://your-proxy.com',
    proxy_headers=make_headers(proxy_basic_auth='username:password'),
    ssl_ca_certs='path/to/cert-bundle.pem',
    ssl_verify=False
)

pc.list_indexes()

Working with gRPC (for improved performance)

If you've followed the instructions above to install with the optional grpc extras, you can unlock some performance improvements by working with an alternative version of the client imported from the pinecone.grpc subpackage.

import os
from pinecone.grpc import PineconeGRPC

pc = PineconeGRPC(api_key=os.environ.get('PINECONE_API_KEY'))

# From here on, everything is identical to the REST-based client.
index = pc.Index(host='my-index-8833ca1.svc.us-east1-gcp.pinecone.io')

index.upsert(vectors=[])
index.query(vector=[...], top_k=10)

Indexes

Create Index

Create a serverless index

Warning: Serverless indexes are in public preview and are available only on AWS in the us-west-2 region. Check the current limitations and test thoroughly before using them in production.

from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')
pc.create_index(
    name='my-index',
    dimension=1536,
    metric='euclidean',
    spec=ServerlessSpec(
        cloud='aws',
        region='us-west-2'
    )
)

Create a pod index

The following example creates an index without a metadata configuration. By default, Pinecone indexes all metadata.

from pinecone import Pinecone, PodSpec

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')
pc.create_index(
    name="example-index",
    dimension=1536,
    metric="cosine",
    spec=PodSpec(
        environment='us-west-2',
        pod_type='p1.x1'
    )
)

Pod indexes support many optional configuration fields. For example, the following code creates an index that only indexes the "color" metadata field. Queries against this index cannot filter based on any other metadata field.

from pinecone import Pinecone, PodSpec

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')

metadata_config = {
    "indexed": ["color"]
}

pc.create_index(
    "example-index-2",
    dimension=1536,
    spec=PodSpec(
        environment='us-west-2',
        pod_type='p1.x1',
        metadata_config=metadata_config
    )
)

List indexes

The following example returns all indexes in your project.

from pinecone import Pinecone

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')
for index in pc.list_indexes():
    print(index['name'])
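
Listing indexes is also handy for setup scripts that may run more than once and should not try to create an index that already exists. The sketch below relies only on what the examples above show: that iterating over pc.list_indexes() yields items with a 'name' key, and that create_index accepts name plus further keyword arguments. ensure_index is a hypothetical helper, not part of the client.

```python
def ensure_index(pc, name: str, **create_kwargs) -> bool:
    """Create the index `name` unless it is already listed.

    Returns True if create_index was called, False if the index already existed.
    """
    existing = {index['name'] for index in pc.list_indexes()}
    if name in existing:
        return False
    pc.create_index(name=name, **create_kwargs)
    return True


# Usage with a real client (commented out so the sketch runs offline):
# from pinecone import Pinecone, PodSpec
# pc = Pinecone(api_key='<<PINECONE_API_KEY>>')
# ensure_index(pc, "example-index", dimension=1536, metric="cosine",
#              spec=PodSpec(environment='us-west-2', pod_type='p1.x1'))
```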

Describe index

The following example returns information about the index example-index.

from pinecone import Pinecone

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')

index_description = pc.describe_index("example-index")

Delete an index

The following example deletes the index named example-index.

from pinecone import Pinecone

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')

pc.delete_index("example-index")

Scale replicas

The following example changes the number of replicas for example-index.

from pinecone import Pinecone

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')

new_number_of_replicas = 4
pc.configure_index("example-index", replicas=new_number_of_replicas)

Describe index statistics

The following example returns statistics about the index example-index.

import os
from pinecone import Pinecone

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')
index = pc.Index(host=os.environ.get('INDEX_HOST'))

index_stats_response = index.describe_index_stats()

Upsert vectors

The following example upserts vectors to example-index.

import os
from pinecone import Pinecone

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')
index = pc.Index(host=os.environ.get('INDEX_HOST'))

upsert_response = index.upsert(
    vectors=[
        ("vec1", [0.1, 0.2, 0.3, 0.4], {"genre": "drama"}),
        ("vec2", [0.2, 0.3, 0.4, 0.5], {"genre": "action"}),
    ],
    namespace="example-namespace"
)
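
When there are many vectors to upsert, a common pattern is to send them in batches rather than in one large request. The following is a minimal sketch of that idea; chunked is a hypothetical helper, and the batch size of 100 is an arbitrary value chosen for illustration, not a documented limit.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


vectors = [(f"vec{i}", [0.1, 0.2, 0.3, 0.4], {"genre": "drama"}) for i in range(250)]
batch_sizes = [len(batch) for batch in chunked(vectors, 100)]
print(batch_sizes)  # [100, 100, 50]

# With a real index handle, each batch would be sent separately:
# for batch in chunked(vectors, 100):
#     index.upsert(vectors=batch, namespace="example-namespace")
```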

Query an index

The following example queries the index example-index with metadata filtering.

import os
from pinecone import Pinecone

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')

# Find your index host by calling describe_index
# or through the Pinecone web console
index = pc.Index(host=os.environ.get('INDEX_HOST'))

query_response = index.query(
    namespace="example-namespace",
    vector=[0.1, 0.2, 0.3, 0.4],
    top_k=10,
    include_values=True,
    include_metadata=True,
    filter={
        "genre": {"$in": ["comedy", "documentary", "drama"]}
    }
)
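
As a mental model for the filter in this query, the sketch below applies the same "$in" condition locally to the metadata of the two vectors from the upsert example. It only illustrates the semantics (a record matches when its field value is one of the listed values); matches_in_filter is a hypothetical stand-in and says nothing about how Pinecone evaluates filters internally.

```python
def matches_in_filter(metadata: dict, field: str, allowed: list) -> bool:
    """Local stand-in for a {field: {"$in": allowed}} filter on one record."""
    return metadata.get(field) in allowed


records = {
    "vec1": {"genre": "drama"},
    "vec2": {"genre": "action"},
}
allowed_genres = ["comedy", "documentary", "drama"]
eligible = [vector_id for vector_id, metadata in records.items()
            if matches_in_filter(metadata, "genre", allowed_genres)]
print(eligible)  # ['vec1']
```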

Delete vectors

The following example deletes vectors by ID.

import os
from pinecone import Pinecone

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')

# Find your index host by calling describe_index
# or through the Pinecone web console
index = pc.Index(host=os.environ.get('INDEX_HOST'))

delete_response = index.delete(ids=["vec1", "vec2"], namespace="example-namespace")

Fetch vectors

The following example fetches vectors by ID.

import os
from pinecone import Pinecone

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')

# Find your index host by calling describe_index
# or through the Pinecone web console
index = pc.Index(host=os.environ.get('INDEX_HOST'))

fetch_response = index.fetch(ids=["vec1", "vec2"], namespace="example-namespace")

Update vectors

The following example updates vectors by ID.

import os
from pinecone import Pinecone

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')

# Find your index host by calling describe_index
# or through the Pinecone web console
index = pc.Index(host=os.environ.get('INDEX_HOST'))

update_response = index.update(
    id="vec1",
    values=[0.1, 0.2, 0.3, 0.4],
    set_metadata={"genre": "drama"},
    namespace="example-namespace"
)

List vectors

The list and list_paginated methods can be used to list vector ids matching a particular id prefix. With clever assignment of vector ids, this can be used to help model hierarchical relationships between different vectors such as when there are embeddings for multiple chunks or fragments related to the same document.

The list method returns a generator that handles pagination on your behalf.

from pinecone import Pinecone

pc = Pinecone(api_key='xxx')
index = pc.Index(host='hosturl')

# To iterate over all result pages using a generator function
namespace = 'foo-namespace'
for ids in index.list(prefix='pref', limit=3, namespace=namespace):
    print(ids) # ['pref1', 'pref2', 'pref3']

    # Now you can pass this id array to other methods, such as fetch or delete.
    vectors = index.fetch(ids=ids, namespace=namespace)
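
To illustrate the hierarchical id idea mentioned above, the sketch below builds ids of the form document id, then "#chunk", then a chunk number, and shows that a prefix such as "doc1#" selects exactly the chunks of one document. The naming scheme and the chunk_id helper are hypothetical; any scheme with a shared, unambiguous prefix would work the same way.

```python
def chunk_id(document_id: str, chunk_number: int) -> str:
    """Build a vector id such as 'doc1#chunk2' (hypothetical naming scheme)."""
    return f"{document_id}#chunk{chunk_number}"


all_ids = [chunk_id(doc, n) for doc in ("doc1", "doc2") for n in range(1, 4)]
doc1_prefix = "doc1#"
doc1_ids = [vector_id for vector_id in all_ids if vector_id.startswith(doc1_prefix)]
print(doc1_ids)  # ['doc1#chunk1', 'doc1#chunk2', 'doc1#chunk3']

# Against a real index, the same prefix would be passed to list:
# for ids in index.list(prefix=doc1_prefix, namespace=namespace):
#     ...
```

Including the "#" separator in the prefix keeps "doc1" from also matching ids that belong to, say, "doc10".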

There is also an option to fetch each page of results yourself with list_paginated.

from pinecone import Pinecone

pc = Pinecone(api_key='xxx')
index = pc.Index(host='hosturl')

# For manual control over pagination
results = index.list_paginated(
    prefix='pref',
    limit=3,
    namespace='foo',
    pagination_token='eyJza2lwX3Bhc3QiOiI5IiwicHJlZml4IjpudWxsfQ=='
)
print(results.namespace) # 'foo'
print([v.id for v in results.vectors]) # ['pref1', 'pref2', 'pref3']
print(results.pagination.next) # 'eyJza2lwX3Bhc3QiOiI5IiwicHJlZml4IjpudWxsfQ=='
print(results.usage) # { 'read_units': 1 }
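
If you want to drive pagination yourself, a loop along the following lines feeds each page's next token into the following call. This is a sketch under stated assumptions: it uses only the list_paginated arguments and response attributes shown above (vectors, pagination.next), and it assumes the final page comes back without pagination info or without a next token. iter_id_pages is a hypothetical helper.

```python
from typing import Iterator, List, Optional


def iter_id_pages(index, prefix: str, namespace: str, limit: int = 100) -> Iterator[List[str]]:
    """Yield one list of vector ids per page until no next token is returned."""
    token: Optional[str] = None
    while True:
        kwargs = {"prefix": prefix, "limit": limit, "namespace": namespace}
        if token is not None:
            kwargs["pagination_token"] = token
        results = index.list_paginated(**kwargs)
        if results.vectors:
            yield [v.id for v in results.vectors]
        # Assumption: the last page carries no pagination info (or no next token).
        pagination = getattr(results, "pagination", None)
        token = getattr(pagination, "next", None) if pagination else None
        if not token:
            return


# Usage with a real index handle:
# for ids in iter_id_pages(index, prefix='pref', namespace='foo', limit=3):
#     print(ids)
```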

Collections

Create collection

The following example creates the collection example-collection from example-index.

from pinecone import Pinecone

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')

pc.create_collection(
    name="example-collection",
    source="example-index"
)

List collections

The following example returns a list of the collections in the current project.

from pinecone import Pinecone

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')

active_collections = pc.list_collections()

Describe a collection

The following example returns a description of the collection example-collection.

from pinecone import Pinecone

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')

collection_description = pc.describe_collection("example-collection")

Delete a collection

The following example deletes the collection example-collection.

from pinecone import Pinecone

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')

pc.delete_collection("example-collection")

Inference

Interact with Pinecone's Inference APIs, e.g. create embeddings (currently in preview).

Models currently supported:

Create embeddings

The following example highlights how to use an embedding model to generate embeddings for a list of documents and a user query, with the ultimate goal of retrieving similar documents from a Pinecone index.

from pinecone import Pinecone

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')
model = "multilingual-e5-large"

# Embed documents
text = ["Turkey is a classic meat to eat at American Thanksgiving.",
        "Many people enjoy the beautiful mosques in Turkey."]
text_embeddings = pc.embeddings.create(model=model,
                                       inputs=text,
                                       parameters={"input_type": "context", "truncate": "END"}, )

# <<Upsert documents into Pinecone index>>

# Embed query
query = ["How should I prepare my turkey?"]
query_embeddings = pc.embeddings.create(model=model,
                                        inputs=query,
                                        parameters={"input_type": "query", "truncate": "END"}, )

# <<Send query to Pinecone index to retrieve similar documents>>
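
To give a feel for what "similar" means once documents and a query are embedded, the sketch below ranks two documents against a query by cosine similarity. The 3-dimensional vectors are made up for illustration and are not output from any embedding model; in practice the index performs this comparison for you, using whichever metric it was created with.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Made-up vectors standing in for real embeddings.
doc_vectors = {"food": [0.9, 0.1, 0.0], "travel": [0.1, 0.9, 0.2]}
query_vector = [0.8, 0.2, 0.1]

ranked = sorted(
    doc_vectors,
    key=lambda name: cosine_similarity(query_vector, doc_vectors[name]),
    reverse=True,
)
print(ranked)  # ['food', 'travel']
```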

Contributing

If you'd like to make a contribution, or get set up locally to develop the Pinecone Python client, please see our contributing guide.
