Azure Cosmos Python SDK

🚨This SDK is now maintained at https://github.com/Azure/azure-sdk-for-python 🚨

More information: https://github.com/Azure/azure-cosmos-python/issues/197

Azure Cosmos DB SQL API client library for Python

Azure Cosmos DB is a globally distributed, multi-model database service that supports document, key-value, wide-column, and graph databases.

Use the Azure Cosmos DB SQL API SDK for Python to manage databases and the JSON documents they contain in this NoSQL database service.

  • Create Cosmos DB databases and modify their settings
  • Create and modify containers to store collections of JSON documents
  • Create, read, update, and delete the items (JSON documents) in your containers
  • Query the documents in your database using SQL-like syntax

Looking for source code or API reference?

Please see the latest version of the Azure Cosmos DB Python SDK for SQL API

Getting started

If you need a Cosmos DB SQL API account, you can create one with this Azure CLI command:

az cosmosdb create --resource-group <resource-group-name> --name <cosmos-account-name>

Installation

pip install azure-cosmos

Configure a virtual environment (optional)

Although not required, a virtual environment keeps your base system and Azure SDK environments isolated from one another. Run the following commands to create and then enter a virtual environment with venv:

python3 -m venv azure-cosmosdb-sdk-environment
source azure-cosmosdb-sdk-environment/bin/activate

Key concepts

Interaction with Cosmos DB starts with an instance of the CosmosClient class. You need an account, its URI, and one of its account keys to instantiate the client object.

Get credentials

Use the Azure CLI snippet below to populate two environment variables with the database account URI and its primary master key (you can also find these values in the Azure portal). The snippet is formatted for the Bash shell.

RES_GROUP=<resource-group-name>
ACCT_NAME=<cosmos-db-account-name>

export ACCOUNT_URI=$(az cosmosdb show --resource-group $RES_GROUP --name $ACCT_NAME --query documentEndpoint --output tsv)
export ACCOUNT_KEY=$(az cosmosdb keys list --resource-group $RES_GROUP --name $ACCT_NAME --query primaryMasterKey --output tsv)

Create client

Once you've populated the ACCOUNT_URI and ACCOUNT_KEY environment variables, you can create the CosmosClient.

import azure.cosmos.cosmos_client as cosmos_client
import azure.cosmos.errors as errors
import azure.cosmos.http_constants as http_constants

import os
url = os.environ['ACCOUNT_URI']
key = os.environ['ACCOUNT_KEY']
client = cosmos_client.CosmosClient(url, {'masterKey': key})
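If either environment variable is unset, os.environ[...] raises a bare KeyError that does not say which setup step was skipped. The following is a minimal sketch of a friendlier check. The helper name load_credentials is an assumption made for illustration; it is not part of the SDK.

```python
import os


def load_credentials(env=None):
    """Return (url, key) read from ACCOUNT_URI and ACCOUNT_KEY.

    load_credentials is a hypothetical helper, not part of azure-cosmos.
    Pass a dict as env to test it without touching the real environment.
    """
    env = os.environ if env is None else env
    # Collect every missing or empty variable so the error names them all.
    missing = [name for name in ('ACCOUNT_URI', 'ACCOUNT_KEY') if not env.get(name)]
    if missing:
        raise RuntimeError('Missing environment variables: ' + ', '.join(missing))
    return env['ACCOUNT_URI'], env['ACCOUNT_KEY']
```

The returned pair can then be passed on as shown above: cosmos_client.CosmosClient(url, {'masterKey': key}).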

Usage

When you create a Cosmos DB Database Account, you specify the API you'd like to use when interacting with its documents: SQL, MongoDB, Gremlin, Cassandra, or Azure Table.

This SDK is used to interact with an SQL API database account.

Once you've initialized a CosmosClient, you can interact with the primary resource types in Cosmos DB:

  • Database: A Cosmos DB account can contain multiple databases. A database may contain a number of containers.

  • Container: A container is a collection of JSON documents. You create (insert), read, update, and delete items in a container.

  • Item: An Item is the dictionary-like representation of a JSON document stored in a container. Each Item you add to a container must include an id key with a value that uniquely identifies the item within the container.

For more information about these resources, see Working with Azure Cosmos databases, containers and items.
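Because every item needs an id, it can help to check items before sending them to the service. The sketch below is one possible client-side check, not SDK behavior. The character rules are taken from the 1.2.0 release notes further down (ids cannot contain ?, /, #, \ or end with a space), and validate_item is a hypothetical name.

```python
INVALID_ID_CHARS = ('?', '/', '#', '\\')


def validate_item(item):
    """Return the item unchanged, or raise ValueError if its 'id' is unusable.

    validate_item is a hypothetical helper, not part of azure-cosmos.
    """
    item_id = item.get('id')
    if not isinstance(item_id, str) or not item_id:
        raise ValueError("item must include a non-empty string 'id'")
    if any(ch in item_id for ch in INVALID_ID_CHARS):
        raise ValueError('id must not contain ?, /, # or \\')
    if item_id.endswith(' '):
        raise ValueError('id must not end with a space')
    return item
```

Uniqueness within the container is still enforced by the service; this check only catches malformed ids early.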

Examples

The following sections provide code snippets covering some of the most common Cosmos DB tasks.

Create a database

After authenticating your CosmosClient, you can work with any resource in the account. You can use CosmosClient.CreateDatabase to create a database.

database_name = 'testDatabase'
try:
    database = client.CreateDatabase({'id': database_name})
except errors.HTTPFailure:
    database = client.ReadDatabase("dbs/" + database_name)

Create a container

This example creates a container with 400 RU/s as the throughput, using CosmosClient.CreateContainer. If a container with the same name already exists in the database (generating a 409 Conflict error), the existing container is obtained instead.

import azure.cosmos.documents as documents
container_definition = {
    'id': 'products',
    'partitionKey': {
        'paths': ['/productName'],
        'kind': documents.PartitionKind.Hash
    }
}
try:
    container = client.CreateContainer(
        "dbs/" + database['id'], container_definition, {'offerThroughput': 400})
except errors.HTTPFailure as e:
    if e.status_code == http_constants.StatusCodes.CONFLICT:
        container = client.ReadContainer(
            "dbs/" + database['id'] + "/colls/" + container_definition['id'])
    else:
        raise e

Replace throughput for a container

A single offer object exists per container and holds the container's throughput settings. This example retrieves the offer with CosmosClient.QueryOffers, changes its throughput value, and applies the change with CosmosClient.ReplaceOffer.

# Get the offer for the container
offers = list(client.QueryOffers(
    "Select * from root r where r.offerResourceId='" + container['_rid'] + "'"))
offer = offers[0]
print("current throughput for " + container['id'] + ": " +
      str(offer['content']['offerThroughput']))

# Replace the offer with a new throughput
offer['content']['offerThroughput'] = 1000
client.ReplaceOffer(offer['_self'], offer)
print("new throughput for " + container['id'] + ": " +
      str(offer['content']['offerThroughput']))

Get an existing container

Retrieve an existing container from the database using CosmosClient.ReadContainer:

database_id = 'testDatabase'
container_id = 'products'
container = client.ReadContainer("dbs/" + database_id + "/colls/" + container_id)
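The snippets in this document build resource links by concatenating "dbs/", "/colls/", and "/docs/" segments by hand. A small sketch of helpers that centralize that pattern follows. The function names are assumptions for illustration and are not SDK functions.

```python
def database_link(database_id):
    """Link to a database, for example 'dbs/testDatabase'."""
    return 'dbs/' + database_id


def container_link(database_id, container_id):
    """Link to a container inside a database."""
    return database_link(database_id) + '/colls/' + container_id


def item_link(database_id, container_id, item_id):
    """Link to a single item (document) inside a container."""
    return container_link(database_id, container_id) + '/docs/' + item_id
```

With these in place, the call above could be written as client.ReadContainer(container_link(database_id, container_id)).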

Insert data

To insert items into a container, pass a dictionary containing your data to CosmosClient.UpsertItem. Each item you add to a container must include an id key with a value that uniquely identifies the item within the container.

This example inserts several items into the container, each with a unique id:

for i in range(1, 10):
    client.UpsertItem(
        "dbs/" + database_id + "/colls/" + container_id,
        {
             'id': 'item{0}'.format(i),
             'productName': 'Widget',
             'productModel': 'Model {0}'.format(i)
        }
    )
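Note that range(1, 10) stops before 10, so the loop above writes nine items. As a sketch, the payloads can be built separately so they can be inspected before any request is sent. build_widgets is a hypothetical helper name.

```python
def build_widgets(count):
    """Build `count` item dictionaries with ids item1 .. item<count>.

    build_widgets is a hypothetical helper, not part of azure-cosmos.
    """
    return [
        {
            'id': 'item{0}'.format(i),
            'productName': 'Widget',
            'productModel': 'Model {0}'.format(i),
        }
        for i in range(1, count + 1)
    ]


items = build_widgets(9)
# Every id must be unique within the container.
assert len({item['id'] for item in items}) == len(items)
```

Each dictionary in items can then be passed to client.UpsertItem exactly as in the loop above.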

Delete data

To delete items from a container, use CosmosClient.DeleteItem. The SQL API in Cosmos DB does not support the SQL DELETE statement.

for item in client.QueryItems(
    "dbs/" + database_id + "/colls/" + container_id,
    'SELECT * FROM products p WHERE p.productModel = "DISCONTINUED"',
    {'enableCrossPartitionQuery': True}):

    client.DeleteItem(
        "dbs/" + database_id + "/colls/" + container_id + "/docs/" + item['id'],
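        # The partition key value must match the item being deleted
        # (the container above is partitioned on /productName).
        {'partitionKey': item['productName']})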
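The partitionKey option passed to DeleteItem has to be the partition key value of the item being deleted, which for the container defined earlier is the item's productName. The sketch below reads that value from an item by following the container's partition key path. partition_key_value is a hypothetical helper, and it assumes a simple slash-separated path with no quoted segments.

```python
def partition_key_value(item, path):
    """Follow a partition key path such as '/productName' into an item dict.

    partition_key_value is a hypothetical helper, not part of azure-cosmos.
    Raises KeyError if the item does not contain the path.
    """
    value = item
    for part in path.strip('/').split('/'):
        value = value[part]
    return value


# Example: build the options dictionary for one item.
example_item = {'id': 'item1', 'productName': 'Widget', 'productModel': 'DISCONTINUED'}
options = {'partitionKey': partition_key_value(example_item, '/productName')}
```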

Query the database

A Cosmos DB SQL API database supports querying the items in a container with CosmosClient.QueryItems using SQL-like syntax.

This example queries a container for items with a specific id:

# Enumerate the returned items
# (database_id and container_id are defined in "Get an existing container")
import json
for item in client.QueryItems(
    "dbs/" + database_id + "/colls/" + container_id,
    'SELECT * FROM ' + container_id + ' r WHERE r.id="item3"',
    {'enableCrossPartitionQuery': True}):

    print(json.dumps(item, indent=True))

NOTE: Although you can specify any value for the container name in the FROM clause, we recommend you use the container name for consistency.

Perform parameterized queries by passing a dictionary containing the parameters and their values to CosmosClient.QueryItems:

matching_items = client.QueryItems(
    "dbs/" + database_id + "/colls/" + container_id,
    {
        'query': 'SELECT * FROM root r WHERE r.id=@id',
        'parameters': [
            {'name': '@id', 'value': 'item3'}
        ]
    },
    {'enableCrossPartitionQuery': True})
for item in matching_items:
    print(json.dumps(item, indent=True))
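The query dictionary has a fixed shape: a 'query' string plus a 'parameters' list of name/value pairs. The following sketch builds that dictionary from keyword arguments, which keeps values out of the query text. build_query_spec is a hypothetical helper name, not an SDK function.

```python
def build_query_spec(query, **params):
    """Build the dictionary form of a parameterized query.

    Each keyword argument becomes a parameter named '@<keyword>'.
    build_query_spec is a hypothetical helper, not part of azure-cosmos.
    """
    return {
        'query': query,
        'parameters': [
            {'name': '@' + name, 'value': value}
            for name, value in sorted(params.items())
        ],
    }


spec = build_query_spec('SELECT * FROM root r WHERE r.id=@id', id='item3')
```

The resulting spec has the same structure as the dictionary passed to CosmosClient.QueryItems above.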

For more information on querying Cosmos DB databases using the SQL API, see Query Azure Cosmos DB data with SQL queries.

Modify container properties

Certain properties of an existing container can be modified. This example sets the default time to live (TTL) for items in the container to 10 seconds:

container = client.ReadContainer("dbs/" + database_id + "/colls/" + container_id)
container['defaultTtl'] = 10
modified_container = client.ReplaceContainer(
    "dbs/" + database_id + "/colls/" + container_id, container)
# Display the new TTL setting for the container
print(json.dumps(modified_container['defaultTtl']))
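As a sketch, the TTL change can be prepared on a copy of the container definition so the original dictionary stays untouched until the replace succeeds. The validation assumes the service rule that defaultTtl is either -1 (items do not expire unless they set their own ttl) or a positive number of seconds. with_default_ttl is a hypothetical helper name.

```python
import copy


def with_default_ttl(container, seconds):
    """Return a copy of a container definition with defaultTtl set.

    with_default_ttl is a hypothetical helper, not part of azure-cosmos.
    Assumes valid values are -1 or a positive integer number of seconds.
    """
    if isinstance(seconds, bool) or not isinstance(seconds, int):
        raise TypeError('seconds must be an int')
    if seconds != -1 and seconds <= 0:
        raise ValueError('seconds must be -1 or a positive integer')
    updated = copy.deepcopy(container)
    updated['defaultTtl'] = seconds
    return updated
```

The copy returned by with_default_ttl(container, 10) can be passed to client.ReplaceContainer in place of the mutated dictionary.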

For more information on TTL, see Time to Live for Azure Cosmos DB data.

Troubleshooting

General

When you interact with Cosmos DB using the Python SDK, errors returned by the service correspond to the same HTTP status codes returned for REST API requests:

HTTP Status Codes for Azure Cosmos DB

For example, if you try to create a container using an ID (name) that's already in use in your Cosmos DB database, a 409 error is returned, indicating the conflict. In the following snippet, the error is handled gracefully by catching the exception and displaying additional information about the error.

try:
    container = client.CreateContainer("dbs/" + database['id'], container_definition)
except errors.HTTPFailure as e:
    if e.status_code == http_constants.StatusCodes.CONFLICT:
        print("""Error creating container
HTTP status code 409: The ID (name) provided for the container is already in use.
The container name must be unique within the database.""")
    else:
        raise e
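The same status-code check can be reused wherever "create, or read if it already exists" is wanted, so that only a 409 triggers the fallback and every other failure still propagates. The sketch below is one way to write that. create_or_read is a hypothetical helper, and it assumes the exception carries a status_code attribute as errors.HTTPFailure does in the snippets above.

```python
CONFLICT = 409  # numeric value of the 409 Conflict status code


def create_or_read(create, read, failure_type=Exception):
    """Call create(); if it fails with a 409 Conflict, call read() instead.

    create_or_read is a hypothetical helper, not part of azure-cosmos.
    Pass errors.HTTPFailure as failure_type when using it with the SDK.
    """
    try:
        return create()
    except failure_type as e:
        if getattr(e, 'status_code', None) == CONFLICT:
            return read()
        # Anything other than a conflict is a real error: re-raise it.
        raise
```

For example: create_or_read(lambda: client.CreateDatabase({'id': database_name}), lambda: client.ReadDatabase("dbs/" + database_name), errors.HTTPFailure).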

Next steps

For more extensive documentation on the Cosmos DB service, see the Azure Cosmos DB documentation on docs.microsoft.com.

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.

When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Changes in 3.2.0 :

  • Replace pkg_resource style namespace package with native (Python 3) and pkgutil (Python 2) style.
  • In releases 3.1.1 and 3.1.2, some files were incorrectly added to the wheels (submodules container, database, offer, partition_key, permission, scripts, user, cosmos_client_connection). These are part of the 4.x API, and have been removed.

Changes in 3.1.2 :

  • Added support for connection retry configuration

Changes in 3.1.1 :

  • Bug fix in orderby queries to honor maxItemCount

Changes in 3.1.0 :

  • Added support for picking up endpoint and key from environment variables

Changes in 3.0.2 :

  • Added Support for MultiPolygon Datatype
  • Bug Fix in Session Read Retry Policy
  • Bug Fix for Incorrect padding issues while decoding base 64 strings

Changes in 3.0.1 :

  • Bug fix in LocationCache
  • Bug fix endpoint retry logic
  • Fixed documentation

Changes in 3.0.0 :

  • Multi-region write support added
  • Naming changes
    • DocumentClient to CosmosClient
    • Collection to Container
    • Document to Item
    • Package name updated to "azure-cosmos"
    • Namespace updated to "azure.cosmos"

Changes in 2.3.3 :

  • Added support for proxy
  • Added support for reading change feed
  • Added support for collection quota headers
  • Bugfix for large session tokens issue
  • Bugfix for ReadMedia API
  • Bugfix in partition key range cache

Changes in 2.3.2 :

  • Added support for default retries on connection issues.

Changes in 2.3.1 :

  • Updated documentation to reference Azure Cosmos DB instead of Azure DocumentDB.

Changes in 2.3.0 :

Changes in 2.2.1 :

  • bugfix for aggregate dict
  • bugfix for trimming slashes in the resource link
  • tests for unicode encoding

Changes in 2.2.0 :

  • Added support for Request Unit per Minute (RU/m) feature.
  • Added support for a new consistency level called ConsistentPrefix.

Changes in 2.1.0 :

  • Added support for aggregation queries (COUNT, MIN, MAX, SUM, and AVG).
  • Added an option for disabling SSL verification when running against DocumentDB Emulator.
  • Removed the restriction of dependent requests module to be exactly 2.10.0.
  • Lowered minimum throughput on partitioned collections from 10,100 RU/s to 2500 RU/s.
  • Added support for enabling script logging during stored procedure execution.
  • REST API version bumped to '2017-01-19' with this release.

Changes in 2.0.1 :

  • Made editorial changes to documentation comments.

Changes in 2.0.0 :

  • Added support for Python 3.5.
  • Added support for connection pooling using the requests module.
  • Added support for session consistency.
  • Added support for TOP/ORDERBY queries for partitioned collections.

Changes in 1.9.0 :

  • Added retry policy support for throttled requests. (Throttled requests receive a request rate too large exception, error code 429.) By default, DocumentDB retries nine times for each request when error code 429 is encountered, honoring the retryAfter time in the response header. A fixed retry interval time can now be set as part of the RetryOptions property on the ConnectionPolicy object if you want to ignore the retryAfter time returned by the server between the retries. DocumentDB now waits for a maximum of 30 seconds for each request that is being throttled (irrespective of retry count) and returns the response with error code 429. This time can also be overridden in the RetryOptions property on the ConnectionPolicy object.

  • DocumentDB now returns x-ms-throttle-retry-count and x-ms-throttle-retry-wait-time-ms as the response headers in every request to denote the throttle retry count and the cumulative time the request waited between the retries.

  • Removed the RetryPolicy class and the corresponding property (retry_policy) exposed on the document_client class and instead introduced a RetryOptions class exposing the RetryOptions property on ConnectionPolicy class that can be used to override some of the default retry options.
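The throttling rules described above (nine retries by default, retryAfter honored unless a fixed interval is set, 30 seconds of total waiting at most) can be modeled as a small pure function. This is a sketch of the described behavior only, not the SDK's implementation; plan_retries and its parameter names are hypothetical, and the exact boundary handling at the 30-second limit is an assumption.

```python
def plan_retries(retry_after_ms, max_retries=9, max_wait_s=30, fixed_interval_ms=None):
    """Return the waits (in ms) a client would make before giving up.

    retry_after_ms lists the retryAfter hint from each successive 429
    response. plan_retries is a hypothetical model, not an SDK function.
    """
    waits = []
    total_ms = 0
    for attempt, server_hint in enumerate(retry_after_ms):
        if attempt >= max_retries:
            break  # retry count exhausted
        # A fixed interval, when set, replaces the server's retryAfter hint.
        wait = fixed_interval_ms if fixed_interval_ms is not None else server_hint
        if total_ms + wait > max_wait_s * 1000:
            break  # cumulative wait budget exhausted (assumed inclusive limit)
        waits.append(wait)
        total_ms += wait
    return waits
```

With one-second hints the retry count is the limit that applies; with ten-second hints the 30-second budget is reached first.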

Changes in 1.8.0 :

  • Added the support for geo-replicated database accounts.
  • Test fixes to move the global host and masterKey into the individual test classes.

Changes in 1.7.0 :

  • Added the support for Time To Live (TTL) feature for documents.

Changes in 1.6.1 :

  • Bug fixes related to server side partitioning to allow special characters in partitionkey path.

Changes in 1.6.0 :

  • Added the support for server side partitioned collections feature.

Changes in 1.5.0 :

  • Added Client-side sharding framework to the SDK. Implemented HashPartitionResolver and RangePartitionResolver classes.

Changes in 1.4.2 :

  • Implement Upsert. New UpsertXXX methods added to support Upsert feature.
  • Implement ID Based Routing. No public API changes, all changes internal.

Changes in 1.3.0 :

  • Release skipped to bring version number in alignment with other SDKs

Changes in 1.2.0 :

  • Supports GeoSpatial index.
  • Validates id property for all resources. Ids for resources cannot contain the ?, /, #, or \ characters or end with a space.
  • Adds new header "index transformation progress" to ResourceResponse.

Changes in 1.1.0 :

  • Implements V2 indexing policy

Changes in 1.0.1 :

  • Supports proxy connection
