Microsoft Azure Cosmos Client Library for Python
Azure Cosmos DB SQL API client library for Python
Azure Cosmos DB is a globally distributed, multi-model database service that supports document, key-value, wide-column, and graph databases.
Use the Azure Cosmos DB SQL API SDK for Python to manage databases and the JSON documents they contain in this NoSQL database service.
- Create Cosmos DB databases and modify their settings
- Create and modify containers to store collections of JSON documents
- Create, read, update, and delete the items (JSON documents) in your containers
- Query the documents in your database using SQL-like syntax
SDK source code | Package (PyPI) | API reference documentation | Product documentation | Samples
This SDK is used for the SQL API. For all other APIs, please check the Azure Cosmos DB documentation to evaluate the best SDK for your project.
Getting started
Prerequisites
- Azure subscription - Create a free account
- Azure Cosmos DB account - SQL API
- Python 2.7 or 3.5.3+
If you need a Cosmos DB SQL API account, you can create one with this Azure CLI command:
az cosmosdb create --resource-group <resource-group-name> --name <cosmos-account-name>
Install the package
pip install azure-cosmos
Configure a virtual environment (optional)
Although not required, you can keep your base system and Azure SDK environments isolated from one another if you use a virtual environment. Execute the following commands to configure and then enter a virtual environment with venv:
python3 -m venv azure-cosmosdb-sdk-environment
source azure-cosmosdb-sdk-environment/bin/activate
Authenticate the client
Interaction with Cosmos DB starts with an instance of the CosmosClient class. You need an account, its URI, and one of its account keys to instantiate the client object.
Use the Azure CLI snippet below to populate two environment variables with the database account URI and its primary master key (you can also find these values in the Azure portal). The snippet is formatted for the Bash shell.
RES_GROUP=<resource-group-name>
ACCT_NAME=<cosmos-db-account-name>
export ACCOUNT_URI=$(az cosmosdb show --resource-group $RES_GROUP --name $ACCT_NAME --query documentEndpoint --output tsv)
export ACCOUNT_KEY=$(az cosmosdb list-keys --resource-group $RES_GROUP --name $ACCT_NAME --query primaryMasterKey --output tsv)
Create the client
Once you've populated the ACCOUNT_URI and ACCOUNT_KEY environment variables, you can create the CosmosClient.
from azure.cosmos import CosmosClient
import os
url = os.environ['ACCOUNT_URI']
key = os.environ['ACCOUNT_KEY']
client = CosmosClient(url, credential=key)
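If either environment variable is unset, os.environ raises a bare KeyError. The following is a minimal sketch of a more descriptive check; load_cosmos_settings is a hypothetical helper, not part of the SDK, and it only reads the two variables named above.

```python
import os


def load_cosmos_settings(environ=os.environ):
    """Return (url, key) read from the environment, or raise a readable error.

    Hypothetical helper: the SDK itself does not provide this function.
    """
    required = ('ACCOUNT_URI', 'ACCOUNT_KEY')
    missing = [name for name in required if not environ.get(name)]
    if missing:
        raise RuntimeError('Missing environment variables: ' + ', '.join(missing))
    return environ['ACCOUNT_URI'], environ['ACCOUNT_KEY']


# Usage (requires the azure-cosmos package and a real account):
# url, key = load_cosmos_settings()
# client = CosmosClient(url, credential=key)
```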
Key concepts
Once you've initialized a CosmosClient, you can interact with the primary resource types in Cosmos DB:
- Database: A Cosmos DB account can contain multiple databases. When you create a database, you specify the API you'd like to use when interacting with its documents: SQL, MongoDB, Gremlin, Cassandra, or Azure Table. Use the DatabaseProxy object to manage its containers.
- Container: A container is a collection of JSON documents. You create (insert), read, update, and delete items in a container by using methods on the ContainerProxy object.
- Item: An Item is the dictionary-like representation of a JSON document stored in a container. Each Item you add to a container must include an id key with a value that uniquely identifies the item within the container.
For more information about these resources, see Working with Azure Cosmos databases, containers and items.
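Because every item needs a unique id, it can help to normalize dictionaries before sending them to a container. The sketch below is one way to do that; with_item_id is a hypothetical helper, and generating a random UUID for a missing id is an assumption, not something the SDK does for you.

```python
import uuid


def with_item_id(item):
    """Return a copy of item that carries a string 'id'.

    Hypothetical helper: a missing id is filled with a random UUID,
    and a non-string id is rejected before the item reaches the service.
    """
    prepared = dict(item)
    if 'id' not in prepared:
        prepared['id'] = str(uuid.uuid4())
    elif not isinstance(prepared['id'], str):
        raise TypeError("item 'id' must be a string")
    return prepared


# Usage (container is a ContainerProxy):
# container.upsert_item(with_item_id({'productName': 'Widget'}))
```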
Limitations
As of August 2020 the features below are not yet supported.
- Bulk/Batch processing
- Group By queries
- Direct TCP Mode access
- Language Native async i/o
Limitations Workaround
If you want to use the Python SDK to perform bulk inserts into Cosmos DB, the best alternative is to use stored procedures to write multiple items with the same partition key.
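Since a stored procedure works on items that share one partition key, the items usually have to be grouped on the client first. The following sketch shows only that grouping step; group_by_partition_key is a hypothetical helper, and the stored procedure that would receive each group is not shown here.

```python
from collections import defaultdict


def group_by_partition_key(items, key_field):
    """Group item dictionaries by the value of their partition key field.

    Hypothetical helper: key_field is the property name behind the
    container's partition key path, for example 'productName'.
    """
    groups = defaultdict(list)
    for item in items:
        groups[item[key_field]].append(item)
    return dict(groups)


items = [
    {'id': 'item1', 'productName': 'Widget'},
    {'id': 'item2', 'productName': 'Gadget'},
    {'id': 'item3', 'productName': 'Widget'},
]
batches = group_by_partition_key(items, 'productName')
# Each entry in batches could then be passed to a bulk-insert stored
# procedure together with its partition key value.
```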
Examples
The following sections provide several code snippets covering some of the most common Cosmos DB tasks, including:
- Create a database
- Create a container
- Create an analytical store enabled container
- Get an existing container
- Insert data
- Delete data
- Query the database
- Get database properties
- Modify container properties
Create a database
After authenticating your CosmosClient, you can work with any resource in the account. The code snippet below creates a SQL API database, which is the default when no API is specified when create_database is invoked.
from azure.cosmos import CosmosClient, exceptions
import os
url = os.environ['ACCOUNT_URI']
key = os.environ['ACCOUNT_KEY']
client = CosmosClient(url, credential=key)
database_name = 'testDatabase'
try:
    database = client.create_database(database_name)
except exceptions.CosmosResourceExistsError:
    database = client.get_database_client(database_name)
Create a container
This example creates a container with default settings. If a container with the same name already exists in the database (generating a 409 Conflict error), the existing container is obtained instead.
from azure.cosmos import CosmosClient, PartitionKey, exceptions
import os
url = os.environ['ACCOUNT_URI']
key = os.environ['ACCOUNT_KEY']
client = CosmosClient(url, credential=key)
database_name = 'testDatabase'
database = client.get_database_client(database_name)
container_name = 'products'
try:
    container = database.create_container(id=container_name, partition_key=PartitionKey(path="/productName"))
except exceptions.CosmosResourceExistsError:
    container = database.get_container_client(container_name)
except exceptions.CosmosHttpResponseError:
    raise
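As an alternative to catching CosmosResourceExistsError yourself, the SDK also offers create_database_if_not_exists and create_container_if_not_exists (added in 4.0.0b3). The sketch below wraps both calls; ensure_container is a hypothetical helper that takes the client as an argument, so it makes no connection on its own.

```python
def ensure_container(client, database_name, container_name, partition_key):
    """Return a container, creating the database and container if needed.

    Hypothetical helper. client is expected to be a CosmosClient and
    partition_key a PartitionKey instance, e.g. PartitionKey(path="/productName").
    """
    database = client.create_database_if_not_exists(id=database_name)
    return database.create_container_if_not_exists(
        id=container_name,
        partition_key=partition_key,
    )


# Usage (requires the azure-cosmos package and a real account):
# container = ensure_container(
#     client, 'testDatabase', 'products', PartitionKey(path="/productName"))
```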
Create an analytical store enabled container
This example creates a container with Analytical Store enabled, for reporting, BI, AI, and Advanced Analytics with Azure Synapse Link.
The options for analytical_storage_ttl are:
- 0, Null, or not specified: Analytical store is not enabled.
- -1: The data is retained indefinitely.
- Any other number: The TTL, in seconds.
container_name = 'products'
try:
    container = database.create_container(id=container_name, partition_key=PartitionKey(path="/productName"), analytical_storage_ttl=-1)
except exceptions.CosmosResourceExistsError:
    container = database.get_container_client(container_name)
except exceptions.CosmosHttpResponseError:
    raise
The preceding snippet also handles the CosmosHttpResponseError exception if the container creation failed. For more information on error handling and troubleshooting, see the Troubleshooting section.
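The analytical_storage_ttl options listed for the analytical store example can be summarized in a few lines of code. This is only an illustration of how the values are interpreted; describe_analytical_ttl is a hypothetical helper and treats Python's None as the "not specified" case.

```python
def describe_analytical_ttl(ttl):
    """Describe what an analytical_storage_ttl value means.

    Hypothetical helper that mirrors the documented options:
    0 or None disables the store, -1 keeps data indefinitely,
    and any other number is a retention period in seconds.
    """
    if ttl is None or ttl == 0:
        return 'analytical store not enabled'
    if ttl == -1:
        return 'analytical data retained indefinitely'
    return 'analytical data retained for {0} seconds'.format(ttl)
```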
Get an existing container
Retrieve an existing container from the database:
from azure.cosmos import CosmosClient
import os
url = os.environ['ACCOUNT_URI']
key = os.environ['ACCOUNT_KEY']
client = CosmosClient(url, credential=key)
database_name = 'testDatabase'
database = client.get_database_client(database_name)
container_name = 'products'
container = database.get_container_client(container_name)
Insert data
To insert items into a container, pass a dictionary containing your data to ContainerProxy.upsert_item. Each item you add to a container must include an id key with a value that uniquely identifies the item within the container.
This example inserts several items into the container, each with a unique id:
from azure.cosmos import CosmosClient
import os
url = os.environ['ACCOUNT_URI']
key = os.environ['ACCOUNT_KEY']
client = CosmosClient(url, credential=key)
database_name = 'testDatabase'
database = client.get_database_client(database_name)
container_name = 'products'
container = database.get_container_client(container_name)
for i in range(1, 10):
    container.upsert_item({
        'id': 'item{0}'.format(i),
        'productName': 'Widget',
        'productModel': 'Model {0}'.format(i)
    })
Delete data
To delete items from a container, use ContainerProxy.delete_item. The SQL API in Cosmos DB does not support the SQL DELETE statement.
from azure.cosmos import CosmosClient
import os
url = os.environ['ACCOUNT_URI']
key = os.environ['ACCOUNT_KEY']
client = CosmosClient(url, credential=key)
database_name = 'testDatabase'
database = client.get_database_client(database_name)
container_name = 'products'
container = database.get_container_client(container_name)
for item in container.query_items(
        query='SELECT * FROM products p WHERE p.productModel = "Model 2"',
        enable_cross_partition_query=True):
    container.delete_item(item, partition_key='Widget')
NOTE: If you are using a partitioned container, the value of partition_key in the example code above should be set to the value of the partition key for this particular item, not the name of the partition key path in your container. This holds true for both point reads and deletes.
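To make the distinction concrete, the sketch below reads the partition key value out of an item given the container's partition key path. partition_key_value is a hypothetical helper, and it assumes the path uses the slash-separated form shown earlier, such as "/productName".

```python
def partition_key_value(item, partition_key_path):
    """Return the item's value at the given partition key path.

    Hypothetical helper: '/productName' looks up item['productName'],
    and a nested path such as '/address/city' walks nested dictionaries.
    """
    value = item
    for part in partition_key_path.strip('/').split('/'):
        value = value[part]
    return value


item = {'id': 'item2', 'productName': 'Widget', 'productModel': 'Model 2'}
value = partition_key_value(item, '/productName')
# Pass the value ('Widget'), not the path ('/productName'):
# container.delete_item(item, partition_key=value)
```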
Query the database
A Cosmos DB SQL API database supports querying the items in a container with ContainerProxy.query_items using SQL-like syntax.
This example queries a container for items with a specific id:
from azure.cosmos import CosmosClient
import os
url = os.environ['ACCOUNT_URI']
key = os.environ['ACCOUNT_KEY']
client = CosmosClient(url, credential=key)
database_name = 'testDatabase'
database = client.get_database_client(database_name)
container_name = 'products'
container = database.get_container_client(container_name)
# Enumerate the returned items
import json
for item in container.query_items(
        query='SELECT * FROM mycontainer r WHERE r.id="item3"',
        enable_cross_partition_query=True):
    print(json.dumps(item, indent=True))
NOTE: Although you can specify any value for the container name in the FROM clause, we recommend you use the container name for consistency.
Perform parameterized queries by passing a list of dictionaries, each holding the name and value of one parameter, to the parameters argument of ContainerProxy.query_items:
discontinued_items = container.query_items(
    query='SELECT * FROM products p WHERE p.productModel = @model',
    parameters=[
        dict(name='@model', value='Model 7')
    ],
    enable_cross_partition_query=True
)
for item in discontinued_items:
    print(json.dumps(item, indent=True))
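When a query takes several parameters, writing each name/value dictionary by hand gets repetitive. The following sketch builds the list from a plain mapping; to_query_parameters is a hypothetical helper, and requiring each name to start with "@" is an assumption based on the example above.

```python
def to_query_parameters(values):
    """Convert {'@name': value} into the list form used by query_items.

    Hypothetical helper: rejects names that do not start with '@'.
    """
    parameters = []
    for name, value in values.items():
        if not name.startswith('@'):
            raise ValueError('parameter names must start with "@": ' + name)
        parameters.append({'name': name, 'value': value})
    return parameters


parameters = to_query_parameters({'@model': 'Model 7'})
# Usage (container is a ContainerProxy):
# container.query_items(
#     query='SELECT * FROM products p WHERE p.productModel = @model',
#     parameters=parameters,
#     enable_cross_partition_query=True)
```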
For more information on querying Cosmos DB databases using the SQL API, see Query Azure Cosmos DB data with SQL queries.
Get database properties
Get and display the properties of a database:
from azure.cosmos import CosmosClient
import os
import json
url = os.environ['ACCOUNT_URI']
key = os.environ['ACCOUNT_KEY']
client = CosmosClient(url, credential=key)
database_name = 'testDatabase'
database = client.get_database_client(database_name)
properties = database.read()
print(json.dumps(properties))
Modify container properties
Certain properties of an existing container can be modified. This example sets the default time to live (TTL) for items in the container to 10 seconds:
from azure.cosmos import CosmosClient, PartitionKey
import os
import json
url = os.environ['ACCOUNT_URI']
key = os.environ['ACCOUNT_KEY']
client = CosmosClient(url, credential=key)
database_name = 'testDatabase'
database = client.get_database_client(database_name)
container_name = 'products'
container = database.get_container_client(container_name)
database.replace_container(
    container,
    partition_key=PartitionKey(path="/productName"),
    default_ttl=10,
)
# Display the new TTL setting for the container
container_props = container.read()
print(json.dumps(container_props['defaultTtl']))
For more information on TTL, see Time to Live for Azure Cosmos DB data.
Troubleshooting
General
When you interact with Cosmos DB using the Python SDK, exceptions returned by the service correspond to the same HTTP status codes returned for REST API requests:
HTTP Status Codes for Azure Cosmos DB
For example, if you try to create a container using an ID (name) that's already in use in your Cosmos DB database, a 409 error is returned, indicating the conflict. In the following snippet, the error is handled gracefully by catching the exception and displaying additional information about the error.
try:
    database.create_container(id=container_name, partition_key=PartitionKey(path="/productName"))
except exceptions.CosmosResourceExistsError:
    print("""Error creating container
HTTP status code 409: The ID (name) provided for the container is already in use.
The container name must be unique within the database.""")
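For errors other than 409, one option is to catch the broader CosmosHttpResponseError and inspect its HTTP status. The sketch below assumes the exception exposes a status_code attribute, as azure.core's HTTP response errors do; describe_cosmos_error and the hint texts are hypothetical.

```python
# Hint texts are illustrative; the status codes are the ones this
# document associates with specific conditions.
STATUS_HINTS = {
    404: 'The requested resource was not found.',
    409: 'A resource with this ID already exists.',
    412: 'The access condition (ETag) did not match.',
    429: 'The request was throttled (request rate too large).',
}


def describe_cosmos_error(error):
    """Return a short explanation for an error carrying a status_code."""
    status = getattr(error, 'status_code', None)
    default = 'Unexpected error (HTTP status {0}).'.format(status)
    return STATUS_HINTS.get(status, default)


# Usage (requires the azure-cosmos package):
# try:
#     database.create_container(id=container_name, partition_key=...)
# except exceptions.CosmosHttpResponseError as error:
#     print(describe_cosmos_error(error))
```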
Logging
This library uses the standard logging library for logging. Basic information about HTTP sessions (URLs, headers, etc.) is logged at INFO level.
Detailed DEBUG level logging, including request/response bodies and unredacted headers, can be enabled on a client with the logging_enable argument:
import sys
import logging
from azure.cosmos import CosmosClient
# Create a logger for the 'azure' SDK
logger = logging.getLogger('azure')
logger.setLevel(logging.DEBUG)
# Configure a console output
handler = logging.StreamHandler(stream=sys.stdout)
logger.addHandler(handler)
# This client will log detailed information about its HTTP sessions, at DEBUG level
client = CosmosClient(url, credential=key, logging_enable=True)
Similarly, logging_enable can enable detailed logging for a single operation, even when it isn't enabled for the client:
database = client.create_database(database_name, logging_enable=True)
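Because the library uses the standard logging module, the 'azure' logger can be routed to any handler, not only the console. The sketch below captures output in an in-memory buffer with a timestamped format; attach_azure_log_handler is a hypothetical helper, and the child logger name 'azure.example' is made up to show that records from child loggers reach the handler.

```python
import io
import logging


def attach_azure_log_handler(stream, level=logging.DEBUG):
    """Send records from the 'azure' logger hierarchy to the given stream."""
    logger = logging.getLogger('azure')
    logger.setLevel(level)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter('%(asctime)s %(name)s %(levelname)s %(message)s'))
    logger.addHandler(handler)
    return handler


buffer = io.StringIO()
handler = attach_azure_log_handler(buffer)

# Any logger below 'azure' propagates to the handler attached above.
logging.getLogger('azure.example').debug('example message')

# Detach the handler once the captured output is no longer needed.
logging.getLogger('azure').removeHandler(handler)
captured = buffer.getvalue()
```

Replacing the buffer with an open file object would write the same records to disk.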
Next steps
For more extensive documentation on the Cosmos DB service, see the Azure Cosmos DB documentation on docs.microsoft.com.
Contributing
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.
When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.
4.2.0 (2020-10-08)
Bug fixes
- Fixed bug where continuation token is not honored when query_iterable is used to get results by page. Issue #13265.
- Fixed bug where resource tokens were not being honored for document reads and deletes. Issue #13634.
New features
- Added support for passing partitionKey while querying changefeed. Issue #11689.
4.1.0 (2020-08-10)
- Added deprecation warning for "lazy" indexing mode. The backend no longer allows creating containers with this mode and will set them to consistent instead.
New features
- Added the ability to set the analytical storage TTL when creating a new container.
Bug fixes
- Fixed support for dicts as inputs for get_client APIs.
- Fixed Python 2/3 compatibility in query iterators.
- Fixed type hint error. Issue #12570 - thanks @sl-sandy.
- Fixed bug where options headers were not added to upsert_item function. Issue #11791 - thank you @aalapatirvbd.
- Fixed error raised when a non string ID is used in an item. It now raises TypeError rather than AttributeError. Issue #11793 - thank you @Rabbit994.
4.0.0 (2020-05-20)
- Stable release.
- Added HttpLoggingPolicy to pipeline to enable passing in a custom logger for request and response headers.
4.0.0b6
- Fixed bug in synchronized_request for media APIs.
- Removed MediaReadMode and MediaRequestTimeout from ConnectionPolicy as media requests are not supported.
4.0.0b5
- azure.cosmos.errors module deprecated and replaced by azure.cosmos.exceptions
- The access condition parameters (access_condition, if_match, if_none_match) have been deprecated in favor of separate match_condition and etag parameters.
- Fixed bug in routing map provider.
- Added query Distinct, Offset and Limit support.
- Default document query execution context now used for:
    - ChangeFeed queries
    - single partition queries (partitionkey, partitionKeyRangeId is present in options)
    - Non document queries
- Errors out for aggregates on multiple partitions, with enable cross partition query set to true, but no "value" keyword present
- Hits query plan endpoint for other scenarios to fetch query plan
- Added __repr__ support for Cosmos entity objects.
- Updated documentation.
4.0.0b4
- Added support for a timeout keyword argument to all operations to specify an absolute timeout in seconds within which the operation must be completed. If the timeout value is exceeded, an azure.cosmos.errors.CosmosClientTimeoutError will be raised.
- Added a new ConnectionRetryPolicy to manage retry behaviour during HTTP connection errors.
- Added new constructor and per-operation configuration keyword arguments:
    - retry_total - Maximum retry attempts.
    - retry_backoff_max - Maximum retry wait time in seconds.
    - retry_fixed_interval - Fixed retry interval in milliseconds.
    - retry_read - Maximum number of socket read retry attempts.
    - retry_connect - Maximum number of connection error retry attempts.
    - retry_status - Maximum number of retry attempts on error status codes.
    - retry_on_status_codes - A list of specific status codes to retry on.
    - retry_backoff_factor - Factor to calculate wait time between retry attempts.
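The retry keyword arguments above can be collected in a dictionary and unpacked into the CosmosClient constructor or a single operation. The sketch below only validates the names against the list in these notes; retry_options is a hypothetical helper and the values shown are arbitrary examples, not recommended settings.

```python
# Retry keyword names as listed in the 4.0.0b4 release notes.
RETRY_KEYWORDS = frozenset([
    'retry_total', 'retry_backoff_max', 'retry_fixed_interval', 'retry_read',
    'retry_connect', 'retry_status', 'retry_on_status_codes',
    'retry_backoff_factor',
])


def retry_options(**options):
    """Return the options unchanged after rejecting unknown keyword names."""
    unknown = sorted(set(options) - RETRY_KEYWORDS)
    if unknown:
        raise ValueError('Unknown retry option(s): ' + ', '.join(unknown))
    return options


# Example values only; tune them for your workload.
options = retry_options(retry_total=5, retry_backoff_max=30)
# client = CosmosClient(url, credential=key, **options)
```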
4.0.0b3
- Added create_database_if_not_exists() and create_container_if_not_exists functionalities to CosmosClient and Database respectively.
4.0.0b2
Version 4.0.0b2 is the second iteration in our efforts to build a more Pythonic client library.
Breaking changes
- The client connection has been adapted to consume the HTTP pipeline defined in azure.core.pipeline.
- Interactive objects have now been renamed as proxies. This includes:
    - Database -> DatabaseProxy
    - User -> UserProxy
    - Container -> ContainerProxy
    - Scripts -> ScriptsProxy
- The constructor of CosmosClient has been updated:
    - The auth parameter has been renamed to credential and will now take an authentication type directly. This means the master key value, a dictionary of resource tokens, or a list of permissions can be passed in. However, the old dictionary format is still supported.
    - The connection_policy parameter has been made a keyword-only parameter, and while it is still supported, each of the individual attributes of the policy can now be passed in as explicit keyword arguments:
        - request_timeout
        - media_request_timeout
        - connection_mode
        - media_read_mode
        - proxy_config
        - enable_endpoint_discovery
        - preferred_locations
        - multiple_write_locations
- A new classmethod constructor has been added to CosmosClient to enable creation via a connection string retrieved from the Azure portal.
- Some read_all operations have been renamed to list operations:
    - CosmosClient.read_all_databases -> CosmosClient.list_databases
    - Container.read_all_conflicts -> ContainerProxy.list_conflicts
    - Database.read_all_containers -> DatabaseProxy.list_containers
    - Database.read_all_users -> DatabaseProxy.list_users
    - User.read_all_permissions -> UserProxy.list_permissions
- For all operations that take request_options or feed_options parameters, these have been moved to keyword-only parameters. In addition, while these options dictionaries are still supported, each of the individual options within the dictionary is now supported as an explicit keyword argument.
- The error hierarchy is now inherited from azure.core.AzureError instead of CosmosError, which has been removed.
    - HTTPFailure has been renamed to CosmosHttpResponseError
    - JSONParseFailure has been removed and replaced by azure.core.DecodeError
    - Added additional errors for specific response codes:
        - CosmosResourceNotFoundError for status 404
        - CosmosResourceExistsError for status 409
        - CosmosAccessConditionFailedError for status 412
- CosmosClient can now be run in a context manager to handle closing the client connection.
- Iterable responses (e.g. query responses and list responses) are now of type azure.core.paging.ItemPaged. The method fetch_next_block has been replaced by a secondary iterator, accessed by the by_page method.
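The by_page iterator mentioned in the last item yields one page at a time, and each page is itself an iterator over items. The sketch below gathers the pages into lists; collect_pages is a hypothetical helper that works with any object exposing by_page, so it can be tried without a live account.

```python
def collect_pages(paged_items):
    """Return the results as a list of pages, each page a list of items.

    paged_items is expected to behave like azure.core.paging.ItemPaged,
    that is, to provide a by_page() method returning an iterator of pages.
    """
    return [list(page) for page in paged_items.by_page()]


# Usage (container is a ContainerProxy):
# results = container.query_items(
#     query='SELECT * FROM products p',
#     enable_cross_partition_query=True)
# for page_number, page in enumerate(collect_pages(results), start=1):
#     print(page_number, len(page))
```

Collecting every page into memory is only reasonable for small result sets; for larger ones, iterate over by_page() directly.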
4.0.0b1
Version 4.0.0b1 is the first preview of our efforts to create a user-friendly and Pythonic client library for Azure Cosmos. For more information about this, and preview releases of other Azure SDK libraries, please visit https://aka.ms/azure-sdk-preview1-python.
Breaking changes: New API design
- Operations are now scoped to a particular client:
    - CosmosClient: This client handles account-level operations. This includes managing service properties and listing the databases within an account.
    - Database: This client handles database-level operations. This includes creating and deleting containers, users and stored procedures. It can be accessed from a CosmosClient instance by name.
    - Container: This client handles operations for a particular container. This includes querying and inserting items and managing properties.
    - User: This client handles operations for a particular user. This includes adding and deleting permissions and managing user properties.

  These clients can be accessed by navigating down the client hierarchy using the get_<child>_client method. For full details on the new API, please see the reference documentation.
- Clients are accessed by name rather than by Id. No need to concatenate strings to create links.
- No more need to import types and methods from individual modules. The public API surface area is available directly in the azure.cosmos package.
- Individual request properties can be provided as keyword arguments rather than constructing a separate RequestOptions instance.
3.0.2
- Added Support for MultiPolygon Datatype
- Bug Fix in Session Read Retry Policy
- Bug Fix for Incorrect padding issues while decoding base 64 strings
3.0.1
- Bug fix in LocationCache
- Bug fix endpoint retry logic
- Fixed documentation
3.0.0
- Multi-region write support added
- Naming changes
- DocumentClient to CosmosClient
- Collection to Container
- Document to Item
- Package name updated to "azure-cosmos"
- Namespace updated to "azure.cosmos"
2.3.3
- Added support for proxy
- Added support for reading change feed
- Added support for collection quota headers
- Bugfix for large session tokens issue
- Bugfix for ReadMedia API
- Bugfix in partition key range cache
2.3.2
- Added support for default retries on connection issues.
2.3.1
- Updated documentation to reference Azure Cosmos DB instead of Azure DocumentDB.
2.3.0
- This SDK version requires the latest version of Azure Cosmos DB Emulator available for download from https://aka.ms/cosmosdb-emulator.
2.2.1
- bugfix for aggregate dict
- bugfix for trimming slashes in the resource link
- tests for unicode encoding
2.2.0
- Added support for Request Unit per Minute (RU/m) feature.
- Added support for a new consistency level called ConsistentPrefix.
2.1.0
- Added support for aggregation queries (COUNT, MIN, MAX, SUM, and AVG).
- Added an option for disabling SSL verification when running against DocumentDB Emulator.
- Removed the restriction of dependent requests module to be exactly 2.10.0.
- Lowered minimum throughput on partitioned collections from 10,100 RU/s to 2500 RU/s.
- Added support for enabling script logging during stored procedure execution.
- REST API version bumped to '2017-01-19' with this release.
2.0.1
- Made editorial changes to documentation comments.
2.0.0
- Added support for Python 3.5.
- Added support for connection pooling using the requests module.
- Added support for session consistency.
- Added support for TOP/ORDERBY queries for partitioned collections.
1.9.0
- Added retry policy support for throttled requests. (Throttled requests receive a request rate too large exception, error code 429.) By default, DocumentDB retries nine times for each request when error code 429 is encountered, honoring the retryAfter time in the response header. A fixed retry interval time can now be set as part of the RetryOptions property on the ConnectionPolicy object if you want to ignore the retryAfter time returned by the server between the retries. DocumentDB now waits for a maximum of 30 seconds for each request that is being throttled (irrespective of retry count) and returns the response with error code 429. This time can also be overridden in the RetryOptions property on the ConnectionPolicy object.
- DocumentDB now returns x-ms-throttle-retry-count and x-ms-throttle-retry-wait-time-ms as the response headers in every request to denote the throttle retry count and the cumulative time the request waited between the retries.
- Removed the RetryPolicy class and the corresponding property (retry_policy) exposed on the document_client class and instead introduced a RetryOptions class exposing the RetryOptions property on the ConnectionPolicy class that can be used to override some of the default retry options.
1.8.0
- Added the support for geo-replicated database accounts.
- Test fixes to move the global host and masterKey into the individual test classes.
1.7.0
- Added the support for Time To Live(TTL) feature for documents.
1.6.1
- Bug fixes related to server side partitioning to allow special characters in partitionkey path.
1.6.0
- Added the support for server side partitioned collections feature.
1.5.0
- Added Client-side sharding framework to the SDK. Implemented HashPartitionResolver and RangePartitionResolver classes.
1.4.2
- Implement Upsert. New UpsertXXX methods added to support Upsert feature.
- Implement ID Based Routing. No public API changes, all changes internal.
1.3.0
- Release skipped to bring version number in alignment with other SDKs
1.2.0
- Supports GeoSpatial index.
- Validates id property for all resources. Ids for resources cannot contain the characters ?, /, #, or \ and cannot end with a space.
- Adds new header "index transformation progress" to ResourceResponse.
1.1.0
- Implements V2 indexing policy
1.0.1
- Supports proxy connection
Hashes for azure_cosmos-4.2.0-py2.py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | d0edcc3acd09e5c422b6cf07a34f9faa418414ba72eea3db2a3dcfaa9b1ee610
MD5 | 2816cb52602d224546a1b1780db489f0
BLAKE2b-256 | 6a66683228000d19273676f8bfc65a3881251cdd405674f692b8fa832b3e1aed