
Abstraction layer for interacting with Microsoft Dataverse Web API using Python.


dataverse-api


The dataverse-api package is an abstraction layer for simple interaction with the Microsoft Dataverse Web API.


Description

The main goal of this project was to enable some use-cases against Dataverse that I wanted to explore for a work assignment, while getting some experience in programming and testing out different ways of setting up the codebase.

The functionality built into this Dataverse wrapper reflects what I have wanted to explore myself.

The most important goal is to support creating, upserting, updating and deleting rows in Dataverse tables using common data structures, with a choice of how these requests are formed. For example, when creating new rows, the user can choose between individual POST requests per row, combining rows into batch requests against the $batch endpoint, or using the CreateMultiple Dataverse action.
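To make the difference between these request styles concrete, here is a minimal sketch of the JSON bodies involved at the Web API level. It is not the package's implementation: the helper names per_row_payloads and create_multiple_payload are hypothetical, and the "account" table and its "name" column are placeholders.

```python
from typing import Any


def per_row_payloads(rows: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """One JSON body per row; each would be sent as its own POST request."""
    return [dict(row) for row in rows]


def create_multiple_payload(logical_name: str, rows: list[dict[str, Any]]) -> dict[str, Any]:
    """A single JSON body for the CreateMultiple action.

    Every target carries an @odata.type annotation naming the table.
    """
    odata_type = f"Microsoft.Dynamics.CRM.{logical_name}"
    return {"Targets": [{**row, "@odata.type": odata_type} for row in rows]}


# Placeholder data for illustration only.
rows = [{"name": "Contoso"}, {"name": "Fabrikam"}]
single = per_row_payloads(rows)                   # two bodies, two requests
bulk = create_multiple_payload("account", rows)   # one body, one request
```

The trade-off is roughly one round trip per row versus one round trip for the whole set, at the cost of a payload that fails or succeeds as a unit.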

The framework is written in Python 3.11, seeing as this runtime is available in the current release of Azure Functions.

Getting started

Usage is fairly simple, but authentication must be handled by the user. The DataverseClient simply accepts an already authorized requests.Session, which it uses for all API requests.

I suggest using msal and msal-requests-auth for authenticating the Session. The example below uses this approach:

import os

from msal import ConfidentialClientApplication
from msal_requests_auth.auth import ClientCredentialAuth
from requests import Session

from dataverse_api import DataverseClient

# Prepare Auth
app_reg = ConfidentialClientApplication(
    client_id=os.getenv("app_id"),
    client_credential=os.getenv("client_secret"),
    authority=os.getenv("authority_url"),
)
environment_url = os.getenv("environment_url")
auth = ClientCredentialAuth(
    client=app_reg,
    scopes=[environment_url + "/.default"]
)

# Prepare Session
session = Session()
session.auth = auth

# Instantiate DataverseClient
client = DataverseClient(session=session, environment_url=environment_url)

# Instantiate interface to Entity
entity = client.entity(logical_name="organization")

# Read data!
entity.read(select=["name"])
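One thing to watch in the snippet above: os.getenv returns None for an unset variable, so building the scope string would fail with a TypeError. A minimal sketch of a guard follows. The helpers require_env and default_scope are hypothetical and not part of dataverse-api, and the URL is a placeholder.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a clear error."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Environment variable {name!r} is not set")
    return value


def default_scope(environment_url: str) -> str:
    """Build the '/.default' scope without doubling a trailing slash."""
    return environment_url.rstrip("/") + "/.default"


# Placeholder URL, used here only so the sketch runs on its own.
scope = default_scope("https://example.crm.dynamics.com/")
```

In the example above, environment_url = require_env("environment_url") and scopes=[default_scope(environment_url)] would be drop-in replacements under these assumptions.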

Development environment

uv is used for managing dependencies. To develop dataverse-api, follow the below steps to set up your local environment:

  1. Install uv if you haven't already.

  2. Clone repository:

$ git clone git@github.com:MarcusRisanger/dataverse-api.git

  3. Move into the newly created local repository:

$ cd dataverse-api

  4. Create virtual environment and install dependencies:

$ uv venv --python 3.11
$ uv sync

Code requirements

All code must pass ruff style checks to be merged. It is recommended to install pre-commit hooks to ensure this locally before committing code:

$ uv tool install pre-commit

Each public method, class and module should have docstrings. Docstrings are written in the NumPy style.

Testing

To run tests on multiple Python versions, we use tox:

$ uv tool install tox --with tox-uv

When tox is installed, simply run:

$ tox run

To Do:

  • Documentation
  • Metadata
    • Choice
    • Multichoice
    • Money
  • Entity implementation
    • Illegal file extensions
    • Upload file
    • Upload image
    • Upload large file
    • Add / remove relationships
    • Add relationships (single/collection valued) as attr on entity
    • Add columns as attr on entity
  • Add Tests:
    • Add column
    • Remove column
    • Add alternate key
    • Remove alternate key

Usage

DataverseClient

Initialize DataverseClient

For now, I've coded the framework around the requests library, for good and bad! In the future, I will consider generalizing further, letting the user pass an authenticated requests handler of choice to the framework by specifying a Protocol class instead.
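As a rough illustration of what such a Protocol could look like, here is a minimal sketch. The RequestHandler name and its single request method are assumptions made for this example and are not part of the package. The method signature mirrors requests.Session.request, so a Session would satisfy it structurally.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class RequestHandler(Protocol):
    """Hypothetical minimal interface an HTTP handler would need to satisfy."""

    def request(self, method: str, url: str, **kwargs: Any) -> Any: ...


class FakeHandler:
    """Stand-in handler that records calls instead of sending them."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, str]] = []

    def request(self, method: str, url: str, **kwargs: Any) -> dict[str, Any]:
        self.calls.append((method, url))
        return {"status_code": 200}


handler = FakeHandler()
# Structural check: FakeHandler never inherits from RequestHandler.
is_handler = isinstance(handler, RequestHandler)
```

A nice side effect of this design is that tests can pass in a recording stub like FakeHandler instead of patching the network layer.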

To instantiate, pass a requests.Session together with a Dataverse environment URL to the DataverseClient constructor:

session = Session()
session.auth = auth
environment_url = os.getenv("dataverse_url")

client = DataverseClient(session=session, environment_url=environment_url)

Create new Entity

It is possible to create a new Entity using the DataverseClient. This requires a full EntityMetadata definition according to Dataverse standards. You can make this yourself and follow the MetadataDumper protocol, or use the provided define_entity function.

The define_label function makes it simple to generate Label metadata with correct LocalizedLabels in its payload.
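For orientation, the sketch below shows the shape of a Label payload as the Dataverse Web API expects it. The helper label_payload is hypothetical and only illustrates the JSON structure. Whether define_label produces exactly this output, and which language code it defaults to, is an assumption here (1033 is the code for US English).

```python
from typing import Any


def label_payload(text: str, language_code: int = 1033) -> dict[str, Any]:
    """Build a Label with a single LocalizedLabel entry."""
    return {
        "@odata.type": "Microsoft.Dynamics.CRM.Label",
        "LocalizedLabels": [
            {
                "@odata.type": "Microsoft.Dynamics.CRM.LocalizedLabel",
                "Label": text,
                "LanguageCode": language_code,
            }
        ],
    }


label = label_payload("Programmatically Created Table")
```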

In the example below, the optional return_representation argument has been set to True to receive the full Entity metadata definition, as created by Dataverse, as part of the server response. The response can be parsed with the EntityMetadata.model_validate_dataverse classmethod to get a full-fledged object for editing.

from dataverse_api.metadata.attributes import StringAttributeMetadata
from dataverse_api.metadata.entity import EntityMetadata, define_entity
from dataverse_api.utils.labels import define_label

new_entity = define_entity(
    schema_name="new_name",
    attributes=[
        StringAttributeMetadata(
            schema_name="new_primary_col",
            is_primary_name=True,
            description=define_label("Primary column for Entity."),
            display_name=define_label("Autonumber Column"),
            auto_number_format="{SEQNUM:6}-#-{RANDSTRING:3}",
        )
    ],
    description=define_label("Entity Created by Client"),
    display_name=define_label("Programmatically Created Table"),
)

resp = client.create_entity(new_entity, return_representation=True)
entity_meta = EntityMetadata.model_validate_dataverse(resp.json())

Update existing Entity

You can update an existing Entity definition by retrieving the Entity metadata definition and re-uploading an adjusted version.

Below is a simple example. Note that this method also supports return_representation in the same manner as the DataverseClient.create_entity() method, if you want to return the edited Entity metadata as persisted in Dataverse.

metadata = client.get_entity_definition("new_name")
metadata.display_name.localized_labels[0].label = "Overridden Display Name"

client.update_entity(metadata)

DataverseEntity

Initialize interface with Entity

To initialize an interface with a specific Dataverse Entity, use the DataverseClient.entity() method. It returns a DataverseEntity object that allows interaction with this specific entity.

foo = client.entity(logical_name="foo")

As of now, only LogicalName is supported for instantiating a new DataverseEntity object.

Read

The DataverseEntity.read() method accepts the arguments needed for querying as specified in the Microsoft Dataverse documentation.

A simple example:

data = foo.read(select=["name","address"], filter="salary gt 10000", top=5, order_by="salary desc")
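These arguments correspond to the standard OData query options $select, $filter, $orderby and $top. The sketch below shows how such arguments could be turned into a query string. The build_query helper is hypothetical and is not how the package is known to build its requests.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def build_query(
    select: Optional[list[str]] = None,
    filter_: Optional[str] = None,
    top: Optional[int] = None,
    order_by: Optional[str] = None,
) -> str:
    """Translate read-style arguments into an OData query string."""
    params: dict[str, str] = {}
    if select:
        params["$select"] = ",".join(select)
    if filter_:
        params["$filter"] = filter_
    if order_by:
        params["$orderby"] = order_by
    if top is not None:
        params["$top"] = str(top)
    # Keep "$" and "," readable; encode spaces as %20 rather than "+".
    return urlencode(params, safe="$,", quote_via=quote)


query = build_query(
    select=["name", "address"],
    filter_="salary gt 10000",
    top=5,
    order_by="salary desc",
)
# query == "$select=name,address&$filter=salary%20gt%2010000&$orderby=salary%20desc&$top=5"
```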

Create

To create rows, you can use a DataFrame supported by narwhals or a simple construct like a list of dicts, where each dict contains the data for a single row.

Below is an example of creating rows in the Entity by passing a dataframe and specifying that the creation method should be the CreateMultiple web API Action. The return_created argument can be set to True if you need the IDs as reference.

foo.create(data=df, mode="multiple", return_created=True)

Note that the different modes provide different content when return_created is set to True - the script simply sets a Prefer header to include created data in the server response.

For now the user may choose how to handle this based on the list of requests.Response objects that will be returned by the method.
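As one possible way of handling the returned responses, the sketch below collects row IDs from the OData-EntityId header that Dataverse sets on a successful create. It assumes each response exposes a headers mapping, as requests.Response does. Which modes return responses carrying this header is an assumption, and created_ids is a hypothetical helper.

```python
import re
from types import SimpleNamespace
from typing import Any, Iterable

# Matches a GUID in parentheses at the end of an entity URI.
GUID_IN_URI = re.compile(
    r"\(([0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})\)\s*$"
)


def created_ids(responses: Iterable[Any]) -> list[str]:
    """Extract row GUIDs from the OData-EntityId header of each response."""
    ids: list[str] = []
    for response in responses:
        uri = response.headers.get("OData-EntityId", "")
        match = GUID_IN_URI.search(uri)
        if match:
            ids.append(match.group(1))
    return ids


# Stand-in objects with a headers mapping; the URL and GUID are placeholders.
fake_responses = [
    SimpleNamespace(
        headers={
            "OData-EntityId": "https://example.crm.dynamics.com/api/data/v9.2/"
            "foos(00000000-0000-0000-0000-000000000001)"
        }
    ),
    SimpleNamespace(headers={}),  # responses without the header are skipped
]
ids = created_ids(fake_responses)
```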

Upsert

Upserting data into Dataverse is simple. If you are only updating existing data, you may already have the URI (Primary Attribute ID) in your data, in which case you can omit the alternate_key argument.

foo.upsert(data=df, alternate_key="my_key", mode="batch")
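At the Web API level, an upsert by alternate key is a PATCH request addressed to the row through its key values instead of its ID. The sketch below builds such a resource path. The alternate_key_path helper is hypothetical, handles only string and integer key values, and does not claim to mirror the package's internals. The entity set name "foos" and the key value are placeholders.

```python
from typing import Union


def alternate_key_path(entity_set: str, keys: dict[str, Union[str, int]]) -> str:
    """Build an entity-set path that addresses a row by alternate key."""
    parts: list[str] = []
    for name, value in keys.items():
        if isinstance(value, str):
            # OData string literals are single-quoted; embedded quotes are doubled.
            escaped = value.replace("'", "''")
            parts.append(f"{name}='{escaped}'")
        else:
            parts.append(f"{name}={value}")
    return f"{entity_set}({','.join(parts)})"


path = alternate_key_path("foos", {"my_key": "A-001"})
# path == "foos(my_key='A-001')"
```

Key values with reserved URL characters would still need percent-encoding before being sent, which this sketch leaves out.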

Delete

TBD

Add and remove Attributes

TBD

Add and remove Alternate Keys

TBD
