A wrapper around the requests session object to make interacting with the DSpace API easier

DSpace Requests Wrapper

A wrapper around the Requests library's Session which makes API calls to DSpace easier to manage.

This library provides a class, DSpaceSession, which handles username-and-password authentication. It does not handle Shibboleth authentication.

It handles authentication bearer tokens and CSRF tokens on behalf of the user.

The endpoint URL should be in the form "https://your.dspace.domain.here.org/server". It should present the user with the HAL browser when visiting from a web browser. Most API endpoints append "/api/path/here" to the server endpoint, except the actuator endpoints.

Per the DSpace RestContract documentation (https://github.com/DSpace/RestContract/blob/main/csrf-tokens.md):
"The client MUST store/keep a copy of this CSRF token (usually by watching for the DSPACE-XSRF-TOKEN header in every response), and update that stored copy whenever a new token is sent."
In DSpaceSession, we override the request() method so that on every call to the API, the session's X-XSRF-TOKEN request header is updated with new versions of the CSRF token from the DSPACE-XSRF-TOKEN response header.
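
The token bookkeeping described above can be sketched without any network access. This is a minimal illustration, not the library's actual code: `update_csrf_header` is a hypothetical helper, and plain dicts stand in for the session and response headers (real HTTP header lookups are case-insensitive, which a plain dict is not).

```python
def update_csrf_header(session_headers, response_headers):
    """Copy a fresh CSRF token from a response into the session's request headers.

    Hypothetical helper for illustration. Returns True if the stored token changed.
    """
    token = response_headers.get("DSPACE-XSRF-TOKEN")
    if not token:
        # The server did not send a new token, so keep the stored copy.
        return False
    changed = session_headers.get("X-XSRF-TOKEN") != token
    session_headers["X-XSRF-TOKEN"] = token
    return changed


session_headers = {}
update_csrf_header(session_headers, {"DSPACE-XSRF-TOKEN": "token-1"})
update_csrf_header(session_headers, {"Content-Type": "application/json"})
update_csrf_header(session_headers, {"DSPACE-XSRF-TOKEN": "token-2"})
print(session_headers["X-XSRF-TOKEN"])  # token-2
```

Running this step after every response is what keeps the next request's X-XSRF-TOKEN header in sync with the server.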

On initialization, DSpaceSession also sends a form-encoded POST request to /api/authn/login with the provided username and password. The API returns a JWT bearer token, which is stored in the session's Authorization header. Before each request, DSpaceSession checks whether the stored bearer token is within 5 minutes of expiring. If it is, DSpaceSession sends a POST request to /api/authn/login with no parameters and stores the new bearer token in the session's Authorization header.
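
One way such an expiry check could work is sketched below. How the library actually reads the expiry time is not stated here; this sketch assumes the JWT carries the standard `exp` claim (seconds since the epoch). The names `jwt_expiry`, `needs_refresh`, and `make_unsigned_token` are hypothetical, and the token is only decoded, never verified.

```python
import base64
import json
import time

REFRESH_MARGIN_SECONDS = 5 * 60  # the 5-minute window described above


def jwt_expiry(token):
    """Return the `exp` claim from a JWT payload without verifying the signature."""
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url-encoded with the padding stripped, so restore it.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    return payload["exp"]


def needs_refresh(token, now=None):
    """True if the token expires within REFRESH_MARGIN_SECONDS of `now`."""
    now = time.time() if now is None else now
    return jwt_expiry(token) - now < REFRESH_MARGIN_SECONDS


def make_unsigned_token(exp):
    """Build a fake, unsigned JWT for this demonstration only."""
    def b64(obj):
        raw = json.dumps(obj).encode()
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()
    return ".".join([b64({"alg": "none"}), b64({"exp": exp}), ""])


token = make_unsigned_token(exp=1_000_000)
print(needs_refresh(token, now=1_000_000 - 600))  # False: 10 minutes left
print(needs_refresh(token, now=1_000_000 - 299))  # True: under 5 minutes left
```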

Unlike the underlying Session class, DSpaceSession can raise exceptions on initialization: if the username or password argument is empty, if the server endpoint URL does not start with http:// or https://, or if the initial authentication request to /api/authn/login fails with a non-200 status code.
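
The argument checks described above might look roughly like the following. This is only a sketch: `check_session_arguments` is a hypothetical function, and raising `ValueError` is an assumption, since the exception types the library actually raises are not stated here.

```python
def check_session_arguments(server_endpoint, username, password):
    """Reject arguments that could never produce a working session.

    Hypothetical helper; the ValueError type is an assumption.
    """
    if not username:
        raise ValueError("username must not be empty")
    if not password:
        raise ValueError("password must not be empty")
    if not server_endpoint.startswith(("http://", "https://")):
        raise ValueError("server endpoint must start with http:// or https://")


# A well-formed set of arguments passes silently.
check_session_arguments("https://your.dspace.here/server", "auserhere", "hunter42")

# A malformed endpoint is rejected before any request is sent.
try:
    check_session_arguments("your.dspace.here/server", "auserhere", "hunter42")
except ValueError as error:
    print(error)
```

Failing early like this means a typo in the endpoint surfaces at construction time rather than on the first API call.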

Also unlike the underlying Session class, DSpaceSession will raise an exception when making a request if the bearer token needs to be refreshed and the refresh request to /api/authn/login fails with a non-200 status code.

Since the session knows the server endpoint, you can make requests to URLs like "/actuator/health" or "/api/core/communities", and the server endpoint will be automatically prepended.
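
The URL handling can be pictured with a small helper. `resolve_url` is a hypothetical name, and stripping a trailing slash from the endpoint is an assumption made here to avoid a doubled slash; the library's own handling may differ.

```python
def resolve_url(server_endpoint, url):
    """Prepend the server endpoint to paths that start with "/".

    Hypothetical helper; absolute URLs are returned unchanged.
    """
    if url.startswith("/"):
        return server_endpoint.rstrip("/") + url
    return url


endpoint = "https://your.dspace.here/server"
print(resolve_url(endpoint, "/actuator/health"))
# https://your.dspace.here/server/actuator/health
print(resolve_url(endpoint, "https://your.dspace.here/server/api/core/communities"))
# https://your.dspace.here/server/api/core/communities
```

Leaving absolute URLs untouched matters for HAL links such as the "next" link in search results, which the API returns as full URLs.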

Examples

Simple GET to /actuator/info

import pprint
import dspace_requests_wrapper

s = dspace_requests_wrapper.DSpaceSession("https://your.dspace.here/server", "auserhere", "hunter42")

# Make a GET request to https://your.dspace.here/server/actuator/info with valid CSRF and Authorization headers:
pprint.pprint(s.get("/actuator/info").json())

Perform a search

import dspace_requests_wrapper

s = dspace_requests_wrapper.DSpaceSession("https://your.dspace.here/server", "auserhere", "hunter42")

search_url = "/api/discover/search/objects?query=spiders"
while True:
    response = s.get(search_url) # If the URL starts with /, the server endpoint is prepended
    response.raise_for_status()
    results = response.json()
    for result in results["_embedded"]["searchResult"]["_embedded"]["objects"]:
        print(result["hitHighlights"])
        print(result["_links"]["indexableObject"]["href"])
    if "next" in results["_embedded"]["searchResult"]["_links"]:
        search_url = results["_embedded"]["searchResult"]["_links"]["next"]["href"]
    else:
        break
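
The pagination loop above can also be wrapped in a generator so that callers simply iterate over results. The sketch below is an illustration under assumptions: `iter_search_objects` and its `fetch_json` parameter are hypothetical names, and canned pages stand in for the API so that the example runs without a server.

```python
def iter_search_objects(fetch_json, first_url):
    """Yield every object from a paged search, following "next" links.

    fetch_json is any callable that takes a URL and returns the decoded JSON page.
    """
    url = first_url
    while url:
        search_result = fetch_json(url)["_embedded"]["searchResult"]
        yield from search_result["_embedded"]["objects"]
        # The last page has no "next" link, which ends the loop.
        url = search_result["_links"].get("next", {}).get("href")


# Canned pages shaped like the search responses used above.
pages = {
    "/page/0": {"_embedded": {"searchResult": {
        "_embedded": {"objects": [{"id": 1}, {"id": 2}]},
        "_links": {"next": {"href": "/page/1"}},
    }}},
    "/page/1": {"_embedded": {"searchResult": {
        "_embedded": {"objects": [{"id": 3}]},
        "_links": {},
    }}},
}

found = list(iter_search_objects(pages.__getitem__, "/page/0"))
print([obj["id"] for obj in found])  # [1, 2, 3]
```

With a real session, `fetch_json` could be a small function that calls `s.get(url)`, then `raise_for_status()`, and returns `response.json()`.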

Create an item, add the ORIGINAL bundle and upload a bitstream to that bundle

import json
import pprint
import dspace_requests_wrapper

s = dspace_requests_wrapper.DSpaceSession("https://your.dspace.here/server", "auserhere", "hunter42")

# Create the item
example_item = {
    "name": "Practices of research data curation in institutional repositories: A qualitative view from repository staff",
    "metadata": {
        "dc.contributor.author": [
            {
                "value": "Stvilia, Besiki",
                "language": "en",
                "authority": None,
                "confidence": -1,
            }
        ],
        "dc.title": [
            {
                "value": "Practices of research data curation in institutional repositories: A qualitative view from repository staff",
                "language": "en",
                "authority": None,
                "confidence": -1,
            }
        ],
        "dc.type": [
            {
                "value": "Journal Article",
                "language": "en",
                "authority": None,
                "confidence": -1,
            }
        ],
    },
    "inArchive": True,
    "discoverable": True,
    "withdrawn": False,
    "type": "item",
}
response = s.post(
    "/api/core/items?owningCollection=A_COLLECTION_UUID",
    json=example_item,
)
response.raise_for_status()
print("Response from API when creating the item:")
pprint.pprint(response.json())
print("")
item_uuid = response.json()["uuid"]

# GET the item metadata
response = s.get("/api/core/items/" + item_uuid)
response.raise_for_status()
print("Item JSON data from API:")
pprint.pprint(response.json())
print("")

# Create the bundle
response = s.post(
    "/api/core/items/" + item_uuid + "/bundles",
    json={"name": "ORIGINAL"},
)
response.raise_for_status()
print("Response from API when creating the bundle:")
pprint.pprint(response.json())
print("")
bundle_uuid = response.json()["uuid"]

# GET the bundle metadata
response = s.get("/api/core/bundles/" + bundle_uuid)
response.raise_for_status()
print("Bundle JSON data from API:")
pprint.pprint(response.json())
print("")

# Add a bitstream to the bundle
example_pdf_filepath = "/tmp/example.pdf"
example_pdf_metadata = {
    # The name is optional.
    "name": "example_file_name_in_metadata.pdf",
    # The metadata here is optional as well.
    "metadata": {
        "dc.description": [
            {
                "value": "example file",
                "language": None,
                "authority": None,
                "confidence": -1,
                "place": 0,
            }
        ]
    },
}
# Open the file in binary mode with "rb".
with open(example_pdf_filepath, "rb") as example_pdf:
    # You can just use the open file handle as the value of file. The literal filename on disk will be used.
    # bitstream_data = {"file": example_pdf}
    # You can include the file name and mime type in a tuple. The mimetype doesn't matter, DSpace will assign a format using the extension and the format registry.
    # bitstream_data = {"file": ("example_name_here.pdf", example_pdf, "application/msword")} # application/msword is ignored
    # You can also add bitstream properties as JSON under the properties key.
    # bitstream_data = {"file": example_pdf, "properties": (None, json.dumps(example_pdf_metadata), "application/json")}
    # The filename in the properties 'wins', it will be used instead of the supplied filename in the tuple under the 'file' key.
    # bitstream_data = {"file": ("example_name_here_will_not_be_used.pdf", example_pdf), "properties": (None, json.dumps(example_pdf_metadata), "application/json")}
    # It should be possible to ensure the correct MD5 hash and Content Length by adding headers
    # to that part of the form data, at least according to the RestContract documentation.
    # In practice, we haven't been able to get this working. Incorrect headers here do not trigger a 412 HTTP error.
    # Instead, the JSON response has a "checkSum" field which we check against file hashes manually after upload, which we don't include here in this example.
    # bitstream_data = {"file": ("example.pdf", example_pdf, "application/pdf", {"Content-MD5": "MD5-HERE", "Content-Length": "100"})}

    # For this example, let's use a simple file handle for the 'file' key and add the properties under 'properties'.
    bitstream_data = {
        "file": example_pdf,
        "properties": (None, json.dumps(example_pdf_metadata), "application/json"),
    }

    response = s.post(
        "/api/core/bundles/" + bundle_uuid + "/bitstreams",
        files=bitstream_data,
    )
    response.raise_for_status()
    print("Response from API when creating the bitstream:")
    pprint.pprint(response.json())
    print("")
    bitstream_uuid = response.json()["uuid"]

# GET the bitstream metadata
response = s.get("/api/core/bitstreams/" + bitstream_uuid)
response.raise_for_status()
print("Bitstream JSON data from API:")
pprint.pprint(response.json())
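
The manual checksum comparison mentioned in the upload comments is not included above. A rough sketch follows, assuming the bitstream JSON carries a "checkSum" object with "checkSumAlgorithm" and "value" keys; verify that shape against your own server's responses. `file_md5` and `checksum_matches` are hypothetical helper names.

```python
import hashlib
import os
import tempfile


def file_md5(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so that large files are not read into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(bitstream_json, path):
    """Compare the checksum reported in the bitstream JSON with the local file."""
    checksum = bitstream_json.get("checkSum") or {}
    if checksum.get("checkSumAlgorithm") != "MD5":
        # Assumption: only MD5 is handled in this sketch.
        return False
    return checksum.get("value", "").lower() == file_md5(path)


# Demonstration with a temporary file and a hand-built response.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example bytes")
fake_response_json = {
    "checkSum": {
        "checkSumAlgorithm": "MD5",
        "value": hashlib.md5(b"example bytes").hexdigest(),
    }
}
matches = checksum_matches(fake_response_json, tmp.name)
os.remove(tmp.name)
print(matches)  # True
```

After a real upload, the call would be along the lines of `checksum_matches(response.json(), example_pdf_filepath)`.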

Large bitstream uploads using chunked encoding

import json
import pprint
import dspace_requests_wrapper
from requests_toolbelt.multipart import encoder

s = dspace_requests_wrapper.DSpaceSession("https://your.dspace.here/server", "auserhere", "hunter42")

# Let's say this is a large file, so large that it should not be held in memory by requests.
large_file_filepath = "/tmp/bigfile.mp4"
# We can use requests_toolbelt.multipart to chunk the request.
# This allows us to continue to use the properties field.
large_file_metadata = {
    "metadata": {
        "dc.description": [
            {
                "value": "example large file",
                "language": None,
                "authority": None,
                "confidence": -1,
                "place": 0,
            }
        ]
    },
}

with open(large_file_filepath, "rb") as large_file:
    files = {
        # "file": large_file, # This does NOT work! You MUST use a tuple here with the filename.
        "file": ("bigfile.mp4", large_file),
        "properties": (None, json.dumps(large_file_metadata), "application/json"),
    }

    # Name the encoder instance "e" so that it does not shadow the imported "encoder" module.
    e = encoder.MultipartEncoder(files)
    # You can monitor the upload by using the MultipartEncoderMonitor instead of the encoder as the data
    # parameter value.
    #monitor = encoder.MultipartEncoderMonitor(e, lambda a: print(a.bytes_read, end="\r"))

    response = s.post(
        "/api/core/bundles/A_BUNDLE_UUID/bitstreams",
        data=e,
        #data=monitor,
        headers={"Content-Type": e.content_type},
    )

    # A possible workaround (?) for the hardcoded read size mentioned here:
    # https://toolbelt.readthedocs.io/en/latest/uploading-data.html#requests_toolbelt.multipart.encoder.MultipartEncoder
    # In practice, our upload speed was the same with or without using the generator. YMMV

    # A generator function, which yields 16384 byte chunks of the underlying file.
    #def gen():
    #    a = e.read(16384)
    #    while a:
    #        yield a
    #        a = e.read(16384)
    #
    #response = s.post(
    #    "/api/core/bundles/A_BUNDLE_UUID/bitstreams",
    #    data=gen(),
    #    headers={"Content-Type": e.content_type},
    #)
    response.raise_for_status()
    print("Response from API when creating the bitstream:")
    pprint.pprint(response.json())
    print("")
    bitstream_uuid = response.json()["uuid"]

response = s.get("/api/core/bitstreams/" + bitstream_uuid)
response.raise_for_status()
print("Bitstream JSON data from API:")
pprint.pprint(response.json())

Release history

1.0.3: Jul 17, 2025
1.0.2: Jul 17, 2025
1.0.0: Jul 17, 2025

