NucliaDB SDK

The NucliaDB SDK is a Python library designed as a thin wrapper around the NucliaDB HTTP API. It is tailored for developers who wish to create low-level scripts to interact with NucliaDB.

WARNING

If it's your first time using Nuclia, or you want a simple way to push your unstructured data to Nuclia with a script or a CLI, we highly recommend using the Nuclia CLI/SDK instead, as it is much more user-friendly and use-case focused.

Installation

To install it, use pip:

pip install nucliadb-sdk

How to use it?

You can find the auto-generated documentation of the NucliaDB SDK here.

Essentially, each method of the NucliaDB class maps to an HTTP endpoint of the NucliaDB API. The parameters it accepts correspond to the Pydantic models associated with the request body schema of the endpoint.

The method-to-endpoint mappings for the SDK are declared in code, in the _NucliaDBBase class.
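To make the idea of a method-to-endpoint mapping concrete, here is a minimal sketch that does not use the SDK at all. The `ENDPOINTS` dictionary and the `resolve` helper are hypothetical names invented for illustration, and the way _NucliaDBBase actually declares its mappings may differ. The only entry is the path of the resource-creation endpoint discussed next.

```python
# Toy registry: SDK-style method name -> (HTTP verb, path template).
# This dictionary is an illustrative stand-in, not the SDK's own code.
ENDPOINTS = {
    "create_resource": ("POST", "/v1/kb/{kbid}/resources"),
}


def resolve(method_name, **path_params):
    """Return the HTTP verb and the concrete path for a method name."""
    verb, template = ENDPOINTS[method_name]
    # Fill the {placeholders} of the template with the given path parameters
    return verb, template.format(**path_params)


print(resolve("create_resource", kbid="my-kbid"))
# ('POST', '/v1/kb/my-kbid/resources')
```

The resolved path is relative to the base URL given to the client, which is why the examples in this section pass a url ending in /api.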

For instance, to create a resource in your Knowledge Box, the endpoint is defined here.

It has a {kbid} path parameter and expects a JSON payload with optional keys such as slug or title, both of type string. With curl, the command would be:

curl -XPOST http://localhost:8080/api/v1/kb/my-kbid/resources -H 'x-nucliadb-roles: WRITER' --data-binary '{"slug":"my-resource","title":"My Resource"}' -H "Content-Type: application/json"
{"uuid":"fbdb10a79abc45c0b13400f5697ea2ba","seqid":1}

and with the NucliaDB SDK:

>>> from nucliadb_sdk import NucliaDB
>>>
>>> ndb = NucliaDB(region="on-prem", url="http://localhost:8080/api")
>>> ndb.create_resource(kbid="my-kbid", slug="my-resource", title="My Resource")
ResourceCreated(uuid='fbdb10a79abc45c0b13400f5697ea2ba', elapsed=None, seqid=1)

Note that path parameters are mapped to required keyword arguments of the NucliaDB class methods, hence the kbid="my-kbid". Any other keyword arguments passed to the method are sent in the JSON body of the HTTP request.
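The rule above (path parameters are required, everything else goes into the body) can be sketched in a few lines of plain Python. This is only an illustration of the described behaviour under the assumption that paths are expressed as `{name}` templates; `split_arguments` is a hypothetical helper, not a function of the SDK.

```python
from string import Formatter


def split_arguments(path_template, **kwargs):
    """Split keyword arguments into a concrete path and a JSON body."""
    # Names of the {placeholders} that appear in the path template
    path_names = {
        name for _, name, _, _ in Formatter().parse(path_template) if name
    }
    # Path parameters are required: fail loudly if any is missing
    missing = path_names - set(kwargs)
    if missing:
        raise TypeError(f"missing required path parameters: {sorted(missing)}")
    path = path_template.format(**{name: kwargs[name] for name in path_names})
    # Every remaining keyword argument becomes part of the request body
    body = {key: value for key, value in kwargs.items() if key not in path_names}
    return path, body


path, body = split_arguments(
    "/v1/kb/{kbid}/resources",
    kbid="my-kbid",
    slug="my-resource",
    title="My Resource",
)
print(path)  # /v1/kb/my-kbid/resources
print(body)  # {'slug': 'my-resource', 'title': 'My Resource'}
```

Calling the helper without kbid raises a TypeError, mirroring the fact that path parameters cannot be omitted.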

Alternatively, you can set the content parameter and pass an instance of the Pydantic model that the endpoint expects:

>>> from nucliadb_sdk import NucliaDB
>>> from nucliadb_models.writer import CreateResourcePayload
>>> 
>>> ndb = NucliaDB(region="on-prem", url="http://localhost:8080/api")
>>> content = CreateResourcePayload(slug="my-resource", title="My Resource")
>>> ndb.create_resource(kbid="my-kbid", content=content)
ResourceCreated(uuid='fbdb10a79abc45c0b13400f5697ea2ba', elapsed=None, seqid=1)

Every method also accepts query parameters through the query_params argument. For instance:

>>> ndb.get_resource_by_id(kbid="my-kbid", rid="rid", query_params={"show": ["values"]})
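A list value in query_params conventionally becomes a repeated key in the URL's query string. The sketch below shows that convention using only the standard library; it is an assumption about how such a dictionary is typically serialized, not a statement about the SDK's HTTP client, and the second value, "basic", is there purely for illustration.

```python
from urllib.parse import urlencode

# A list value is expanded into one key=value pair per element
query_params = {"show": ["values", "basic"]}
query_string = urlencode(query_params, doseq=True)
print(query_string)  # show=values&show=basic
```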

Example Usage

The following sample script fetches the HTML of a web page, extracts every link it contains and pushes each one to NucliaDB so that it gets processed by Nuclia's processing engine.

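from urllib.parse import urljoin

from nucliadb_models.link import LinkField
from nucliadb_models.writer import CreateResourcePayload
import nucliadb_sdk
import requests
from bs4 import BeautifulSoup


def extract_links_from_url(url):
    # Fetch the page and fail early on HTTP errors
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")
    unique_links = set()
    # Only consider anchors that actually carry an href attribute
    for link in soup.find_all("a", href=True):
        # Resolve relative hrefs against the page URL
        absolute = urljoin(url, link["href"])
        if absolute.startswith(("http://", "https://")):
            unique_links.add(absolute)
    return unique_links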


def upload_link_to_nuclia(ndb, *, kbid, link, tags):
    # Compute title and slug before the try block so that the
    # error messages below can always reference them
    title = link.replace("-", " ")
    slug = "-".join(tags) + "-" + link.split("/")[-1]
    try:
        content = CreateResourcePayload(
            title=title,
            slug=slug,
            links={
                "link": LinkField(
                    uri=link,
                    language="en",
                )
            },
        )
        ndb.create_resource(kbid=kbid, content=content)
        print(f"Resource created from {link}. Title={title} Slug={slug}")
    except nucliadb_sdk.exceptions.ConflictError:
        print(f"Resource already exists: {link} {slug}")
    except Exception as ex:
        print(f"Failed to create resource: {link} {slug}: {ex}")


def main(site):
    # Define the NucliaDB instance with region and URL
    ndb = nucliadb_sdk.NucliaDB(region="on-prem", url="http://localhost:8080/api")

    # Loop through extracted links and upload to NucliaDB
    for link in extract_links_from_url(site):
        upload_link_to_nuclia(ndb, kbid="my-kb-id", link=link, tags=["news"])

if __name__ == "__main__":
    main(site="https://en.wikipedia.org/wiki/The_Lion_King")
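To preview which titles and slugs the script would send without contacting NucliaDB, the two string expressions from upload_link_to_nuclia can be tried in isolation. The helper name `derive_title_and_slug` and the example.com URL are made up for this sketch; the logic is copied unchanged from the script.

```python
def derive_title_and_slug(link, tags):
    """Reproduce the title and slug logic of upload_link_to_nuclia."""
    # Dashes anywhere in the URL are turned into spaces for the title
    title = link.replace("-", " ")
    # The slug joins the tags with the last path segment of the URL
    slug = "-".join(tags) + "-" + link.split("/")[-1]
    return title, slug


print(derive_title_and_slug("https://example.com/wiki/my-page", ["news"]))
# ('https://example.com/wiki/my page', 'news-my-page')
```

Note that a link ending in a slash has an empty last segment, so its slug is just the tags followed by a dash. Several such links would share the same slug and could trigger the ConflictError branch.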

Once the data has been pushed and processed, the NucliaDB SDK can also be used to ask questions about the content of the extracted links.

>>> import nucliadb_sdk
>>> 
>>> ndb = nucliadb_sdk.NucliaDB(region="on-prem", url="http://localhost:8080/api")
>>> resp = ndb.chat(kbid="my-kb-id", query="What does Hakuna Matata mean?")
>>> print(resp.answer)
Hakuna matata is actually a phrase in the East African language of Swahili that literally means “no trouble” or “no problems”.

Conclusion

Explore more features and capabilities of the NucliaDB SDK by referring to the official documentation. We welcome your feedback and contributions to make this SDK even better!
