CluedIn Python SDK

CluedIn

cluedin is a Python SDK for the CluedIn API.

Installation

From PyPI:

pip install cluedin

Quick start

CluedIn context configuration

Create a JSON file with the context configuration for your CluedIn instance.

In this file, the parameters have the following meaning:

protocol – http if your CluedIn instance is not secured with a TLS certificate. Otherwise, https (the default).
domain – CluedIn instance domain without the Organization prefix.
org_name – the name of the Organization (a.k.a. Organization prefix).
user_email – the user's email.
user_password – the user's password.
verify_tls – false if the TLS certificate is signed by an unknown CA. Otherwise, true (the default).

Here is an example of such a file:

{
  "domain": "mdm.saas-cluedin.com",
  "org_name": "foobar",
  "user_email": "admin@foobar.com",
  "user_password": "Foobar23!"
}

The example omits protocol because it defaults to https; set it to http only if your instance is not served over TLS. If you use self-signed certificates, you can add "verify_tls": false to skip certificate verification.
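The same file can be generated from Python with the standard library alone. The following is a minimal sketch, not part of the SDK: the helper name write_context_file and the target path are assumptions made for illustration, verify_tls is assumed to be a JSON boolean, and the credentials are the placeholder values from the example above.

```python
import json
import tempfile
from pathlib import Path


def write_context_file(path, domain, org_name, user_email, user_password,
                       protocol=None, verify_tls=None):
    """Write a context file; optional keys are emitted only when set."""
    config = {
        "domain": domain,
        "org_name": org_name,
        "user_email": user_email,
        "user_password": user_password,
    }
    if protocol is not None:
        config["protocol"] = protocol
    if verify_tls is not None:
        config["verify_tls"] = verify_tls
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return path


# Hypothetical location; any readable path works.
context_path = write_context_file(
    Path(tempfile.gettempdir()) / "cluedin" / "home.json",
    domain="mdm.saas-cluedin.com",
    org_name="foobar",
    user_email="admin@foobar.com",
    user_password="Foobar23!",
    verify_tls=False,  # only for certificates signed by an unknown CA
)
```

Because the file holds a password in plain text, it is worth keeping it out of version control and restricting who can read it.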

Alternatively, instead of providing an email and password, you can obtain an API access token from the CluedIn UI and provide it in the file:

{
  "domain": "mdm.saas-cluedin.com",
  "org_name": "foobar",
  "access_token": "..."
}

When the configuration file exists, you can export its path to an environment variable:

export CLUEDIN_CONTEXT=~/.cluedin/home.json

Now, you can load this file from your Python code and get an access token (if not already provided):

import os

import cluedin
from cluedin import Context

context = Context.from_json_file(os.environ['CLUEDIN_CONTEXT'])
context.get_token()  # call it only if access_token is not provided in the context file

You could also do it without the context file:

context_dict = {
    "domain": "mdm.saas-cluedin.com",
    "org_name": "foobar",
    "user_email": "admin@foobar.com",
    "user_password": "Foobar23!"
}

context = Context.from_dict(context_dict)
context.get_token()

Or, you can infer the context from a JWT (an API access token):

context = Context.from_jwt(API_TOKEN)

GraphQL

Get entities:

context = Context.from_json_file(os.environ['CLUEDIN_CONTEXT'])
context.get_token()

query = """
    query searchEntities($cursor: PagingCursor, $query: String, $pageSize: Int) {
      search(
        query: $query,
        sort: FIELDS,
        cursor: $cursor,
        pageSize: $pageSize
        sortFields: {field: "id", direction: ASCENDING}
      ) {
        totalResults
        cursor
        entries {
          id
          name
          entityType
        }
      }
    }
"""

variables = {
    "query": "*",
    "pageSize": 10_000
}

# It is important to request cursor in your GraphQL query,
# so that cluedin.gql.entries can request and return all pages.
entities = cluedin.gql.entries(context, query, variables)
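To illustrate why the cursor matters, here is a self-contained sketch of cursor-based paging. It does not call CluedIn: fetch_page is a hypothetical stand-in for a single GraphQL request, and the stop condition (an empty entries list) is an assumption rather than the SDK's documented behavior.

```python
from typing import Callable, Iterator, Optional

# Fake data source standing in for a server: pages keyed by cursor.
_PAGES = {
    None: {"cursor": "c1", "entries": [{"id": 1}, {"id": 2}]},
    "c1": {"cursor": "c2", "entries": [{"id": 3}]},
    "c2": {"cursor": "c3", "entries": []},
}


def fetch_page(cursor: Optional[str]) -> dict:
    """Hypothetical single request: return one page for the given cursor."""
    return _PAGES[cursor]


def iterate_entries(fetch: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield entries page by page, following the cursor until a page is empty."""
    cursor = None
    while True:
        page = fetch(cursor)
        if not page["entries"]:
            return
        yield from page["entries"]
        cursor = page["cursor"]  # without a cursor there is no way to ask for the next page


all_entries = list(iterate_entries(fetch_page))
```

If the query did not return a cursor, a loop like this would have nothing to pass to the next request and could only return the first page.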

API

Environment

CLUEDIN_REQUEST_TIMEOUT_IN_SECONDS – CluedIn API request timeout in seconds. If not set, it defaults to 300 (5 minutes).
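As a sketch of the documented default, the following reads the variable with a fallback of 300 seconds. How the SDK parses the value internally is not stated here; the function name and the integer parsing are assumptions.

```python
import os

DEFAULT_TIMEOUT_IN_SECONDS = 300  # documented default (5 minutes)


def request_timeout_in_seconds(environ=os.environ) -> int:
    """Return the timeout from the environment, or the documented default."""
    raw = environ.get("CLUEDIN_REQUEST_TIMEOUT_IN_SECONDS")
    return int(raw) if raw else DEFAULT_TIMEOUT_IN_SECONDS
```

Passing a plain dict instead of os.environ makes the function easy to check without touching the real environment.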

Context

cluedin.Context.from_dict(context_dict: dict) -> Context – creates a new Context object from a dict.
cluedin.Context.from_json_file(file_path: str) -> Context – creates a new Context object from a JSON file.
cluedin.Context.from_jwt(jwt: str) -> Context – creates a new Context object from a JWT (JSON Web Token, a.k.a. access token or API token).

Account

cluedin.account.get_users(context: Context, org_id: str = None) -> list – returns all users for an Organization.
cluedin.account.is_organization_available_response(context: Context, org_name: str) -> dict – checks if a given Organization name is available. This method returns a JSON-response serialized into a dict.
cluedin.account.is_organization_available(context: Context, org_name: str) -> bool – checks if a given Organization name is available. Returns a Boolean.
cluedin.account.is_user_available_response(context: Context, user_email: str, org_name: str) -> dict – checks if a user with a given email can be created or if this email is already reserved. This method returns a JSON-response serialized into a dict.
cluedin.account.is_user_available(context: Context, user_email: str, org_name: str) -> bool – checks if a user with a given email can be created or if this email is already reserved. Returns a Boolean.
cluedin.account.get_invitation_code(context: Context, email: str) -> str – returns an invitation code for a given email.
cluedin.account.create_organization(context: Context, user_email: str, password: str, org_name: str, org_sub_domain: str = None, email_domain: str = None, allow_email_domain_signup: bool = True, new_account_access_key: str = None) -> dict – creates a new Organization. This method returns a JSON-response serialized into a dict.
cluedin.account.create_user(context: Context, user_email: str, user_password: str) -> requests.models.Response – creates a new user. This method returns requests.models.Response.
cluedin.account.create_admin_user(context: Context, user_email: str, user_password: str) -> requests.models.Response – creates a new admin user. This method returns requests.models.Response.
cluedin.account.get_user(context: Context, user_id: str = None) -> dict – returns a user by ID. If user_id is not provided, the current user is returned. This method returns a JSON-response serialized into a dict.

Entity

cluedin.entity.get_entity_blob(context: Context, entity_id: str) -> str – returns an entity blob by ID.
cluedin.entity.get_entity_as_clue(context: Context, entity_id: str) -> str – returns an entity as a clue by ID.

Ingestion

cluedin.ingestion.post(context: Context, url: str, collection: list[Any], batch_size: int = 10_000, delay_in_seconds: int = 0) -> Generator – posts data to a CluedIn ingestion endpoint. This method splits the collection into batches and sends them to CluedIn. If delay_in_seconds is set, it waits for that long before sending the next batch. Returns a generator of responses.
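The batching behavior described above can be sketched without any network access. This is an illustration of the technique, not the SDK's implementation: post_in_batches and its send callback are hypothetical names, and the choice to pause between batches rather than before the first one is an assumption.

```python
import time
from typing import Any, Callable, Iterator, List


def post_in_batches(
    collection: List[Any],
    send: Callable[[List[Any]], Any],
    batch_size: int = 10_000,
    delay_in_seconds: float = 0,
) -> Iterator[Any]:
    """Send the collection in slices of batch_size, yielding each result lazily."""
    for start in range(0, len(collection), batch_size):
        if start > 0 and delay_in_seconds:
            time.sleep(delay_in_seconds)  # pause between batches, not before the first
        yield send(collection[start:start + batch_size])


# len stands in for a real HTTP call; it simply reports each batch size.
batch_sizes = list(post_in_batches(list(range(25)), send=len, batch_size=10))
```

Because the function is a generator, nothing in this sketch is sent until the result is iterated, for example with list(...) or a for loop.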

GraphQL

cluedin.gql.gql(context: Context, query: str, variables: dict = None) -> dict – sends a GraphQL request and returns a response.
cluedin.gql.org_gql(context: Context, query: str, variables: dict = None) -> dict – sends a GraphQL request to the Organization endpoint and returns a response.
cluedin.gql.entries(context: Context, query: str, variables: dict = None, flat=False) -> Generator – returns entries from a GraphQL search query. If cursor is requested in the GraphQL query (see the example above and tests), it requests subsequent pages until all results are returned. If flat is True, it flattens the properties dictionary of each returned entity.
search(context: Context, search_query: str, page_size: int = 10_000) -> Generator – returns entities by a search query. This method is a wrapper around cluedin.gql.entries.
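One plausible reading of the flat option is that the nested properties dictionary is merged into the top level of each entry. The sketch below shows that idea only; the function name, the sample property key, and the exact shape of the output are assumptions, not the SDK's documented behavior.

```python
def flatten_entry(entry: dict) -> dict:
    """Return a copy of entry with its nested "properties" dict merged into the top level."""
    flat = {key: value for key, value in entry.items() if key != "properties"}
    flat.update(entry.get("properties") or {})
    return flat


entry = {
    "id": "42",
    "name": "Foo",
    "properties": {"property-user.email": "foo@example.com"},  # made-up key
}
flat_entry = flatten_entry(entry)
```

A flat dictionary per entity is convenient when the results are loaded into a tabular structure, where each key becomes a column.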

JSON

cluedin.json.dump(file: str, obj: Any) -> None – serializes obj as a JSON formatted stream to file.
cluedin.json.load(file: str) -> Any – deserializes file to a Python object.
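For readers who want the same round trip without the SDK, here is a rough standard-library equivalent with the same argument order. It is a sketch under the assumption that file is a path on disk; the encoding and indentation are choices made for this example.

```python
import json
import os
import tempfile
from typing import Any


def dump(file: str, obj: Any) -> None:
    """Serialize obj as JSON into the file at the given path."""
    with open(file, "w", encoding="utf-8") as handle:
        json.dump(obj, handle, indent=2)


def load(file: str) -> Any:
    """Deserialize the JSON file at the given path into a Python object."""
    with open(file, "r", encoding="utf-8") as handle:
        return json.load(handle)


path = os.path.join(tempfile.mkdtemp(), "entities.json")
dump(path, [{"id": 1, "name": "Foo"}])
round_tripped = load(path)
```

Note that the path comes first here, whereas the standard library's json.dump takes the object first and an open file second.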

JWT

cluedin.jwt.get_jwt_payload(jwt: str) -> dict – parses a JWT (JSON Web Token, a.k.a. access token or API token), and returns its payload serialized into a dict.
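Reading a JWT payload needs only base64url decoding of the token's second segment. The sketch below shows that technique with the standard library; it is not the SDK's code, it does not verify the signature, and the demo token and its claim are invented for illustration.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload (second segment) of a JWT without verifying the signature."""
    payload_segment = token.split(".")[1]
    # base64url segments omit padding; restore it before decoding.
    padded = payload_segment + "=" * (-len(payload_segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def _b64url(data: dict) -> str:
    """Encode a dict as an unpadded base64url JSON segment."""
    raw = json.dumps(data).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")


# Build an unsigned demo token; the claim is invented for illustration.
demo_token = ".".join([_b64url({"alg": "none"}), _b64url({"sub": "admin@foobar.com"}), ""])
payload = decode_jwt_payload(demo_token)
```

Since the signature is not checked, a payload decoded this way should be treated as untrusted input.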

Public API

cluedin.public.post_clue(context: Context, clue: str, content_type: str = 'application/xml') -> str – posts a clue in XML or JSON format. This method returns an operation result as a string.
cluedin.public.restore_user_entities(context: Context) -> list – if you accidentally deleted /Infrastructure/User entities, this method gets all users and restores entities for those who miss them.

Rules

cluedin.rules.RuleScope – an enumeration of rule scopes: DATA_PART, ENTITY, SURVIVORSHIP.
cluedin.rules.get_rules(context: Context, scope=RuleScope.DATA_PART) -> dict – returns all rules for a given scope. This method returns a JSON-response serialized into a dict.
cluedin.rules.get_rule(context: Context, rule_id: str) -> dict – returns a rule by ID. This method returns a JSON-response serialized into a dict.

Evaluator

cluedin.rules.evaluator.default_get_property_name(field: str) -> str – returns a default property name for a given field. Used to map CluedIn Rules fields to your fields.
cluedin.rules.evaluator.default_get_value(field: str, obj: dict) -> Any – returns a default value for a given field. Used to map CluedIn Rules fields to your fields.
cluedin.rules.Evaluator – a class to evaluate CluedIn Rules.
cluedin.rules.Evaluator.evaluate(context: Context, rule: dict, obj: dict) -> bool – evaluates a rule for an object. Returns a Boolean:
- cluedin.rules.get_matching_objects(self, objects) -> list – returns a list of objects that match the rule.
- cluedin.rules.object_matches_rules(self, obj) -> bool – returns True if an object matches the rule.
- cluedin.rules.explain(self) -> str – returns an explanation of the rule (in pandas DataFrame.query terms).

Operators

cluedin.rules.operators.default_get_operator(operator_id) -> Any – returns a default operator for a given operator ID. Used to map CluedIn Rules operators to your operators.

You can add custom operations (see test_operators.py for examples), but the following CluedIn Rules operators are supported out of the box:

Is Not True
Is True
Begins With
Between
Contains
Ends With
Equals
Exists
Greater
Greater or Equal
In
Is False
Is Not Null
Is Null
Less
Less or Equal
Matches pattern
Not Begins With
Not Between
Not Contains
Not Ends With
Not Equal
Does Not Exist
Not In
Does not match pattern
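To make the idea of mapping operator names to behavior concrete, here is a small sketch that implements a few of the listed operators as Python predicates. The dictionary, the negate helper, and the exact semantics (case sensitivity, regular expression matching) are assumptions for illustration; the real implementations live in cluedin.rules.operators.

```python
import re

# Hypothetical mapping from operator display names to Python predicates.
OPERATORS = {
    "Begins With": lambda value, arg: str(value).startswith(arg),
    "Ends With": lambda value, arg: str(value).endswith(arg),
    "Contains": lambda value, arg: arg in str(value),
    "Equals": lambda value, arg: value == arg,
    "Is Null": lambda value, arg=None: value is None,
    "Matches pattern": lambda value, arg: re.search(arg, str(value)) is not None,
}


def negate(predicate):
    """Build the negated variant of a predicate."""
    return lambda value, arg=None: not predicate(value, arg)


OPERATORS["Not Contains"] = negate(OPERATORS["Contains"])
OPERATORS["Is Not Null"] = negate(OPERATORS["Is Null"])

matches = OPERATORS["Begins With"]("foobar", "foo")
```

Deriving the negated operators from their positive counterparts keeps each pair consistent by construction.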

Vocabulary

cluedin.vocab.get_vocab_keys(context: Context) -> list – gets all vocabulary keys.

Release history

3.0.0 (this version) – Mar 9, 2025
3.0.0b3 (pre-release) – Mar 9, 2025
3.0.0b2 (pre-release) – Mar 9, 2025
3.0.0b1 (pre-release) – Mar 8, 2025
2.6.0 – Dec 22, 2024
2.6.0b2 (pre-release) – Dec 22, 2024
2.6.0b1 (pre-release) – Dec 22, 2024
2.5.0 – Aug 4, 2024
2.5.0b1 (pre-release) – Aug 4, 2024
2.4.0 – Mar 3, 2024
2.4.0b5 (pre-release) – Feb 29, 2024
2.4.0b4 (pre-release) – Feb 29, 2024
2.4.0b3 (pre-release) – Feb 29, 2024
2.4.0b1 (pre-release) – Feb 25, 2024
2.3.0 – Feb 5, 2024
2.2.0 – Jun 27, 2023
2.1.0 – May 21, 2023
2.1.0rc1 (pre-release) – May 21, 2023
2.0.0 – May 14, 2023
2.0.0rc1 (pre-release) – May 14, 2023
1.0.0 – May 7, 2023
0.2.2 – May 7, 2023
0.2.1 – May 7, 2023
0.2.0 – May 7, 2023
0.1.1 – May 5, 2023
0.1.0 – May 4, 2023

