CluedIn Python SDK
CluedIn
cluedin is a Python SDK for the CluedIn API.
Installation
From PyPI:
pip install cluedin
Quick start
CluedIn context configuration
Create a JSON file with the context configuration for your CluedIn instance.
In this file, the parameters have the following meaning:
- protocol – http if your CluedIn instance is not secured with a TLS certificate. Otherwise, https by default.
- domain – the CluedIn instance domain without the Organization prefix.
- org_name – the name of the Organization (a.k.a. Organization prefix).
- user_email – the user's email.
- user_password – the user's password.
- verify_tls – false if an unknown CA signs the TLS certificate. Otherwise, true by default.
Here is an example of a file for a CluedIn instance running locally from a Home repository:
{
"domain": "mdm.saas-cluedin.com",
"org_name": "foobar",
"user_email": "admin@foobar.com",
"user_password": "Foobar23!"
}
The protocol parameter is omitted here because https is the default; add "protocol": "http" only if your instance is not secured with a TLS certificate.
If you use self-signed certificates, you can add verify_tls: false to avoid certificate verification.
Alternatively, instead of providing an email and password, you can obtain an API access token from the CluedIn UI and provide it in the file:
{
"domain": "mdm.saas-cluedin.com",
"org_name": "foobar",
"access_token": "..."
}
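If you would rather generate the context file from a script, a minimal sketch using only the standard library might look like the following. The helper name write_context_file and the temporary output path are assumptions made for illustration and are not part of the SDK; the keys are the ones described above.

```python
import json
import tempfile
from pathlib import Path


def write_context_file(path: Path, context: dict) -> Path:
    """Write a context dictionary to `path` as JSON (hypothetical helper, not part of the SDK)."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(context, indent=2), encoding="utf-8")
    return path


# Token-based context; the token value is a placeholder you must replace.
token_context = {
    "domain": "mdm.saas-cluedin.com",
    "org_name": "foobar",
    "access_token": "...",
}

# Written to a temporary directory here; in practice use a path such as ~/.cluedin/home.json.
context_path = write_context_file(
    Path(tempfile.mkdtemp()) / "home.json", token_context
)
print(context_path.read_text(encoding="utf-8"))
```

Since the file holds credentials or a token, it is worth keeping it out of version control.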
When the configuration file exists, you can export its path to an environment variable:
export CLUEDIN_CONTEXT=~/.cluedin/home.json
Now, you can load this file from your Python code and get an access token (if not already provided):
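import os
import cluedin
from cluedin import Context
# Load the context from the file referenced by the CLUEDIN_CONTEXT environment variable.
context = Context.from_json_file(os.environ['CLUEDIN_CONTEXT'])
context.get_token() # call it only if access_token is not provided in the context file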
You could also do it without the context file:
context = {
"domain": "mdm.saas-cluedin.com",
"org_name": "foobar",
"user_email": "admin@foobar.com",
"user_password": "Foobar23!"
}
context = Context.from_dict(context)
context.get_token()
Or, you can infer the context from the JWT token:
context = Context.from_jwt(API_TOKEN)
GraphQL
Get entities:
context = Context.from_json_file(os.environ['CLUEDIN_CONTEXT'])
context.get_token()
query = """
query searchEntities($cursor: PagingCursor, $query: String, $pageSize: Int) {
search(
query: $query,
sort: FIELDS,
cursor: $cursor,
pageSize: $pageSize
sortFields: {field: "id", direction: ASCENDING}
) {
totalResults
cursor
entries {
id
name
entityType
}
}
}
"""
variables = {
"query": "*",
"pageSize": 10_000
}
# it's important to request cursor in your GraphQL query,
# so cluedin.gql.entries would be able to request and return all pages
entities = cluedin.gql.entries(context, query, variables)
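To illustrate why the cursor matters, here is a minimal, self-contained sketch of cursor-based paging. It is not the SDK's implementation: fetch_page and FAKE_PAGES are hypothetical stand-ins for a GraphQL search request and its responses, and the stop condition (an empty page or a missing cursor) is an assumption.

```python
from typing import Iterator, Optional

# Fake backend: three pages keyed by cursor (None means "first page").
FAKE_PAGES = {
    None: {"cursor": "page-2", "entries": [{"id": 1}, {"id": 2}]},
    "page-2": {"cursor": "page-3", "entries": [{"id": 3}]},
    "page-3": {"cursor": None, "entries": []},
}


def fetch_page(cursor: Optional[str]) -> dict:
    """Stand-in for one GraphQL search request (hypothetical)."""
    return FAKE_PAGES[cursor]


def iterate_entries() -> Iterator[dict]:
    """Yield entries page by page until a page is empty or has no cursor."""
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page["entries"]
        cursor = page["cursor"]
        if not page["entries"] or cursor is None:
            break


all_ids = [entry["id"] for entry in iterate_entries()]
print(all_ids)
```

Without the cursor in the response, a loop like this has nothing to pass to the next request, so only the first page can be returned.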
API
Environment
- CLUEDIN_REQUEST_TIMEOUT_IN_SECONDS – CluedIn API request timeout (in seconds). If not set, it defaults to 300 (5 minutes).
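The variable can also be set from Python before the SDK is used. The sketch below shows setting it and reading it back with the documented default; get_request_timeout is a hypothetical helper, and how the SDK itself parses the value is an assumption.

```python
import os

DEFAULT_TIMEOUT_IN_SECONDS = 300  # 5 minutes, the default stated above


def get_request_timeout() -> int:
    """Read the timeout from the environment, falling back to the default (hypothetical helper)."""
    raw_value = os.environ.get("CLUEDIN_REQUEST_TIMEOUT_IN_SECONDS")
    return int(raw_value) if raw_value else DEFAULT_TIMEOUT_IN_SECONDS


# Environment variable values are always strings.
os.environ["CLUEDIN_REQUEST_TIMEOUT_IN_SECONDS"] = "60"
print(get_request_timeout())
```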
Context
- cluedin.Context.from_dict(context_dict: dict) -> Context – creates a new Context object from a dict.
- cluedin.Context.from_json_file(file_path: str) -> Context – creates a new Context object from a JSON file.
- cluedin.Context.from_jwt(jwt: str) -> Context – creates a new Context object from a JWT (JSON Web Token, a.k.a. access token or API token).
Account
- cluedin.account.get_users(context: Context, org_id: str = None) -> list – returns all users for the Organization.
- cluedin.account.is_organization_available_response(context: Context, org_name: str) -> dict – checks if a given Organization name is available. This method returns a JSON response serialized into a dict.
- cluedin.account.is_organization_available(context: Context, org_name: str) -> bool – checks if a given Organization name is available. Returns a Boolean.
- cluedin.account.is_user_available_response(context: Context, user_email: str, org_name: str) -> dict – checks whether a user with a given email can be created or the email is already reserved. This method returns a JSON response serialized into a dict.
- cluedin.account.is_user_available(context: Context, user_email: str, org_name: str) -> bool – checks whether a user with a given email can be created or the email is already reserved. Returns a Boolean.
- cluedin.account.get_invitation_code(context: Context, email: str) -> str – returns an invitation code for a given email.
- cluedin.account.create_organization(context: Context, user_email: str, password: str, org_name: str, org_sub_domain: str = None, email_domain: str = None, allow_email_domain_signup: bool = True, new_account_access_key: str = None) -> dict – creates a new Organization. This method returns a JSON response serialized into a dict.
- cluedin.account.create_user(context: Context, user_email: str, user_password: str) -> requests.models.Response – creates a new user. This method returns requests.models.Response.
- cluedin.account.create_admin_user(context: Context, user_email: str, user_password: str) -> requests.models.Response – creates a new admin user. This method returns requests.models.Response.
- cluedin.account.get_user(context: Context, user_id: str = None) -> dict – returns a user by ID. If user_id is not provided, the current user is returned. This method returns a JSON response serialized into a dict.
Entity
- cluedin.entity.get_entity_blob(context: Context, entity_id: str) -> str – returns an entity blob by ID.
- cluedin.entity.get_entity_as_clue(context: Context, entity_id: str) -> str – returns an entity as a clue by ID.
Ingestion
- cluedin.ingestion.post(context: Context, url: str, collection: list[Any], batch_size: int = 10_000, delay_in_seconds: int = 0) -> Generator – posts data to a CluedIn ingestion endpoint. This method splits the collection into batches and sends them to CluedIn. If delay_in_seconds is set, it waits for this time before sending the next batch. Returns a generator of responses.
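The batching behaviour described above can be sketched without any network calls. The code below is an illustration under stated assumptions, not the SDK's code: iter_batches is a hypothetical helper that only slices the collection and optionally sleeps between batches.

```python
import time
from typing import Any, Iterator, List


def iter_batches(
    collection: List[Any], batch_size: int = 10_000, delay_in_seconds: float = 0
) -> Iterator[List[Any]]:
    """Yield consecutive slices of `collection`, sleeping between batches."""
    for start in range(0, len(collection), batch_size):
        # Wait before every batch except the first one.
        if start > 0 and delay_in_seconds:
            time.sleep(delay_in_seconds)
        yield collection[start:start + batch_size]


records = [{"id": i} for i in range(5)]
batch_sizes = [len(batch) for batch in iter_batches(records, batch_size=2)]
print(batch_sizes)
```

Since the documented return type of cluedin.ingestion.post is a generator, the responses are presumably produced lazily, so iterate over the result (for example with a for loop or list(...)) to make sure every batch is actually sent.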
GraphQL
- cluedin.gql.gql(context: Context, query: str, variables: dict = None) -> dict – sends a GraphQL request and returns a response.
- cluedin.gql.org_gql(context: Context, query: str, variables: dict = None) -> dict – sends a GraphQL request to the Organization endpoint and returns a response.
- cluedin.gql.entries(context: Context, query: str, variables: dict = None, flat=False) -> Generator – returns entries from a GraphQL search query. If cursor is requested in the GraphQL query (see the example above and tests), it proceeds to the next pages to return all results. If flat is True, it flattens the properties dictionary of each returned entity.
- search(context: Context, search_query: str, page_size: int = 10_000) -> Generator – returns entities by a search query. This method is a wrapper around cluedin.gql.entries.
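As a rough illustration of what flattening a properties dictionary can mean, the sketch below lifts the nested keys to the top level of the entity. This is one plausible interpretation and an assumption, not the SDK's documented behaviour; flatten_entity and the sample property key are hypothetical.

```python
def flatten_entity(entity: dict) -> dict:
    """Copy `entity`, replacing its nested `properties` dict with top-level keys."""
    flat = {key: value for key, value in entity.items() if key != "properties"}
    flat.update(entity.get("properties") or {})
    return flat


entity = {
    "id": "42",
    "name": "Foobar Ltd",
    "properties": {"organization.website": "example.com"},  # made-up property key
}
print(flatten_entity(entity))
```

A flat shape like this is convenient when loading results into tabular tools, because each property becomes its own column.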
JSON
- cluedin.json.dump(file: str, obj: Any) -> None – serializes obj as a JSON-formatted stream to file.
- cluedin.json.load(file: str) -> Any – deserializes file to a Python object.
JWT
- cluedin.jwt.get_jwt_payload(jwt: str) -> dict – parses a JWT (JSON Web Token, a.k.a. access token or API token) and returns its payload serialized into a dict.
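For readers unfamiliar with the token format, here is a minimal standard-library sketch of reading a JWT payload. It is not the SDK's implementation and it does not verify the signature; decode_jwt_payload, the demo token, and its claim names are made up for illustration.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload segment of a JWT without verifying its signature."""
    _header, payload_b64, _signature = token.split(".")
    # Base64url segments in a JWT omit padding, so restore it before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def _encode_segment(data: dict) -> str:
    """Build an unpadded base64url segment (only used to create the demo token)."""
    raw = json.dumps(data).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


# Demo token with made-up claims and a dummy signature.
claims = {"org": "foobar", "exp": 1735689600}
demo_token = ".".join(
    [_encode_segment({"alg": "none", "typ": "JWT"}), _encode_segment(claims), "signature"]
)
print(decode_jwt_payload(demo_token))
```

Decoding only reads the claims; it says nothing about whether the token is genuine or still valid.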
Public API
- cluedin.public.post_clue(context: Context, clue: str, content_type: str = 'application/xml') -> str – posts a clue in XML or JSON format. This method returns an operation result as a string.
- cluedin.public.restore_user_entities(context: Context) -> list – if you accidentally deleted /Infrastructure/User entities, this method gets all users and restores entities for those who are missing them.
Rules
- cluedin.rules.RuleScope – an enumeration of rule scopes: DATA_PART, ENTITY, SURVIVORSHIP.
- cluedin.rules.get_rules(context: Context, scope=RuleScope.DATA_PART) -> dict – returns all rules for a given scope. This method returns a JSON response serialized into a dict.
- cluedin.rules.get_rule(context: Context, rule_id: str) -> dict – returns a rule by ID. This method returns a JSON response serialized into a dict.
Evaluator
- cluedin.rules.evaluator.default_get_property_name(field: str) -> str – returns a default property name for a given field. Used to map CluedIn Rules fields to your fields.
- cluedin.rules.evaluator.default_get_value(field: str, obj: dict) -> Any – returns a default value for a given field. Used to map CluedIn Rules fields to your fields.
- cluedin.rules.Evaluator – a class to evaluate CluedIn Rules.
- cluedin.rules.Evaluator.evaluate(context: Context, rule: dict, obj: dict) -> bool – evaluates a rule for an object. Returns a Boolean.
- cluedin.rules.get_matching_objects(self, objects) -> list – returns a list of objects that match the rule.
- cluedin.rules.object_matches_rules(self, obj) -> bool – returns True if an object matches the rule.
- cluedin.rules.explain(self) -> str – returns an explanation of the rule (in pandas DataFrame.query terms).
Operators
- cluedin.rules.operators.default_get_operator(operator_id) -> Any – returns a default operator for a given operator ID. Used to map CluedIn Rules operators to your operators.
You can add custom operations (see test_operators.py for examples), but the following CluedIn Rules operators are supported out of the box:
- Is Not True
- Is True
- Begins With
- Between
- Contains
- Ends With
- Equals
- Exists
- Greater
- Greater or Equal
- In
- Is False
- Is Not Null
- Is Null
- Less
- Less or Equal
- Matches pattern
- Not Begins With
- Not Between
- Not Contains
- Not Ends With
- Not Equal
- Does Not Exist
- Not In
- Does not match pattern
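To make the idea of mapping operator names to your own operators concrete, the sketch below pairs a few of the names above with plain Python predicates. It is a hypothetical illustration, not the SDK's operator table: OPERATORS, apply_operator, and the exact semantics of each predicate are assumptions.

```python
import re
from typing import Any, Callable, Dict

# Hypothetical mapping from operator names to two-argument predicates.
OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "Equals": lambda value, operand: value == operand,
    "Not Equal": lambda value, operand: value != operand,
    "Begins With": lambda value, operand: str(value).startswith(operand),
    "Ends With": lambda value, operand: str(value).endswith(operand),
    "Contains": lambda value, operand: operand in str(value),
    "Greater": lambda value, operand: value > operand,
    "Less": lambda value, operand: value < operand,
    "In": lambda value, operand: value in operand,
    "Matches pattern": lambda value, operand: re.search(operand, str(value)) is not None,
    "Is Null": lambda value, _operand: value is None,
}


def apply_operator(name: str, value: Any, operand: Any = None) -> bool:
    """Look up an operator by name and apply it to a value."""
    return OPERATORS[name](value, operand)


print(apply_operator("Begins With", "foobar", "foo"))
print(apply_operator("In", 3, [1, 2, 3]))
print(apply_operator("Is Null", "x"))
```

Keeping operators in a dictionary means a custom operator can be added by registering one more entry rather than changing the evaluation logic.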
Vocabulary
- cluedin.vocab.get_vocab_keys(context: Context) -> list – gets all vocabulary keys.
File details
Details for the file cluedin-3.0.0.tar.gz.
File metadata
- Download URL: cluedin-3.0.0.tar.gz
- Upload date:
- Size: 19.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.8.5 CPython/3.13.1 Darwin/24.3.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2b1e1f529a70022940ab00533111481d7aa6bc4cb3a312b218a4f1e4d6790e51 |
| MD5 | d99c0c34afc0034f03969658dbb19c41 |
| BLAKE2b-256 | 490d0a86cba76a9056b76f947aa8179b2cc752212fdfce608ef4fbb342fcb292 |
File details
Details for the file cluedin-3.0.0-py3-none-any.whl.
File metadata
- Download URL: cluedin-3.0.0-py3-none-any.whl
- Upload date:
- Size: 24.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.8.5 CPython/3.13.1 Darwin/24.3.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e38c8a80b217a96b9ef45d2ffc36a5283c69bb7b694dc3c50a05afac82b1fc9f |
| MD5 | b3c3512d648f393b05704b515452416d |
| BLAKE2b-256 | fc5f57799a2254d6a351d8088f5dd9f95cf1dd6364ece57e6faa060e3a337e0b |