
Carol Python API and Tools


PyCarol

PyCarol is a Python SDK designed to support data ingestion and data access workflows on Carol. It provides abstractions for authentication, connector and staging management, data ingestion, and querying, enabling reliable integration with Carol services using Python. The SDK encapsulates low-level API communication and authentication logic, making data pipelines easier to build, maintain, and operate.


Getting Started

Run pip install pycarol to install the latest stable version from PyPI. Documentation is hosted on Read the Docs.

Explicit authentication methods

The Carol object is the main entry point to pyCarol and the Carol APIs.

Using user/password

from pycarol import PwdAuth, Carol

carol = Carol(
    domain=TENANT_NAME,
    app_name=APP_NAME,
    auth=PwdAuth(USERNAME, PASSWORD),
    organization=ORGANIZATION
)

Using Tokens

from pycarol import PwdKeyAuth, Carol

carol = Carol(
    domain=TENANT_NAME,
    app_name=APP_NAME,
    auth=PwdKeyAuth(pwd_auth_token),
    organization=ORGANIZATION
)

Using API Key

from pycarol import ApiKeyAuth, Carol

carol = Carol(
    domain=DOMAIN,
    app_name=APP_NAME,
    auth=ApiKeyAuth(api_key=X_AUTH_KEY),
    connector_id=CONNECTORID,
    organization=ORGANIZATION
)

Setting up Carol entities

from pycarol import Connectors

connector_id = Connectors(carol).create(
    name="my_connector",
    label="connector_label"
)

Sending Data

from pycarol import Staging

Staging(carol).send_data(
    staging_name="my_stag",
    data=[{"name": "Rafael"}],
    connector_id=CONNECTORID
)
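send_data also accepts a step_size parameter (used in the batch example later in this section) that controls how many records are sent per intake request. A minimal sketch of that chunking, under the assumption that step_size simply means records per request:

```python
def chunk_records(records, step_size):
    """Split a list of records into intake-sized chunks."""
    for start in range(0, len(records), step_size):
        yield records[start:start + step_size]

data = [{"name": n} for n in ("Ana", "Bia", "Caio", "Duda", "Eva")]
for chunk in chunk_records(data, step_size=2):
    print(len(chunk))  # 2, 2, 1
```

Smaller chunks mean more (but lighter) requests; tune step_size to trade request count against payload size.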

Batch ingestion (Staging batch API)

To group multiple send_data() calls under one batch (e.g., so Carol can process them as a unit), use start_batch() and end_batch(). Each request is tagged with a batchId and a batchIdSequence. If you do not start a batch explicitly, one is auto-started and auto-ended around a single send_data() call.

  • ``Staging.start_batch()``: Starts a batch, generates a batchId, returns it.

  • ``Staging.end_batch()``: Sends the batch summary to Carol and clears the current batch.

  • ``Staging.send_data()``: When a batch is active, appends batchId and batchIdSequence to the intake URL.
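The bookkeeping described above can be sketched in plain Python. This is an illustration of how batchId and batchIdSequence tagging fits together, not pycarol's internal implementation:

```python
import uuid

class BatchTagger:
    """Illustrative bookkeeping for batch-tagged intake requests."""

    def __init__(self):
        self.batch_id = None
        self.sequence = 0

    def start_batch(self):
        # one batchId is shared by every request in the batch
        self.batch_id = str(uuid.uuid4())
        self.sequence = 0
        return self.batch_id

    def tag_request(self, url):
        # each intake call gets the batchId plus an incrementing sequence
        self.sequence += 1
        return f"{url}?batchId={self.batch_id}&batchIdSequence={self.sequence}"

    def end_batch(self):
        # report the finished batch and clear the current state
        finished = self.batch_id
        self.batch_id = None
        self.sequence = 0
        return finished

tagger = BatchTagger()
tagger.start_batch()
print(tagger.tag_request("/v2/staging/intake"))  # ...batchIdSequence=1
print(tagger.tag_request("/v2/staging/intake"))  # ...batchIdSequence=2
tagger.end_batch()
```

The sequence number lets the server detect missing or out-of-order intake requests within a batch.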

from pycarol import Carol, Staging
from dotenv import load_dotenv
load_dotenv()

carol = Carol()
json_ex = [
    {"name": "Rafael", "email": {"type": "email", "email": "rafael@totvs.com.br"}},
    {"name": "Leandro", "email": {"type": "email", "email": "Leandro@totvs.com.br"}},
]
staging = Staging(carol)

# Single send_data: batch is generated internally
staging.send_data(staging_name="test_batch", data=json_ex, step_size=1,
                  connector_id=CONNECTORID, print_stats=True)

# User-managed batch for multiple intake calls
staging.start_batch()
staging.send_data(staging_name="test_batch", data=json_ex, step_size=1,
                  connector_id=CONNECTORID, print_stats=True)
staging.send_data(staging_name="test_batch", data=json_ex, step_size=4,
                  connector_id=CONNECTORID, print_stats=True)
staging.end_batch()

Reading data

from pycarol import BQ, Carol

BQ(Carol()).query("SELECT * FROM stg_connectorname_tablename")

Carol In Memory

PyCarol provides an easy way to work with in-memory data using the Memory class, built on top of DuckDB. Queries are executed locally over in-memory data, without triggering BigQuery jobs or consuming BigQuery slots, and results are returned as pandas DataFrames. The recommended usage is with BQStorage objects.

from pycarol import Carol, Memory, BQStorage
from dotenv import load_dotenv

load_dotenv()
carol = Carol()

storage = BQStorage(carol)
memory = Memory()

t = storage.query(
    "ingestion_stg_connectorname_tablename",
    column_names=["tenantid", "processing", "_ingestionDatetime"],
    max_stream_count=50
)
memory.add("my_table", t)

table = memory.query("SELECT * FROM my_table")
print(table)

Carol In Memory queries follow DuckDB SQL syntax.

Logging

Prerequisites

Set the LONGTASKID environment variable when running locally, so that log records can be attached to the corresponding long task.

Logging messages to Carol

import logging
from pycarol import CarolHandler, Carol

logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)  # the default effective level (WARNING) would drop info messages
logger.addHandler(CarolHandler(Carol()))
logger.info("Hello Carol")

Notes

  • Logs are linked to Carol long tasks

  • When no long task ID is available, log records fall back to the console

Calling Carol APIs

In addition to the high-level abstractions provided by pyCarol, it is also possible to call Carol APIs directly when needed. This is useful for endpoints that are not yet covered by specific SDK methods.

# {carol_app_id} is a placeholder for the target Carol App ID
carol.call_api(
    "v1/tenantApps/subscribe/carolApps/{carol_app_id}",
    method="POST"
)

Settings

from pycarol.apps import Apps
Apps(carol).get_settings(app_name="my_app")

Useful Functions

from pycarol.functions import track_tasks
track_tasks(carol, ["task1", "task2"])
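track_tasks waits on Carol long tasks until they finish. The generic polling pattern behind that kind of helper can be sketched in plain Python; the get_status callable and the "COMPLETED"/"FAILED" status strings below are stand-ins for illustration, not the pycarol API:

```python
import time

def wait_for_tasks(task_ids, get_status, poll_interval=1.0):
    """Poll until every task reaches a terminal state.

    get_status is a caller-supplied callable mapping a task id to a
    status string; "COMPLETED" and "FAILED" are treated as terminal.
    """
    pending = set(task_ids)
    results = {}
    while pending:
        for task_id in list(pending):
            status = get_status(task_id)
            if status in ("COMPLETED", "FAILED"):
                results[task_id] = status
                pending.discard(task_id)
        if pending:
            time.sleep(poll_interval)
    return results

# usage with a stub status function that completes immediately
statuses = wait_for_tasks(["task1", "task2"], lambda t: "COMPLETED")
print(sorted(statuses.items()))
```

A real implementation would typically also time out and surface task logs; the sketch only shows the polling loop.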

Release process

  1. Open PR to main

  2. Merge after approval

  3. Update README if needed

Made with ❤ at TOTVS IDeIA
