Carol Python API and Tools

PyCarol

PyCarol is a Python SDK designed to support data ingestion and data access workflows on Carol. It provides abstractions for authentication, connector and staging management, data ingestion, and querying, enabling reliable integration with Carol services using Python. The SDK encapsulates low-level API communication and authentication logic, making data pipelines easier to build, maintain, and operate.

Getting Started

Run pip install pycarol to install the latest stable version from PyPI. Documentation is hosted on Read the Docs.

Explicit authentication methods

Carol is the main entry point for pyCarol and the Carol APIs. Every other class in the SDK takes a Carol instance as its first argument.

Using user/password

from pycarol import PwdAuth, Carol

carol = Carol(
    domain=TENANT_NAME,
    app_name=APP_NAME,
    auth=PwdAuth(USERNAME, PASSWORD),
    organization=ORGANIZATION
)

Using Tokens

from pycarol import PwdKeyAuth, Carol

carol = Carol(
    domain=TENANT_NAME,
    app_name=APP_NAME,
    auth=PwdKeyAuth(pwd_auth_token),
    organization=ORGANIZATION
)

Using API Key

from pycarol import ApiKeyAuth, Carol

carol = Carol(
    domain=DOMAIN,
    app_name=APP_NAME,
    auth=ApiKeyAuth(api_key=X_AUTH_KEY),
    connector_id=CONNECTORID,
    organization=ORGANIZATION
)
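The placeholders used above (TENANT_NAME, APP_NAME, ORGANIZATION) are usually kept out of source code. A minimal sketch of collecting them from environment variables follows. The variable names MY_CAROL_TENANT, MY_CAROL_APP and MY_CAROL_ORG and the helper load_settings are hypothetical, chosen for this example only; they are not necessarily the names pyCarol itself reads.

```python
import os

# Hypothetical variable names for this sketch only; pyCarol may use others.
REQUIRED = ("MY_CAROL_TENANT", "MY_CAROL_APP", "MY_CAROL_ORG")


def load_settings(env=None):
    """Collect connection settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    # Fail early with one message that lists every missing value.
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("missing settings: " + ", ".join(missing))
    return {
        "domain": env["MY_CAROL_TENANT"],
        "app_name": env["MY_CAROL_APP"],
        "organization": env["MY_CAROL_ORG"],
    }


# Usage with pyCarol (not executed in this sketch):
#   settings = load_settings()
#   carol = Carol(auth=PwdAuth(USERNAME, PASSWORD), **settings)
```

The returned keys match the keyword arguments shown in the examples above, so the dictionary can be unpacked directly into Carol(...).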

Setting up Carol entities

from pycarol import Connectors

connector_id = Connectors(carol).create(
    name="my_connector",
    label="connector_label"
)

Sending Data

from pycarol import Staging

Staging(carol).send_data(
    staging_name="my_stag",
    data=[{"name": "Rafael"}],
    connector_id=CONNECTORID
)

Staging batch API: Batch ingestion

To group multiple send_data() calls under one batch (e.g. for Carol to process as a unit), use start_batch() and end_batch(). Each request is tagged with batchId and batchIdSequence. If you do not start a batch explicitly, a batch is auto-started and auto-ended around a single send_data() call.

  • Staging.start_batch(): Starts a batch, generates a batchId, and returns it.

  • Staging.end_batch(): Sends the batch summary to Carol and clears the current batch.

  • Staging.send_data(): When a batch is active, appends batchId and batchIdSequence to the intake URL.

from pycarol import Carol, Staging
from dotenv import load_dotenv
load_dotenv()

carol = Carol()
json_ex = [
    {"name": "Rafael", "email": {"type": "email", "email": "rafael@totvs.com.br"}},
    {"name": "Leandro", "email": {"type": "email", "email": "Leandro@totvs.com.br"}},
]
staging = Staging(carol)

# Single send_data: batch is generated internally
staging.send_data(staging_name="test_batch", data=json_ex, step_size=1,
                  connector_id=CONNECTORID, print_stats=True)

# User-managed batch for multiple intake calls
staging.start_batch()
staging.send_data(staging_name="test_batch", data=json_ex, step_size=1,
                  connector_id=CONNECTORID, print_stats=True)
staging.send_data(staging_name="test_batch", data=json_ex, step_size=4,
                  connector_id=CONNECTORID, print_stats=True)
staging.end_batch()
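The batch behaviour described above can be modelled in a few lines of plain Python. This is a toy sketch, not pyCarol's implementation: the class BatchTagger, the UUID-based batch ID, the sequence starting at 0, and the shape of the summary are all assumptions made for illustration.

```python
import uuid
from urllib.parse import urlencode


class BatchTagger:
    """Toy model of explicit and auto-managed batches (not pyCarol code)."""

    def __init__(self):
        self.batch_id = None
        self.sequence = 0

    def start_batch(self):
        # Generate a fresh batch ID and reset the per-batch counter.
        self.batch_id = uuid.uuid4().hex
        self.sequence = 0
        return self.batch_id

    def end_batch(self):
        # Build a summary (assumed shape) and clear the current batch.
        summary = {"batchId": self.batch_id, "requests": self.sequence}
        self.batch_id = None
        self.sequence = 0
        return summary

    def tag(self, url):
        """Append batchId and batchIdSequence to one intake URL."""
        auto = self.batch_id is None
        if auto:
            self.start_batch()  # auto-start around a single call
        query = urlencode({"batchId": self.batch_id,
                           "batchIdSequence": self.sequence})
        self.sequence += 1
        if auto:
            self.end_batch()  # auto-end right after the single call
        separator = "&" if "?" in url else "?"
        return url + separator + query
```

Calling tag() twice between start_batch() and end_batch() yields the same batchId with consecutive sequence numbers, while a lone tag() call gets a batch of its own.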

Reading data

from pycarol import BQ, Carol

BQ(Carol()).query("SELECT * FROM stg_connectorname_tablename")

Carol In Memory

PyCarol provides an easy way to work with in-memory data using the Memory class, built on top of DuckDB. Queries are executed locally over in-memory data, without triggering BigQuery jobs or consuming BigQuery slots, and results are returned as pandas DataFrames. The recommended usage is with BQStorage objects.

BQStorage accepts an optional dataset_id argument. If it is omitted, Carol's dataset is used.

from pycarol import Carol, Memory, BQStorage
from dotenv import load_dotenv

load_dotenv()
carol = Carol()

storage = BQStorage(carol)
memory = Memory()

t = storage.query(
    "ingestion_stg_connectorname_tablename",
    column_names=["tenantid", "processing", "_ingestionDatetime"],
    max_stream_count=50
)
memory.add("my_table", t)

table = memory.query("SELECT * FROM my_table")
print(table)

Carol In Memory queries use DuckDB SQL syntax.

Logging

Prerequisites

Set the LONGTASKID environment variable when running locally, so that log messages can be attached to a task.

Logging messages to Carol

import logging
from pycarol import CarolHandler, Carol

logger = logging.getLogger(__name__)
logger.addHandler(CarolHandler(Carol()))
logger.info("Hello Carol")

Notes

  • Logs are linked to long tasks in Carol.

  • If the task ID is missing, the handler falls back to console logging.
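The console fallback noted above can be illustrated with a small standard-library handler. This is a hypothetical sketch, not the CarolHandler source: the class TaskAwareHandler is invented here, reading LONGTASKID from the environment is an assumption, and an in-memory list stands in for sending logs to Carol.

```python
import logging
import os


class TaskAwareHandler(logging.Handler):
    """Hypothetical handler: keeps records for a task, else prints them."""

    def __init__(self, task_id=None, stream=None):
        super().__init__()
        # Assumption: the task ID may come from the LONGTASKID variable.
        self.task_id = task_id or os.environ.get("LONGTASKID")
        self.stream = stream  # None means sys.stdout when printing
        self.sent = []  # stands in for "sent to Carol" in this sketch

    def emit(self, record):
        message = self.format(record)
        if self.task_id:
            # A task ID is known: link the message to that task.
            self.sent.append((self.task_id, message))
        else:
            # Console fallback when no task ID is available.
            print(message, file=self.stream)


logger = logging.getLogger("carol_sketch")
logger.setLevel(logging.INFO)
logger.addHandler(TaskAwareHandler(task_id="task-123"))
logger.info("Hello Carol")
```

Keeping the fallback inside emit() means the calling code logs the same way locally and inside a long task.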

Calling Carol APIs

In addition to the high-level abstractions provided by pyCarol, it is also possible to call Carol APIs directly when needed. This is useful for endpoints that are not yet covered by specific SDK methods.

carol_app_id = CAROL_APP_ID  # placeholder: ID of the Carol app to subscribe to

carol.call_api(
    f"v1/tenantApps/subscribe/carolApps/{carol_app_id}",
    method="POST"
)
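When the app ID comes from user input or configuration, it can help to escape it before building the path. A minimal sketch, assuming the endpoint layout shown above; the helper subscribe_path is hypothetical and not part of pyCarol.

```python
from urllib.parse import quote


def subscribe_path(carol_app_id):
    """Build the subscribe endpoint path for one app ID."""
    # safe="" also escapes "/", so the ID always stays a single path segment.
    escaped = quote(str(carol_app_id), safe="")
    return f"v1/tenantApps/subscribe/carolApps/{escaped}"


# Usage with pyCarol (not executed in this sketch):
#   carol.call_api(subscribe_path(CAROL_APP_ID), method="POST")
```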

Settings

from pycarol.apps import Apps
Apps(carol).get_settings(app_name="my_app")

Useful Functions

from pycarol.functions import track_tasks
track_tasks(carol, ["task1", "task2"])

Release process

  1. Open PR to main

  2. Merge after approval

  3. Update README if needed

Made with ❤ at TOTVS IDeIA
