
Python SDK for Chalk

Chalk

Chalk enables innovative machine learning teams to focus on building the unique products and models that make their business stand out. Behind the scenes, Chalk handles data infrastructure with a best-in-class developer experience. Here's how it works:


Develop

Chalk makes it simple to develop feature pipelines for machine learning. Define Python functions using the libraries and tools you're familiar with instead of specialized DSLs. Chalk then orchestrates your functions into pipelines that execute in parallel on a Rust-based engine and coordinates the infrastructure required to compute features.

Features

To get started, define your features with Pydantic-inspired Python classes. You can define schemas, specify relationships, and add metadata to help your team share and re-use work.

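from datetime import date
from typing import Optional

from chalk.features import DataFrame, features, has_many

@features
class User:
    id: int
    full_name: str
    nickname: Optional[str]
    email: Optional[str]
    birthday: date
    credit_score: float
    datawarehouse_feature: float

    # Transaction is another @features class, defined elsewhere in the project.
    transactions: DataFrame[Transaction] = has_many(lambda: Transaction.user_id == User.id)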

Resolvers

Next, tell Chalk how to compute your features. Chalk ingests data from your existing data stores, and lets you use Python to compute features with feature resolvers. Feature resolvers are declared with the decorators @online and @offline, and can depend on the outputs of other feature resolvers.

Resolvers make it easy to rapidly integrate a wide variety of data sources, join them together, and use them in your model.

SQL

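from chalk import online
from chalk.features import Features
from chalk.sql import PostgreSQLSource

pg = PostgreSQLSource()

@online
def get_user(uid: User.id) -> Features[User.full_name, User.email]:
    # :id is a bound parameter, filled in from args.
    return pg.query_string(
        "select email, full_name from users where id=:id",
        args=dict(id=uid)
    ).one()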

REST

import requests

@online
def get_socure_score(uid: User.id) -> Features[User.socure_score]:
    # The JSON key must be the string "id"; a bare id would be Python's builtin function.
    return (
        requests.get("https://api.socure.com", json={
            "id": uid
        }).json()['socure_score']
    )

Execute

Once you've defined your features and resolvers, Chalk orchestrates them into flexible pipelines that make training and executing models easy.
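
To make the idea of orchestrating resolvers into a pipeline concrete, here is a minimal, standard-library sketch of dependency ordering. It is not Chalk's engine: the resolver decorator, the RESOLVERS registry, the run function, and the feature names are all hypothetical, and this version runs resolvers one after another rather than in parallel.

```python
from graphlib import TopologicalSorter

# Hypothetical registry: resolver name -> (function, input features, output features).
RESOLVERS = {}

def resolver(inputs, outputs):
    """Register a function together with the features it needs and produces."""
    def wrap(fn):
        RESOLVERS[fn.__name__] = (fn, tuple(inputs), tuple(outputs))
        return fn
    return wrap

@resolver(inputs=["user.id"], outputs=["user.email"])
def get_email(uid):
    return (f"user{uid}@example.com",)

@resolver(inputs=["user.email"], outputs=["user.email_domain"])
def get_domain(email):
    return (email.split("@")[1],)

def run(known):
    """Run every registered resolver in dependency order, starting from known values."""
    producer = {out: name for name, (_, _, outs) in RESOLVERS.items() for out in outs}
    # A resolver depends on whichever resolvers produce its inputs.
    graph = {
        name: {producer[i] for i in ins if i in producer}
        for name, (_, ins, _) in RESOLVERS.items()
    }
    values = dict(known)
    for name in TopologicalSorter(graph).static_order():
        fn, ins, outs = RESOLVERS[name]
        values.update(zip(outs, fn(*(values[i] for i in ins))))
    return values

result = run({"user.id": 7})
print(result["user.email_domain"])
```

Because each resolver only declares what it consumes and produces, the execution order falls out of the graph instead of being written by hand.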

Chalk has built-in support for feature engineering workflows -- no need to manage Airflow or orchestrate complicated streaming flows. You can execute resolver pipelines with declarative caching, ingest batch data on a schedule, and easily make slow sources available online for low-latency serving.

Caching

Many data sources (like vendor APIs) are too slow for online use cases and/or charge a high dollar cost per call. Chalk lets you optimize latency and cost by defining declarative caching policies that are integrated throughout the system. You no longer have to manage Redis, Memcached, or DynamoDB, or spend time tuning cache-warming pipelines.

Add a caching policy with one line of code in your feature definition:

@features
class ExternalBankAccount:
-   balance: int
+   balance: int = feature(max_staleness="1d")

Optionally warm feature caches by executing resolvers on a schedule:

@online(cron="1d")
def fn(id: User.id) -> User.credit_score:
  return redshift.query(...).all()

Or override staleness tolerances at query time when you need fresher data for your models:

ChalkClient().query(
    ...,
    output=[User.fraud_score],
    staleness={User.fraud_score: "1m"}
)
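
The interaction between a default staleness policy and a query-time override can be pictured with a small in-memory cache. This is an illustrative sketch only, not how Chalk stores values: StalenessCache, parse_duration, and slow_balance are hypothetical names, and the duration grammar (a number followed by s, m, h, or d) is an assumption based on the strings used above.

```python
import re
from datetime import datetime, timedelta

_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}

def parse_duration(text):
    """Turn strings such as "1d" or "10m" into a timedelta (assumed grammar)."""
    match = re.fullmatch(r"(\d+)([smhd])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return timedelta(**{_UNITS[match.group(2)]: int(match.group(1))})

class StalenessCache:
    """Serve a cached value only while it is no older than the staleness limit."""

    def __init__(self, compute, max_staleness):
        self.compute = compute
        self.default = parse_duration(max_staleness)
        self.entries = {}  # key -> (value, computed_at)

    def get(self, key, now, max_staleness=None):
        # A per-call limit, when given, replaces the default policy.
        limit = parse_duration(max_staleness) if max_staleness else self.default
        hit = self.entries.get(key)
        if hit is not None and now - hit[1] <= limit:
            return hit[0]
        value = self.compute(key)
        self.entries[key] = (value, now)
        return value

calls = []

def slow_balance(account_id):
    """Stand-in for an expensive vendor call; records each invocation."""
    calls.append(account_id)
    return 100 + len(calls)

cache = StalenessCache(slow_balance, max_staleness="1d")
t0 = datetime(2024, 1, 1)
first = cache.get("acct-1", now=t0)                        # computed
second = cache.get("acct-1", now=t0 + timedelta(hours=2))  # within 1d, served from cache
third = cache.get("acct-1", now=t0 + timedelta(hours=2), max_staleness="1m")  # too old for 1m
```

The third call recomputes even though the cached value is only two hours old, which is the behavior a tighter query-time tolerance asks for.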

Batch ETL ingestion

Chalk also makes it simple to generate training sets from data warehouse sources -- join data from services like S3, Redshift, BQ, Snowflake (or other custom sources) with historical features computed online. Specify a cron schedule on an @offline resolver and Chalk automatically ingests data with support for incremental reads:

@offline(cron="1h")
def fn() -> Features[User.id, User.datawarehouse_feature]:
  return redshift.query(...).incremental()

Chalk makes this data available for point-in-time-correct dataset generation for data science use-cases. Every pipeline has built-in monitoring and alerting to ensure data quality and timeliness.
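
One common way to implement incremental reads is a watermark: remember the newest timestamp already ingested and fetch only rows newer than it on the next run. The sketch below assumes that approach for illustration; the source does not say how Chalk tracks progress, and IncrementalReader and the WAREHOUSE rows are hypothetical.

```python
from datetime import datetime

# Hypothetical warehouse rows: (user_id, feature_value, updated_at).
WAREHOUSE = [
    (1, 0.10, datetime(2024, 1, 1, 0, 30)),
    (2, 0.20, datetime(2024, 1, 1, 1, 15)),
    (1, 0.15, datetime(2024, 1, 1, 2, 45)),
]

class IncrementalReader:
    """Remember the newest timestamp seen and return only newer rows next time."""

    def __init__(self):
        self.watermark = datetime.min

    def read(self, rows):
        fresh = [row for row in rows if row[2] > self.watermark]
        if fresh:
            self.watermark = max(row[2] for row in fresh)
        return fresh

reader = IncrementalReader()
first_run = reader.read(WAREHOUSE[:2])  # both rows are new
second_run = reader.read(WAREHOUSE)     # only the 02:45 row is new
third_run = reader.read(WAREHOUSE)      # nothing new since the last run
```

Each scheduled run then processes work proportional to what changed rather than to the size of the table.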

Reverse ETL

When your model needs to use features that are canonically stored in a high-latency data source (like a data warehouse), Chalk's Reverse ETL support makes it simple to bring those features online and serve them quickly.

Add a single line of code to an offline resolver, and Chalk constructs a managed reverse ETL pipeline for that data source:

@offline(offline_to_online_etl="5m")

Now data from slow offline data sources is automatically available for low-latency online serving.
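
Conceptually, a reverse ETL step copies the newest value per entity from the offline source into a fast key-value store. The following is a simplified sketch under that assumption, using a plain dict as the online store; sync_to_online and the sample rows are hypothetical, and a managed pipeline would run such a sync on the configured interval.

```python
from datetime import datetime

def sync_to_online(offline_rows, online_store):
    """Copy the newest value per key from offline rows into the online store."""
    for key, value, observed_at in offline_rows:
        current = online_store.get(key)
        # Only overwrite when the offline observation is newer than what is online.
        if current is None or observed_at > current[1]:
            online_store[key] = (value, observed_at)
    return online_store

offline_rows = [
    ("user:1", 0.10, datetime(2024, 1, 1, 0, 0)),
    ("user:2", 0.40, datetime(2024, 1, 1, 0, 5)),
    ("user:1", 0.25, datetime(2024, 1, 1, 0, 10)),
]
online_store = {}
sync_to_online(offline_rows, online_store)
latest_for_user_1 = online_store["user:1"][0]
```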


Deploy + query

Once you've defined your pipelines, you can rapidly deploy them to production with Chalk's CLI:

chalk apply

This creates a deployment of your project, which is served at a unique preview URL. You can promote this deployment to production, or perform QA workflows on your preview environment to make sure that your Chalk deployment performs as expected.

Once you promote your deployment to production, Chalk makes features available for low-latency online inference and offline training. Significantly, Chalk uses the exact same source code to serve temporally consistent training sets to data scientists and live feature values to models. This re-use ensures that feature values from online and offline contexts match, and it dramatically cuts development time.

Online inference

Chalk's online store and feature computation engine make it easy to query features with very low latency, so you can use your feature pipelines to serve online inference use-cases.

Integrating Chalk with your production application takes minutes via Chalk's simple REST API, called here through the Python client:

result = ChalkClient().query(
    input={
        User.full_name: "Katherine Johnson"
    },
    output=[User.fico_score],
    staleness={User.fico_score: "10m"},
)
result.get_feature_value(User.fico_score)

Features computed to serve online requests are also replicated to Chalk's offline store for historical performance tracking and training set generation.

Offline training

Data scientists can use Chalk's Jupyter integration to create datasets and train models. Datasets are stored and tracked so that they can be re-used by other modelers, and so that model provenance is tracked for audit and reproducibility.

X = ChalkClient().offline_query(
    input=labels[[User.uid, timestamp]],
    output=[
        User.returned_transactions_last_60,
        User.user_account_name_match_score,
        User.socure_score,
        User.identity.has_verified_phone,
        User.identity.is_voip_phone,
        User.identity.account_age_days,
        User.identity.email_age,
    ],
)

Chalk datasets are always "temporally consistent." This means that you can provide labels with different past timestamps and get historical features that represent what your application would have retrieved online at those past times. Temporal consistency ensures that your model training doesn't mix "future" and "past" data.
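
Temporal consistency amounts to an as-of lookup: for each label timestamp, take the latest feature value observed at or before that time and never a later one. The sketch below illustrates the idea with the standard library; CREDIT_SCORE_LOG, value_as_of, and the sample data are hypothetical and do not reflect Chalk's storage format.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical observation log for one feature, per user: (observed_at, value), sorted by time.
CREDIT_SCORE_LOG = {
    1: [
        (datetime(2024, 1, 1), 640.0),
        (datetime(2024, 2, 1), 655.0),
        (datetime(2024, 3, 1), 700.0),
    ],
}

def value_as_of(log, entity_id, as_of):
    """Return the latest value observed at or before as_of, or None if none exists."""
    history = log.get(entity_id, [])
    times = [observed_at for observed_at, _ in history]
    index = bisect_right(times, as_of)
    return history[index - 1][1] if index else None

# Labels carry different past timestamps for the same user.
labels = [
    (1, datetime(2024, 1, 15)),
    (1, datetime(2024, 2, 20)),
    (1, datetime(2023, 12, 1)),
]
training_rows = [(uid, ts, value_as_of(CREDIT_SCORE_LOG, uid, ts)) for uid, ts in labels]
```

The label dated before any observation gets no value at all, rather than borrowing the first value from its future.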



