Python SDK for Chalk

Chalk

Chalk enables innovative machine learning teams to focus on building the unique products and models that make their business stand out. Behind the scenes, Chalk handles data infrastructure with a best-in-class developer experience. Here's how it works:


Develop

Chalk makes it simple to develop feature pipelines for machine learning. Define Python functions using the libraries and tools you're familiar with instead of specialized DSLs. Chalk then orchestrates your functions into pipelines that execute in parallel on a Rust-based engine and coordinates the infrastructure required to compute features.

Features

To get started, define your features with Pydantic-inspired Python classes. You can define schemas, specify relationships, and add metadata to help your team share and re-use work.

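from datetime import date
from typing import Optional

from chalk.features import DataFrame, features, has_many


@features
class User:
    id: int
    full_name: str
    nickname: Optional[str]
    email: Optional[str]
    birthday: date
    credit_score: float
    datawarehouse_feature: float

    # Transaction is another @features class, defined elsewhere.
    transactions: DataFrame[Transaction] = has_many(lambda: Transaction.user_id == User.id)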
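
To make the relationship concrete, here is a plain-Python sketch of what a has_many join on Transaction.user_id == User.id amounts to. It does not use Chalk. The Transaction fields and the attach_transactions helper are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Transaction:
    id: int
    user_id: int
    amount: float


@dataclass
class User:
    id: int
    transactions: list = field(default_factory=list)


def attach_transactions(users, transactions):
    """Give each user the transactions whose user_id equals the user's id."""
    by_user = {}
    for txn in transactions:
        by_user.setdefault(txn.user_id, []).append(txn)
    for user in users:
        user.transactions = by_user.get(user.id, [])
    return users


users = attach_transactions(
    [User(1), User(2)],
    [Transaction(10, 1, 5.0), Transaction(11, 1, 7.5), Transaction(12, 3, 1.0)],
)
print([txn.id for txn in users[0].transactions])
```

User 1 ends up with two transactions, user 2 with none, and the transaction for the unknown user 3 is ignored.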

Resolvers

Next, tell Chalk how to compute your features. Chalk ingests data from your existing data stores, and lets you use Python to compute features with feature resolvers. Feature resolvers are declared with the decorators @online and @offline, and can depend on the outputs of other feature resolvers.

Resolvers make it easy to rapidly integrate a wide variety of data sources, join them together, and use them in your model.
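
The idea that resolvers can depend on the outputs of other resolvers can be sketched with the standard library alone. This is a toy illustration of dependency ordering, not Chalk's engine. The RESOLVERS table, the feature names, and the compute function are all hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical toy resolvers: feature name -> (input feature names, function).
RESOLVERS = {
    "full_name": (["id"], lambda uid: f"user-{uid}"),
    "name_length": (["full_name"], lambda full_name: len(full_name)),
    "is_long_name": (["name_length"], lambda name_length: name_length > 6),
}


def compute(outputs, known):
    """Run every resolver in dependency order and return the requested outputs."""
    values = dict(known)
    # Map each feature to the set of features it depends on.
    graph = {name: set(inputs) for name, (inputs, _) in RESOLVERS.items()}
    for name in TopologicalSorter(graph).static_order():
        if name in values:
            continue  # supplied by the caller, nothing to compute
        inputs, fn = RESOLVERS[name]
        values[name] = fn(*(values[i] for i in inputs))
    return {name: values[name] for name in outputs}


result = compute(["is_long_name"], {"id": 42})
print(result)
```

For simplicity this sketch runs every registered resolver rather than only those needed for the requested outputs, and it runs them sequentially rather than in parallel.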

SQL

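from chalk import online
from chalk.features import Features
from chalk.sql import PostgreSQLSource

pg = PostgreSQLSource()

@online
def get_user(uid: User.id) -> Features[User.full_name, User.email]:
    return pg.query_string(
        "select email, full_name from users where id=:id",
        args=dict(id=uid)
    ).one()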

REST

import requests

@online
def get_socure_score(uid: User.id) -> Features[User.socure_score]:
    # The key must be the string "id"; a bare `id` would refer to the builtin function.
    return (
        requests.get("https://api.socure.com", json={
            "id": uid
        }).json()['socure_score']
    )

Execute

Once you've defined your features and resolvers, Chalk orchestrates them into flexible pipelines that make training and executing models easy.

Chalk has built-in support for feature engineering workflows, so there is no need to manage Airflow or orchestrate complicated streaming flows. You can execute resolver pipelines with declarative caching, ingest batch data on a schedule, and easily make slow sources available online for low-latency serving.

Caching

Many data sources (like vendor APIs) are too slow for online use cases, charge a high cost per call, or both. Chalk lets you optimize latency and cost by defining declarative caching policies that are integrated throughout the system. You no longer have to manage Redis, Memcached, or DynamoDB, or spend time tuning cache-warming pipelines.

Add a caching policy with one line of code in your feature definition:

@features
class ExternalBankAccount:
-   balance: int
+   balance: int = feature(max_staleness="1d")

Optionally warm feature caches by executing resolvers on a schedule:

@online(cron="1d")
def fn(id: User.id) -> User.credit_score:
    return redshift.query(...).all()

Or override staleness tolerances at query time when you need fresher data for your models:

chalk.query(
    ...,
    output=[User.fraud_score],
    staleness={User.fraud_score: "1m"}
)
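
The caching behaviour described above (a default staleness tolerance per feature, with an optional tighter tolerance per query) can be illustrated with a small in-memory sketch. This is not how Chalk implements caching. StalenessCache and parse_duration are hypothetical names, and the duration parser handles only simple strings such as "1m" or "1d".

```python
import re

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_duration(text):
    """Convert a duration string such as "10m" or "1d" into seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


class StalenessCache:
    """Cache resolver results and reuse them while they are fresh enough."""

    def __init__(self, resolver, max_staleness):
        self.resolver = resolver
        self.default_max_age = parse_duration(max_staleness)
        self.entries = {}  # key -> (value, computed_at in seconds)
        self.calls = 0     # how many times the slow resolver actually ran

    def get(self, key, now, max_staleness=None):
        # A per-query tolerance overrides the default one.
        if max_staleness is None:
            max_age = self.default_max_age
        else:
            max_age = parse_duration(max_staleness)
        cached = self.entries.get(key)
        if cached is not None and now - cached[1] <= max_age:
            return cached[0]
        self.calls += 1
        value = self.resolver(key)
        self.entries[key] = (value, now)
        return value


cache = StalenessCache(lambda uid: uid * 10, max_staleness="1d")
cache.get(1, now=0)                           # miss: resolver runs
cache.get(1, now=3600)                        # hit: one hour old, within 1d
cache.get(1, now=3600, max_staleness="1m")    # too old for this query: resolver runs again
print(cache.calls)
```

Time is passed in explicitly as `now` so the behaviour is deterministic and easy to test.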

Batch ETL ingestion

Chalk also makes it simple to generate training sets from data warehouse sources: join data from services like S3, Redshift, BigQuery, and Snowflake (or other custom sources) with historical features computed online. Specify a cron schedule on an @offline resolver and Chalk automatically ingests data with support for incremental reads:

@offline(cron="1h")
def fn() -> Features[User.id, User.datawarehouse_feature]:
    return redshift.query(...).incremental()

Chalk makes this data available for point-in-time-correct dataset generation for data science use-cases. Every pipeline has built-in monitoring and alerting to ensure data quality and timeliness.
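
As a rough picture of what an incremental read means, the sketch below keeps a cursor (a high-water mark) and returns only rows newer than it on each run. It is an assumption-laden illustration in plain Python, not Chalk's ingestion logic. ROWS, the updated_at column, and read_incremental are hypothetical.

```python
from datetime import datetime

# Hypothetical warehouse rows with a last-modified timestamp.
ROWS = [
    {"user_id": 1, "value": 0.2, "updated_at": datetime(2024, 1, 1, 0, 0)},
    {"user_id": 2, "value": 0.5, "updated_at": datetime(2024, 1, 1, 1, 0)},
    {"user_id": 1, "value": 0.9, "updated_at": datetime(2024, 1, 1, 2, 0)},
]


def read_incremental(rows, cursor):
    """Return the rows newer than cursor, plus the advanced cursor."""
    fresh = [r for r in rows if cursor is None or r["updated_at"] > cursor]
    if fresh:
        cursor = max(r["updated_at"] for r in fresh)
    return fresh, cursor


# First scheduled run: only two rows exist yet, and there is no cursor.
first, cursor = read_incremental(ROWS[:2], None)
# Second run: a third row has arrived; only that row is read.
second, cursor = read_incremental(ROWS, cursor)
print(len(first), len(second))
```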

Reverse ETL

When your model needs to use features that are canonically stored in a high-latency data source (like a data warehouse), Chalk's Reverse ETL support makes it simple to bring those features online and serve them quickly.

Add a single line of code to an offline resolver, and Chalk constructs a managed reverse ETL pipeline for that data source:

@offline(offline_to_online_etl="5m")

Now data from slow offline data sources is automatically available for low-latency online serving.
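
Conceptually, a reverse ETL step copies the most recent offline value for each key into a store that is fast to read. The sketch below shows that idea with a dictionary standing in for the online store. It is not Chalk's pipeline, and sync_to_online and the row fields are hypothetical.

```python
def sync_to_online(offline_rows, online_store):
    """Copy the newest value per (user_id, feature) into the online store."""
    for row in offline_rows:
        key = (row["user_id"], row["feature"])
        current = online_store.get(key)
        # Only overwrite when the offline row is at least as recent.
        if current is None or row["observed_at"] >= current["observed_at"]:
            online_store[key] = {
                "value": row["value"],
                "observed_at": row["observed_at"],
            }
    return online_store


offline_rows = [
    {"user_id": 1, "feature": "datawarehouse_feature", "value": 0.9, "observed_at": 20},
    {"user_id": 1, "feature": "datawarehouse_feature", "value": 0.2, "observed_at": 10},
    {"user_id": 2, "feature": "datawarehouse_feature", "value": 0.5, "observed_at": 15},
]
online_store = sync_to_online(offline_rows, {})
print(online_store[(1, "datawarehouse_feature")]["value"])
```

Reads from online_store are then a single dictionary lookup, which stands in for a low-latency key-value read.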


Deploy + query

Once you've defined your pipelines, you can rapidly deploy them to production with Chalk's CLI:

chalk apply

This creates a deployment of your project, which is served at a unique preview URL. You can promote this deployment to production, or perform QA workflows on your preview environment to make sure that your Chalk deployment performs as expected.

Once you promote your deployment to production, Chalk makes features available for low-latency online inference and offline training. Significantly, Chalk uses the exact same source code to serve temporally consistent training sets to data scientists and live feature values to models. This reuse ensures that feature values from online and offline contexts match, and it dramatically cuts development time.

Online inference

Chalk's online store and feature computation engine make it easy to query features with very low latency, so you can use your feature pipelines to serve online inference use cases.

Integrating Chalk with your production application takes minutes via Chalk's simple REST API:

from chalk.client import ChalkClient

result = ChalkClient().query(
    input={
        User.name: "Katherine Johnson"
    },
    output=[User.fico_score],
    staleness={User.fico_score: "10m"},
)
result.get_feature_value(User.fico_score)

Features computed to serve online requests are also replicated to Chalk's offline store for historical performance tracking and training set generation.

Offline training

Data scientists can use Chalk's Jupyter integration to create datasets and train models. Datasets are stored and tracked so that they can be re-used by other modelers, and so that model provenance is tracked for audit and reproducibility.

# offline_query is called on a client instance, as with query above.
X = ChalkClient().offline_query(
    input=labels[[User.uid, timestamp]],
    output=[
        User.returned_transactions_last_60,
        User.user_account_name_match_score,
        User.socure_score,
        User.identity.has_verified_phone,
        User.identity.is_voip_phone,
        User.identity.account_age_days,
        User.identity.email_age,
    ],
)

Chalk datasets are always "temporally consistent." This means that you can provide labels with different past timestamps and get historical features that represent what your application would have retrieved online at those past times. Temporal consistency ensures that your model training doesn't mix "future" and "past" data.
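
A small sketch can make temporal consistency concrete: for each label timestamp, look up the latest feature value observed at or before that time, never a later one. This is a generic point-in-time lookup written with the standard library, not Chalk's implementation. HISTORY, value_as_of, and the integer timestamps are hypothetical.

```python
from bisect import bisect_right

# Hypothetical feature history: (observed_at, value) pairs sorted by time.
HISTORY = {
    ("user-1", "credit_score"): [(10, 640.0), (20, 655.0), (30, 700.0)],
}


def value_as_of(entity, feature, ts):
    """Return the latest value observed at or before ts, or None if there is none."""
    observations = HISTORY.get((entity, feature), [])
    times = [observed_at for observed_at, _ in observations]
    idx = bisect_right(times, ts)  # number of observations with time <= ts
    return observations[idx - 1][1] if idx else None


# Labels carry different past timestamps; each row sees only its own past.
labels = [("user-1", 15), ("user-1", 30), ("user-1", 5)]
rows = [(entity, ts, value_as_of(entity, "credit_score", ts)) for entity, ts in labels]
print(rows)
```

The label at time 15 gets the value recorded at time 10, not the later values, and the label at time 5 gets no value because nothing had been observed yet.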


