Python SDK for Chalk

Chalk

Chalk enables innovative machine learning teams to focus on building the unique products and models that make their business stand out. Behind the scenes, Chalk handles data infrastructure with a best-in-class developer experience. Here's how it works:


Develop

Chalk makes it simple to develop feature pipelines for machine learning. Define Python functions using the libraries and tools you're familiar with instead of specialized DSLs. Chalk then orchestrates your functions into pipelines that execute in parallel on a Rust-based engine and coordinates the infrastructure required to compute features.

Features

To get started, define your features with Pydantic-inspired Python classes. You can define schemas, specify relationships, and add metadata to help your team share and re-use work.

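from datetime import date
from typing import Optional

from chalk.features import DataFrame, features, has_many

@features
class User:
    id: int
    full_name: str
    nickname: Optional[str]
    email: Optional[str]
    birthday: date
    credit_score: float
    datawarehouse_feature: float

    # Transaction is another @features class, defined elsewhere.
    transactions: DataFrame[Transaction] = has_many(lambda: Transaction.user_id == User.id)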

Resolvers

Next, tell Chalk how to compute your features. Chalk ingests data from your existing data stores, and lets you use Python to compute features with feature resolvers. Feature resolvers are declared with the decorators @online and @offline, and can depend on the outputs of other feature resolvers.

Resolvers make it easy to rapidly integrate a wide variety of data sources, join them together, and use them in your model.

SQL

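from chalk import online
from chalk.features import Features
from chalk.sql import PostgreSQLSource

pg = PostgreSQLSource()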

@online
def get_user(uid: User.id) -> Features[User.full_name, User.email]:
    return pg.query_string(
        "select email, full_name from users where id=:id",
        args=dict(id=uid)
    ).one()

REST

import requests

@online
def get_socure_score(uid: User.id) -> Features[User.socure_score]:
    return (
        requests.get("https://api.socure.com", json={
            "id": uid  # quoted key; a bare `id` would be the Python builtin
        }).json()['socure_score']
    )
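
Because resolvers can depend on the outputs of other resolvers, computing a feature amounts to walking a dependency graph. The following is a minimal standard-library sketch of that idea, not Chalk's engine: the `RESOLVERS` table, the `resolve` helper, and the feature names are all hypothetical, and it runs sequentially rather than in parallel.

```python
from graphlib import TopologicalSorter

# Hypothetical toy resolvers: feature name -> (input feature names, function).
RESOLVERS = {
    "full_name": (["id"], lambda uid: f"user-{uid}"),
    "name_length": (["full_name"], lambda full_name: len(full_name)),
    "is_long_name": (["name_length"], lambda name_length: name_length > 6),
}

def resolve(known, wanted):
    """Compute `wanted` features from `known` inputs in dependency order."""
    graph = {name: set(deps) for name, (deps, _) in RESOLVERS.items()}
    values = dict(known)
    # static_order() yields each feature only after all of its inputs.
    # For simplicity this sketch runs every resolver, not just the needed ones.
    for name in TopologicalSorter(graph).static_order():
        if name in values or name not in RESOLVERS:
            continue
        deps, fn = RESOLVERS[name]
        values[name] = fn(*(values[d] for d in deps))
    return {name: values[name] for name in wanted}

result = resolve({"id": 42}, ["full_name", "is_long_name"])
```

Here `is_long_name` is never given directly; it is derived by chaining `id` to `full_name` to `name_length`, which mirrors how one resolver's output feeds another.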

Execute

Once you've defined your features and resolvers, Chalk orchestrates them into flexible pipelines that make training and executing models easy.

Chalk has built-in support for feature engineering workflows -- no need to manage Airflow or orchestrate complicated streaming flows. You can execute resolver pipelines with declarative caching, ingest batch data on a schedule, and easily make slow sources available online for low-latency serving.

Caching

Many data sources (like vendor APIs) are too slow for online use cases and/or charge a high dollar cost-per-call. Chalk lets you optimize latency and cost by defining declarative caching policies which are well-integrated throughout the system. You no longer have to manage Redis, Memcached, or DynamoDB, or spend time tuning cache-warming pipelines.

Add a caching policy with one line of code in your feature definition:

@features
class ExternalBankAccount:
-   balance: int
+   balance: int = feature(max_staleness="1d")

Optionally warm feature caches by executing resolvers on a schedule:

@online(cron="1d")
def fn(id: User.id) -> User.credit_score:
  return redshift.query(...).all()

Or override staleness tolerances at query time when you need fresher data for your models:

chalk.query(
    ...,
    output=[User.fraud_score],
    staleness={User.fraud_score: "1m"}
)
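
To make the staleness semantics concrete, here is a small standard-library sketch of a cache with a default tolerance and a per-call override. It is an illustration only, not how Chalk implements caching: `StalenessCache`, `parse_duration`, and the reading of the suffixes `s`, `m`, `h`, `d` as seconds, minutes, hours, and days are assumptions.

```python
import time

_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}  # assumed suffix meanings

def parse_duration(text):
    """Convert a duration string such as "10m" or "1d" into seconds."""
    return int(text[:-1]) * _UNITS[text[-1]]

class StalenessCache:
    """Serve a cached value unless it is older than the allowed staleness."""

    def __init__(self, compute, default_max_staleness, clock=time.time):
        self._compute = compute
        self._default = parse_duration(default_max_staleness)
        self._clock = clock
        self._entries = {}  # key -> (value, computed_at)

    def get(self, key, max_staleness=None):
        tolerance = (self._default if max_staleness is None
                     else parse_duration(max_staleness))
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[1] <= tolerance:
            return entry[0]  # fresh enough: skip the slow source
        value = self._compute(key)
        self._entries[key] = (value, now)
        return value

calls = []
fake_now = [0.0]  # controllable clock so the example is deterministic

def slow_balance(account_id):
    calls.append(account_id)
    return 100 + len(calls)

cache = StalenessCache(slow_balance, "1d", clock=lambda: fake_now[0])
first = cache.get("acct-1")                      # miss: calls the source
fake_now[0] = 3600.0                             # one hour later
second = cache.get("acct-1")                     # within 1d: served from cache
third = cache.get("acct-1", max_staleness="1m")  # stricter tolerance: recompute
```

The third call recomputes even though the entry is well inside the default one-day tolerance, which is the effect of overriding staleness at query time.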

Batch ETL ingestion

Chalk also makes it simple to generate training sets from data warehouse sources -- join data from services like S3, Redshift, BigQuery, Snowflake (or other custom sources) with historical features computed online. Specify a cron schedule on an @offline resolver and Chalk automatically ingests data with support for incremental reads:

@offline(cron="1h")
def fn() -> Features[User.id, User.datawarehouse_feature]:
  return redshift.query(...).incremental()

Chalk makes this data available for point-in-time-correct dataset generation for data science use-cases. Every pipeline has built-in monitoring and alerting to ensure data quality and timeliness.
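
An incremental read usually means remembering how far the previous run got and fetching only newer rows. The sketch below shows that high-water-mark pattern with plain Python; it assumes nothing about Chalk's internals, and `WAREHOUSE`, `ingest_incremental`, and the row layout are hypothetical.

```python
from datetime import datetime

# Hypothetical warehouse rows: (user_id, feature_value, updated_at).
WAREHOUSE = [
    (1, 0.25, datetime(2024, 1, 1, 0, 0)),
    (2, 0.50, datetime(2024, 1, 1, 1, 0)),
    (1, 0.75, datetime(2024, 1, 1, 2, 0)),
]

def ingest_incremental(rows, store, high_water_mark):
    """Append rows newer than high_water_mark to store; return the new mark."""
    new_mark = high_water_mark
    for user_id, value, updated_at in rows:
        if high_water_mark is None or updated_at > high_water_mark:
            store.append((user_id, value, updated_at))
            if new_mark is None or updated_at > new_mark:
                new_mark = updated_at
    return new_mark

store = []
# First scheduled run: only two rows exist yet.
mark = ingest_incremental(WAREHOUSE[:2], store, None)
# Later run: the source has grown, but only the unseen row is copied.
mark = ingest_incremental(WAREHOUSE, store, mark)
```

Each run does work proportional to the new rows it keeps rather than re-copying the whole table, and the stored timestamps are what later make point-in-time lookups possible.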

Reverse ETL

When your model needs to use features that are canonically stored in a high-latency data source (like a data warehouse), Chalk's Reverse ETL support makes it simple to bring those features online and serve them quickly.

Add a single line of code to an offline resolver, and Chalk constructs a managed reverse ETL pipeline for that data source:

@offline(offline_to_online_etl="5m")

Now data from slow offline data sources is automatically available for low-latency online serving.
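
Conceptually, a reverse ETL job copies the newest offline value for each key into a fast key-value store. A minimal sketch of that latest-wins sync follows, with a plain dict standing in for the online store; `sync_to_online` and the row shape are hypothetical and do not describe Chalk's managed pipeline.

```python
def sync_to_online(offline_rows, online_store):
    """Keep only the newest offline value per key in the online store."""
    for key, value, observed_at in offline_rows:
        current = online_store.get(key)
        if current is None or observed_at >= current[1]:
            online_store[key] = (value, observed_at)
    return online_store

# Hypothetical offline rows: (key, value, observed_at).
offline_rows = [
    ("user-1", 0.2, 100),
    ("user-2", 0.9, 105),
    ("user-1", 0.4, 110),
]
online = sync_to_online(offline_rows, {})
```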


Deploy + query

Once you've defined your pipelines, you can rapidly deploy them to production with Chalk's CLI:

chalk apply

This creates a deployment of your project, which is served at a unique preview URL. You can promote this deployment to production, or perform QA workflows on your preview environment to make sure that your Chalk deployment performs as expected.

Once you promote your deployment to production, Chalk makes features available for low-latency online inference and offline training. Significantly, Chalk uses the exact same source code to serve temporally-consistent training sets to data scientists and live feature values to models. This re-use ensures that feature values from online and offline contexts match and dramatically cuts development time.

Online inference

Chalk's online store and feature computation engine make it easy to query features with very low latency, so you can use your feature pipelines to serve online inference use cases.

Integrating Chalk with your production application takes minutes via Chalk's REST API, shown here through the Python client:

result = ChalkClient().query(
    input={
        User.name: "Katherine Johnson"
    },
    output=[User.fico_score],
    staleness={User.fico_score: "10m"},
)
result.get_feature_value(User.fico_score)

Features computed to serve online requests are also replicated to Chalk's offline store for historical performance tracking and training set generation.

Offline training

Data scientists can use Chalk's Jupyter integration to create datasets and train models. Datasets are stored and tracked so that they can be re-used by other modelers, and so that model provenance is tracked for audit and reproducibility.

X = ChalkClient().offline_query(
    input=labels[[User.uid, timestamp]],
    output=[
        User.returned_transactions_last_60,
        User.user_account_name_match_score,
        User.socure_score,
        User.identity.has_verified_phone,
        User.identity.is_voip_phone,
        User.identity.account_age_days,
        User.identity.email_age,
    ],
)

Chalk datasets are always "temporally consistent." This means that you can provide labels with different past timestamps and get historical features that represent what your application would have retrieved online at those past times. Temporal consistency ensures that your model training doesn't mix "future" and "past" data.
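
The core of temporal consistency is an "as of" lookup: for each label timestamp, take the latest feature value observed at or before that time and never a later one. The sketch below shows this with the standard library; the history data, `value_as_of`, and the integer timestamps are hypothetical, and it is not Chalk's implementation.

```python
from bisect import bisect_right

# Hypothetical observation log for one feature: (timestamp, value), sorted by time.
CREDIT_SCORE_HISTORY = [(10, 640.0), (20, 655.0), (30, 700.0)]

def value_as_of(history, ts):
    """Return the latest value observed at or before ts, or None if none exists."""
    timestamps = [t for t, _ in history]
    idx = bisect_right(timestamps, ts)  # count of observations with time <= ts
    return history[idx - 1][1] if idx else None

labels = [(1, 25), (1, 5), (1, 30)]  # (user_id, label timestamp)
training_rows = [
    (uid, ts, value_as_of(CREDIT_SCORE_HISTORY, ts)) for uid, ts in labels
]
```

The label at time 25 gets the value recorded at time 20, not the later value from time 30, and the label at time 5 gets `None` because nothing had been observed yet.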
