Python SDK for Chalk

Chalk

Chalk enables innovative machine learning teams to focus on building the unique products and models that make their business stand out. Behind the scenes, Chalk handles data infrastructure with a best-in-class developer experience. Here's how it works:


Develop

Chalk makes it simple to develop feature pipelines for machine learning. Define Python functions using the libraries and tools you're familiar with instead of specialized DSLs. Chalk then orchestrates your functions into pipelines that execute in parallel on a Rust-based engine and coordinates the infrastructure required to compute features.

Features

To get started, define your features with Pydantic-inspired Python classes. You can define schemas, specify relationships, and add metadata to help your team share and re-use work.

@features
class User:
    id: int
    full_name: str
    nickname: Optional[str]
    email: Optional[str]
    birthday: date
    credit_score: float
    datawarehouse_feature: float

    transactions: DataFrame[Transaction] = has_many(lambda: Transaction.user_id == User.id)

Resolvers

Next, tell Chalk how to compute your features. Chalk ingests data from your existing data stores, and lets you use Python to compute features with feature resolvers. Feature resolvers are declared with the decorators @online and @offline, and can depend on the outputs of other feature resolvers.

Resolvers make it easy to rapidly integrate a wide variety of data sources, join them together, and use them in your model.

SQL

pg = PostgreSQLSource()

@online
def get_user(uid: User.id) -> Features[User.full_name, User.email]:
    return pg.query_string(
        "select email, full_name from users where id=:id",
        args=dict(id=uid)
    ).one()

REST

import requests

@online
def get_socure_score(uid: User.id) -> Features[User.socure_score]:
    return (
        requests.get("https://api.socure.com", json={
            "id": uid
        }).json()['socure_score']
    )
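
To illustrate how resolvers that depend on each other's outputs can be ordered into a pipeline, here is a minimal standard-library sketch. It is not Chalk's engine: the `RESOLVERS` registry, the feature names, and the `resolve` helper are all hypothetical, and the sketch runs sequentially rather than in parallel.

```python
from graphlib import TopologicalSorter

# Hypothetical registry: output feature -> (input features, function).
# The names and the registry shape are illustrative, not part of Chalk's API.
RESOLVERS = {
    "user.email": (["user.id"], lambda v: f"user{v['user.id']}@example.com"),
    "user.email_domain": (["user.email"], lambda v: v["user.email"].split("@")[1]),
    "user.is_internal": (
        ["user.email_domain"],
        lambda v: v["user.email_domain"] == "example.com",
    ),
}


def resolve(requested, known):
    """Compute `requested` features from `known` inputs in dependency order."""
    values = dict(known)

    # Walk backwards from the requested outputs to find the resolvers we need.
    graph = {}
    stack = list(requested)
    while stack:
        feature = stack.pop()
        if feature in values or feature in graph:
            continue
        inputs, _ = RESOLVERS[feature]
        graph[feature] = set(inputs)
        stack.extend(inputs)

    # static_order() yields every feature after all of its dependencies.
    for feature in TopologicalSorter(graph).static_order():
        if feature in values:
            continue  # supplied by the caller, nothing to compute
        _, fn = RESOLVERS[feature]
        values[feature] = fn(values)

    return {feature: values[feature] for feature in requested}


result = resolve(["user.is_internal"], {"user.id": 7})
```

Only the resolvers reachable from the requested outputs are executed, which is the property that lets a caller ask for a single feature without naming the intermediate steps.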

Execute

Once you've defined your features and resolvers, Chalk orchestrates them into flexible pipelines that make training and executing models easy.

Chalk has built-in support for feature engineering workflows -- no need to manage Airflow or orchestrate complicated streaming flows. You can execute resolver pipelines with declarative caching, ingest batch data on a schedule, and easily make slow sources available online for low-latency serving.

Caching

Many data sources (like vendor APIs) are too slow for online use cases or charge a high cost per call. Chalk lets you optimize latency and cost by defining declarative caching policies that are integrated throughout the system. You no longer have to manage Redis, Memcached, or DynamoDB, or spend time tuning cache-warming pipelines.

Add a caching policy with one line of code in your feature definition:

@features
class ExternalBankAccount:
-   balance: int
+   balance: int = feature(max_staleness="1d")

Optionally warm feature caches by executing resolvers on a schedule:

@online(cron="1d")
def fn(id: User.id) -> User.credit_score:
  return redshift.query(...).all()

Or override staleness tolerances at query time when you need fresher data for your models:

chalk.query(
    ...,
    output=[User.fraud_score],
    staleness={User.fraud_score: "1m"}
)
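
The interaction between a default staleness policy and a per-query override can be sketched with a small in-memory cache. This is a conceptual illustration only, not how Chalk stores cached features: `StalenessCache`, `parse_duration`, and `slow_balance` are hypothetical names, and the duration format (an integer followed by `s`, `m`, `h`, or `d`) is an assumption based on the strings used above.

```python
from datetime import datetime, timedelta

_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_duration(text):
    """Turn a string such as "10m" or "1d" into a timedelta (assumed format)."""
    return timedelta(**{_UNITS[text[-1]]: int(text[:-1])})


class StalenessCache:
    """Keeps one value per key and recomputes it once it is older than allowed."""

    def __init__(self, compute, max_staleness):
        self._compute = compute
        self._default = parse_duration(max_staleness)
        self._entries = {}  # key -> (value, computed_at)

    def get(self, key, now, max_staleness=None):
        # A per-call tolerance, when given, takes precedence over the default.
        limit = parse_duration(max_staleness) if max_staleness else self._default
        entry = self._entries.get(key)
        if entry is not None and now - entry[1] <= limit:
            return entry[0]
        value = self._compute(key)
        self._entries[key] = (value, now)
        return value


calls = []


def slow_balance(account_id):
    """Stand-in for an expensive vendor call; records each invocation."""
    calls.append(account_id)
    return 100 * len(calls)


cache = StalenessCache(slow_balance, max_staleness="1d")
t0 = datetime(2024, 1, 1, 12, 0)
first = cache.get("acct-1", now=t0)                        # computed
second = cache.get("acct-1", now=t0 + timedelta(hours=2))  # served from cache
third = cache.get("acct-1", now=t0 + timedelta(hours=2), max_staleness="1m")  # recomputed
```

The second read is within the one-day default and reuses the cached value, while the third read tightens the tolerance to one minute and forces a fresh computation.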

Batch ETL ingestion

Chalk also makes it simple to generate training sets from data warehouse sources -- join data from services like S3, Redshift, BQ, Snowflake (or other custom sources) with historical features computed online. Specify a cron schedule on an @offline resolver and Chalk automatically ingests data with support for incremental reads:

@offline(cron="1h")
def fn() -> Features[User.id, User.datawarehouse_feature]:
  return redshift.query(...).incremental()

Chalk makes this data available for point-in-time-correct dataset generation for data science use-cases. Every pipeline has built-in monitoring and alerting to ensure data quality and timeliness.
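
One common way to implement the incremental reads mentioned above is a high-water mark: remember the newest timestamp ingested so far and fetch only rows beyond it on the next run. The sketch below assumes that approach for illustration; it does not describe Chalk's internals, and `IncrementalReader` and the `updated_at` column are hypothetical.

```python
from datetime import datetime


class IncrementalReader:
    """Returns only rows newer than the newest row seen on previous runs."""

    def __init__(self, timestamp_column):
        self._column = timestamp_column
        self._high_water_mark = None  # newest timestamp ingested so far

    def read_new(self, rows):
        if self._high_water_mark is None:
            fresh = list(rows)  # first run: ingest everything
        else:
            fresh = [r for r in rows if r[self._column] > self._high_water_mark]
        if fresh:
            self._high_water_mark = max(r[self._column] for r in fresh)
        return fresh


reader = IncrementalReader("updated_at")
table = [
    {"user_id": 1, "updated_at": datetime(2024, 1, 1, 9)},
    {"user_id": 2, "updated_at": datetime(2024, 1, 1, 10)},
]
first_run = reader.read_new(table)   # both rows
table.append({"user_id": 3, "updated_at": datetime(2024, 1, 1, 11)})
second_run = reader.read_new(table)  # only the appended row
third_run = reader.read_new(table)   # nothing new
```

A strict greater-than comparison keeps the sketch short, but it would skip a late-arriving row whose timestamp equals the current mark. A production pipeline typically needs a lookback window or a tie-breaking key.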

Reverse ETL

When your model needs to use features that are canonically stored in a high-latency data source (like a data warehouse), Chalk's Reverse ETL support makes it simple to bring those features online and serve them quickly.

Add a single line of code to an offline resolver, and Chalk constructs a managed reverse ETL pipeline for that data source:

@offline(offline_to_online_etl="5m")

Now data from slow offline data sources is automatically available for low-latency online serving.


Deploy + query

Once you've defined your pipelines, you can rapidly deploy them to production with Chalk's CLI:

chalk apply

This creates a deployment of your project, which is served at a unique preview URL. You can promote this deployment to production, or perform QA workflows on your preview environment to make sure that your Chalk deployment performs as expected.

Once you promote your deployment to production, Chalk makes features available for low-latency online inference and offline training. Significantly, Chalk uses the exact same source code to serve temporally-consistent training sets to data scientists and live feature values to models. This re-use ensures that feature values from online and offline contexts match and dramatically cuts development time.

Online inference

Chalk's online store and feature computation engine make it easy to query features with very low latency, so you can use your feature pipelines to serve online inference use cases.

Integrating Chalk with your production application takes minutes via Chalk's simple REST API:

result = ChalkClient().query(
    input={
        User.name: "Katherine Johnson"
    },
    output=[User.fico_score],
    staleness={User.fico_score: "10m"},
)
result.get_feature_value(User.fico_score)

Features computed to serve online requests are also replicated to Chalk's offline store for historical performance tracking and training set generation.

Offline training

Data scientists can use Chalk's Jupyter integration to create datasets and train models. Datasets are stored and tracked so that they can be re-used by other modelers, and so that model provenance is tracked for audit and reproducibility.

X = ChalkClient().offline_query(
    input=labels[[User.uid, timestamp]],
    output=[
        User.returned_transactions_last_60,
        User.user_account_name_match_score,
        User.socure_score,
        User.identity.has_verified_phone,
        User.identity.is_voip_phone,
        User.identity.account_age_days,
        User.identity.email_age,
    ],
)

Chalk datasets are always "temporally consistent." This means that you can provide labels with different past timestamps and get historical features that represent what your application would have retrieved online at those past times. Temporal consistency ensures that your model training doesn't mix "future" and "past" data.
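
The idea of temporal consistency can be made concrete with a point-in-time lookup: for each label timestamp, take the most recent feature value observed at or before that time, and never a later one. The following is a minimal sketch of that lookup using only the standard library. The `value_as_of` helper and the sample history are hypothetical and say nothing about how Chalk's offline store is implemented.

```python
from bisect import bisect_right
from datetime import datetime


def value_as_of(history, ts):
    """Return the latest value observed at or before `ts`, or None if none exists.

    `history` is a list of (observed_at, value) pairs sorted by observed_at.
    """
    times = [observed_at for observed_at, _ in history]
    index = bisect_right(times, ts)  # number of observations with time <= ts
    return history[index - 1][1] if index else None


credit_score_history = [
    (datetime(2024, 1, 1), 640),
    (datetime(2024, 3, 1), 700),
    (datetime(2024, 6, 1), 720),
]

# Label timestamps: before any observation, between two, and exactly on one.
labels = [datetime(2023, 12, 1), datetime(2024, 2, 15), datetime(2024, 6, 1)]
training_rows = [(ts, value_as_of(credit_score_history, ts)) for ts in labels]
```

The label dated 2024-02-15 receives 640 rather than the later 700, so the training row contains only what would have been known at that moment.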
