Python SDK for Chalk

Chalk

Chalk enables innovative machine learning teams to focus on building the unique products and models that make their business stand out. Behind the scenes, Chalk seamlessly handles data infrastructure with a best-in-class developer experience. Here's how it works:


Develop

Chalk makes it simple to develop feature pipelines for machine learning. Define Python functions using the libraries and tools you're familiar with instead of specialized DSLs. Chalk then orchestrates your functions into pipelines that execute in parallel on a Rust-based engine and coordinates the infrastructure required to compute features.

Features

To get started, define your features with Pydantic-inspired Python classes. You can define schemas, specify relationships, and add metadata to help your team share and re-use work.

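from datetime import date
from typing import Optional
from chalk.features import DataFrame, features, has_many

# Minimal Transaction class so the has_many join below resolves.
@features
class Transaction:
    id: int
    user_id: int

@features
class User:
    id: int
    full_name: str
    nickname: Optional[str]
    email: Optional[str]
    birthday: date
    credit_score: float
    datawarehouse_feature: float

    transactions: DataFrame[Transaction] = has_many(lambda: Transaction.user_id == User.id)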

Resolvers

Next, tell Chalk how to compute your features. Chalk ingests data from your existing data stores, and lets you use Python to compute features with feature resolvers. Feature resolvers are declared with the decorators @online and @offline, and can depend on the outputs of other feature resolvers.

Resolvers make it easy to rapidly integrate a wide variety of data sources, join them together, and use them in your model.
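
To make the idea of resolvers depending on each other's outputs concrete, here is a minimal sketch in plain Python. It is not Chalk's engine: the `RESOLVERS` registry, the feature names, and the `plan` and `run` helpers are all hypothetical, and the sketch runs resolvers one at a time rather than in parallel.

```python
from graphlib import TopologicalSorter

# Hypothetical registry: output feature -> (input features, function).
RESOLVERS = {
    "user.full_name": (["user.id"], lambda uid: f"user-{uid}"),
    "user.name_length": (["user.full_name"], lambda name: len(name)),
    "user.is_long_name": (["user.name_length"], lambda n: n > 6),
}

def plan(outputs):
    """Return the resolvers to run, ordered so dependencies come first."""
    graph = {}
    stack = list(outputs)
    while stack:
        feature = stack.pop()
        if feature in graph or feature not in RESOLVERS:
            continue  # already planned, or supplied directly as a query input
        deps = RESOLVERS[feature][0]
        graph[feature] = set(deps)
        stack.extend(deps)
    order = TopologicalSorter(graph).static_order()
    return [f for f in order if f in RESOLVERS]

def run(inputs, outputs):
    """Compute the requested outputs from the given input values."""
    values = dict(inputs)
    for feature in plan(outputs):
        deps, fn = RESOLVERS[feature]
        values[feature] = fn(*(values[d] for d in deps))
    return {f: values[f] for f in outputs}

result = run({"user.id": 42}, ["user.is_long_name"])
```

Only the resolvers needed for the requested output are planned, which is the property that lets a caller ask for a single feature without naming the intermediate steps.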

SQL

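from chalk import online
from chalk.features import Features
from chalk.sql import PostgreSQLSource

pg = PostgreSQLSource()

@online
def get_user(uid: User.id) -> Features[User.full_name, User.email]:
    # Look up a single user row by primary key.
    return pg.query_string(
        "select email, full_name from users where id=:id",
        args=dict(id=uid),
    ).one()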

REST

import requests

@online
def get_socure_score(uid: User.id) -> Features[User.socure_score]:
    # The key must be the string "id"; a bare `id` is Python's builtin function.
    response = requests.get("https://api.socure.com", json={"id": uid})
    return response.json()["socure_score"]

Execute

Once you've defined your features and resolvers, Chalk orchestrates them into flexible pipelines that make training and executing models easy.

Chalk has built-in support for feature engineering workflows -- no need to manage Airflow or orchestrate complicated streaming flows. You can execute resolver pipelines with declarative caching, ingest batch data on a schedule, and easily make slow sources available online for low-latency serving.

Caching

Many data sources (like vendor APIs) are too slow for online use cases, charge a high cost per call, or both. Chalk lets you optimize latency and cost by defining declarative caching policies that are integrated throughout the system. You no longer have to manage Redis, Memcached, or DynamoDB, or spend time tuning cache-warming pipelines.

Add a caching policy with one line of code in your feature definition:

@features
class ExternalBankAccount:
-   balance: int
+   balance: int = feature(max_staleness="1d")

Optionally warm feature caches by executing resolvers on a schedule:

@online(cron="1d")
def fn(id: User.id) -> User.credit_score:
    return redshift.query(...).all()

Or override staleness tolerances at query time when you need fresher data for your models:

ChalkClient().query(
    ...,
    output=[User.fraud_score],
    staleness={User.fraud_score: "1m"},
)
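
The interaction between a default staleness policy and a query-time override can be illustrated with a small standalone sketch. It is not how Chalk implements caching: `StalenessCache` and `parse_duration` are hypothetical, and the duration parser assumes only the `s`, `m`, `h`, and `d` suffixes seen in the examples above.

```python
import re
import time

_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}

def parse_duration(text):
    """Convert strings such as "10m" or "1d" into seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]

class StalenessCache:
    """Illustrative cache keyed by (feature name, entity id)."""

    def __init__(self, default_staleness, clock=time.time):
        self.default = parse_duration(default_staleness)
        self.clock = clock
        self.store = {}  # key -> (value, computed_at)
        self.calls = 0   # how many times a slow resolver actually ran

    def get(self, key, resolver, max_staleness=None):
        limit = self.default if max_staleness is None else parse_duration(max_staleness)
        now = self.clock()
        hit = self.store.get(key)
        if hit is not None and now - hit[1] <= limit:
            return hit[0]  # cached value is fresh enough
        self.calls += 1
        value = resolver()
        self.store[key] = (value, now)
        return value

# A fake clock keeps the example deterministic.
now = [0.0]
cache = StalenessCache("1d", clock=lambda: now[0])
key = ("balance", 1)
first = cache.get(key, lambda: 100)                        # miss: resolver runs
now[0] = 3600.0                                            # one hour later
second = cache.get(key, lambda: 200)                       # hit: 1h old is within 1d
third = cache.get(key, lambda: 300, max_staleness="1m")    # 1h old exceeds 1m: recompute
```

The third call shows the override: the cached value is still valid under the one-day default, but the caller's tighter tolerance forces the resolver to run again.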

Batch ETL ingestion

Chalk also makes it simple to generate training sets from data warehouse sources: join data from services like S3, Redshift, BigQuery, and Snowflake (or other custom sources) with historical features computed online. Specify a cron schedule on an @offline resolver and Chalk automatically ingests data with support for incremental reads:

@offline(cron="1h")
def fn() -> DataFrame[User.id, User.datawarehouse_feature]:
    # The query returns many rows, so the output type is a DataFrame.
    return redshift.query(...).incremental()

Chalk makes this data available for point-in-time-correct dataset generation for data science use cases. Every pipeline has built-in monitoring and alerting to ensure data quality and timeliness.
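
One common way to implement incremental reads is to remember a high-water mark and fetch only rows newer than it on each run. The sketch below shows that idea in plain Python; `IncrementalReader` and the `updated_at` column are assumptions made for illustration, not a description of what `.incremental()` does internally.

```python
from datetime import datetime

class IncrementalReader:
    """Remembers the newest timestamp seen and returns only newer rows."""

    def __init__(self):
        self.high_water_mark = None

    def read(self, rows):
        """rows: iterable of dicts carrying an 'updated_at' datetime."""
        mark = self.high_water_mark
        fresh = [r for r in rows if mark is None or r["updated_at"] > mark]
        if fresh:
            self.high_water_mark = max(r["updated_at"] for r in fresh)
        return fresh

table = [
    {"id": 1, "updated_at": datetime(2024, 1, 1, 9)},
    {"id": 2, "updated_at": datetime(2024, 1, 1, 10)},
]
reader = IncrementalReader()
first_run = reader.read(table)    # nothing seen yet, so both rows are returned
table.append({"id": 3, "updated_at": datetime(2024, 1, 1, 11)})
second_run = reader.read(table)   # only the row added since the last run
```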

Reverse ETL

When your model needs to use features that are canonically stored in a high-latency data source (like a data warehouse), Chalk's Reverse ETL support makes it simple to bring those features online and serve them quickly.

Add a single line of code to an offline resolver, and Chalk constructs a managed reverse ETL pipeline for that data source:

@offline(offline_to_online_etl="5m")

Now data from slow offline data sources is automatically available for low-latency online serving.
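
Conceptually, a reverse ETL step copies the newest known value of each feature from the slow store into a fast key-value store. The following is a minimal sketch under that assumption; the row layout and the `reverse_etl` function are hypothetical and say nothing about how Chalk's managed pipeline is built.

```python
def reverse_etl(offline_rows, online_store):
    """Copy the newest value per (entity, feature) into the online store."""
    for row in sorted(offline_rows, key=lambda r: r["ts"]):
        key = (row["entity_id"], row["feature"])
        current = online_store.get(key)
        # Never overwrite a newer online value with an older offline one.
        if current is None or row["ts"] >= current["ts"]:
            online_store[key] = {"value": row["value"], "ts": row["ts"]}
    return online_store

warehouse = [
    {"entity_id": 1, "feature": "datawarehouse_feature", "value": 0.2, "ts": 100},
    {"entity_id": 1, "feature": "datawarehouse_feature", "value": 0.7, "ts": 200},
    {"entity_id": 2, "feature": "datawarehouse_feature", "value": 0.5, "ts": 150},
]
online = reverse_etl(warehouse, {})
```

After the run, an online lookup is a single dictionary read instead of a warehouse query.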


Deploy + query

Once you've defined your pipelines, you can rapidly deploy them to production with Chalk's CLI:

chalk apply

This creates a deployment of your project, which is served at a unique preview URL. You can promote this deployment to production, or perform QA workflows on your preview environment to make sure that your Chalk deployment performs as expected.

Once you promote your deployment to production, Chalk makes features available for low-latency online inference and offline training. Significantly, Chalk uses the exact same source code to serve temporally consistent training sets to data scientists and live feature values to models. This re-use ensures that feature values from online and offline contexts match and dramatically cuts development time.

Online inference

Chalk's online store and feature computation engine make it easy to query features with ultra-low latency, so you can use your feature pipelines to serve online inference use cases.

Integrating Chalk with your production application takes minutes via Chalk's simple REST API, called here through the Python client:

from chalk.client import ChalkClient

result = ChalkClient().query(
    input={
        User.name: "Katherine Johnson"
    },
    output=[User.fico_score],
    staleness={User.fico_score: "10m"},
)
result.get_feature_value(User.fico_score)

Features computed to serve online requests are also replicated to Chalk's offline store for historical performance tracking and training set generation.

Offline training

Data scientists can use Chalk's Jupyter integration to create datasets and train models. Datasets are stored and tracked so that they can be re-used by other modelers, and so that model provenance is tracked for audit and reproducibility.

# offline_query is called on a client instance, not on the class.
X = ChalkClient().offline_query(
    input=labels[[User.uid, timestamp]],
    output=[
        User.returned_transactions_last_60,
        User.user_account_name_match_score,
        User.socure_score,
        User.identity.has_verified_phone,
        User.identity.is_voip_phone,
        User.identity.account_age_days,
        User.identity.email_age,
    ],
)

Chalk datasets are always "temporally consistent." This means that you can provide labels with different past timestamps and get historical features that represent what your application would have retrieved online at those past times. Temporal consistency ensures that your model training doesn't mix "future" and "past" data.
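
Temporal consistency amounts to an "as of" lookup: for each label timestamp, take the latest feature value observed at or before that time and never a later one. Here is a minimal sketch using integer timestamps; the `HISTORY` table, `value_as_of`, and `build_training_set` are hypothetical names, and Chalk's offline store is not implemented this way as far as this document states.

```python
from bisect import bisect_right

# Hypothetical feature history: entity id -> sorted list of (observed_at, value).
HISTORY = {
    1: [(10, 600), (20, 640), (30, 700)],
    2: [(15, 550)],
}

def value_as_of(entity_id, ts):
    """Return the latest value observed at or before ts, or None if none exists."""
    observations = HISTORY.get(entity_id, [])
    times = [t for t, _ in observations]
    idx = bisect_right(times, ts)
    return observations[idx - 1][1] if idx else None

def build_training_set(labels):
    """labels: list of (entity_id, label_ts, label) tuples."""
    return [
        (entity_id, ts, value_as_of(entity_id, ts), label)
        for entity_id, ts, label in labels
    ]

rows = build_training_set([(1, 25, 0), (1, 5, 1), (2, 15, 0)])
```

The label at time 25 receives the value recorded at time 20, not the later value from time 30, and the label at time 5 receives no value because nothing had been observed yet. That is the "no future data" guarantee in miniature.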
