Python SDK for Chalk

Chalk

Chalk enables innovative machine learning teams to focus on building the unique products and models that make their business stand out. Behind the scenes, Chalk handles data infrastructure with a best-in-class developer experience. Here's how it works:


Develop

Chalk makes it simple to develop feature pipelines for machine learning. Define Python functions using the libraries and tools you're familiar with instead of specialized DSLs. Chalk then orchestrates your functions into pipelines that execute in parallel on a Rust-based engine and coordinates the infrastructure required to compute features.

Features

To get started, define your features with Pydantic-inspired Python classes. You can define schemas, specify relationships, and add metadata to help your team share and re-use work.

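from datetime import date
from typing import Optional

from chalk.features import DataFrame, features, has_many

# Transaction is assumed to be another @features class with a user_id field.
@features
class User:
    id: int
    full_name: str
    nickname: Optional[str]
    email: Optional[str]
    birthday: date
    credit_score: float
    datawarehouse_feature: float

    # One-to-many join: all transactions whose user_id matches this user.
    transactions: DataFrame[Transaction] = has_many(lambda: Transaction.user_id == User.id)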

Resolvers

Next, tell Chalk how to compute your features. Chalk ingests data from your existing data stores, and lets you use Python to compute features with feature resolvers. Feature resolvers are declared with the decorators @online and @offline, and can depend on the outputs of other feature resolvers.

Resolvers make it easy to rapidly integrate a wide variety of data sources, join them together, and use them in your model.

SQL

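from chalk import online
from chalk.features import Features
from chalk.sql import PostgreSQLSource

pg = PostgreSQLSource()

@online
def get_user(uid: User.id) -> Features[User.full_name, User.email]:
    # Column names in the result map onto the requested User features.
    return pg.query_string(
        "select email, full_name from users where id=:id",
        args=dict(id=uid)
    ).one()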

REST

import requests

@online
def get_socure_score(uid: User.id) -> Features[User.socure_score]:
    # The JSON key must be the string "id", not the built-in function id.
    return (
        requests.get("https://api.socure.com", json={
            "id": uid
        }).json()['socure_score']
    )

Execute

Once you've defined your features and resolvers, Chalk orchestrates them into flexible pipelines that make training and executing models easy.
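
To make the orchestration idea concrete, here is a minimal sketch in plain Python. It does not use the Chalk SDK and is not a description of Chalk's engine; the `resolver` decorator, the `RESOLVERS` registry, and the `resolve` function are hypothetical names invented for illustration. Each resolver declares the feature it produces and the features it needs, and a small planner computes dependencies before calling it. The sketch runs sequentially, whereas the text above describes parallel execution.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical registry: output feature name -> (input feature names, function).
RESOLVERS: Dict[str, Tuple[List[str], Callable[..., Any]]] = {}


def resolver(output: str, inputs: List[str]):
    """Register a function as the way to compute `output` from `inputs`."""
    def decorator(fn):
        RESOLVERS[output] = (inputs, fn)
        return fn
    return decorator


@resolver(output="user.full_name", inputs=["user.id"])
def get_full_name(uid: int) -> str:
    # Stand-in for a database lookup.
    return {1: "Katherine Johnson"}[uid]


@resolver(output="user.name_length", inputs=["user.full_name"])
def get_name_length(full_name: str) -> int:
    return len(full_name)


def resolve(feature: str, known: Dict[str, Any]) -> Any:
    """Compute `feature`, resolving its dependencies first and memoizing results."""
    if feature in known:
        return known[feature]
    if feature not in RESOLVERS:
        raise KeyError(f"no resolver produces {feature!r}")
    inputs, fn = RESOLVERS[feature]
    args = [resolve(name, known) for name in inputs]
    known[feature] = fn(*args)
    return known[feature]


values = {"user.id": 1}
name_length = resolve("user.name_length", values)
```

Only `user.id` is supplied by the caller; the planner discovers that `user.name_length` needs `user.full_name` and computes that intermediate feature on the way.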

Chalk has built-in support for feature engineering workflows -- no need to manage Airflow or orchestrate complicated streaming flows. You can execute resolver pipelines with declarative caching, ingest batch data on a schedule, and easily make slow sources available online for low-latency serving.

Caching

Many data sources (like vendor APIs) are too slow for online use cases and/or charge a high dollar cost-per-call. Chalk lets you optimize latency and cost by defining declarative caching policies which are well-integrated throughout the system. You no longer have to manage Redis, Memcached, DynamoDB, or spend time tuning cache-warming pipelines.

Add a caching policy with one line of code in your feature definition:

@features
class ExternalBankAccount:
-   balance: int
+   balance: int = feature(max_staleness="1d")

Optionally warm feature caches by executing resolvers on a schedule:

@online(cron="1d")
def fn(id: User.id) -> User.credit_score:
  return redshift.query(...).all()

Or override staleness tolerances at query time when you need fresher data for your models:

ChalkClient().query(
    ...,
    output=[User.fraud_score],
    staleness={User.fraud_score: "1m"}
)
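
The behavior of a staleness policy can be sketched without the Chalk SDK. The following is a conceptual illustration only, not Chalk's implementation: `StalenessCache`, `parse_duration`, and `slow_vendor_call` are hypothetical names, and the duration format (an integer followed by `s`, `m`, `h`, or `d`) is an assumption made for the example. A cached value is served while it is younger than the tolerance, and a per-call tolerance overrides the default.

```python
import time
from typing import Callable, Dict, Optional, Tuple

# Assumed unit suffixes for duration strings such as "1d" or "10m".
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_duration(text: str) -> float:
    """Convert a string like "10m" into seconds (assumed format: <int><unit>)."""
    return int(text[:-1]) * _UNIT_SECONDS[text[-1]]


class StalenessCache:
    """Caches one value per key and recomputes it once it is too old."""

    def __init__(self, compute: Callable[[str], float], max_staleness: str,
                 clock: Callable[[], float] = time.time):
        self._compute = compute
        self._default = parse_duration(max_staleness)
        self._clock = clock
        self._store: Dict[str, Tuple[float, float]] = {}  # key -> (value, stored_at)

    def get(self, key: str, max_staleness: Optional[str] = None) -> float:
        # A per-call tolerance overrides the default declared on the cache.
        limit = self._default if max_staleness is None else parse_duration(max_staleness)
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[1] <= limit:
            return hit[0]
        value = self._compute(key)
        self._store[key] = (value, now)
        return value


calls = []
fake_now = [0.0]  # Controllable clock so the example is deterministic.


def slow_vendor_call(key: str) -> float:
    calls.append(key)
    return 700.0 + len(calls)


cache = StalenessCache(slow_vendor_call, max_staleness="1d", clock=lambda: fake_now[0])
first = cache.get("user-1")                       # miss: calls the vendor
fake_now[0] = 3600.0                              # one hour later
second = cache.get("user-1")                      # hit: still within one day
third = cache.get("user-1", max_staleness="1m")   # too old for a 1-minute tolerance
```

The vendor function runs twice across the three reads: once on the initial miss and once when the caller asks for data no older than one minute.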

Batch ETL ingestion

Chalk also makes it simple to generate training sets from data warehouse sources -- join data from services like S3, Redshift, BQ, Snowflake (or other custom sources) with historical features computed online. Specify a cron schedule on an @offline resolver and Chalk automatically ingests data with support for incremental reads:

@offline(cron="1h")
def fn() -> Features[User.id, User.datawarehouse_feature]:
  return redshift.query(...).incremental()

Chalk makes this data available for point-in-time-correct dataset generation for data science use-cases. Every pipeline has built-in monitoring and alerting to ensure data quality and timeliness.
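
The idea behind an incremental read can be illustrated with a high-water mark: each run remembers the newest timestamp it has seen and only picks up rows that are newer. This is a plain-Python sketch under that assumption, not the mechanism Chalk uses internally; `IncrementalReader` and the in-memory `warehouse` list are hypothetical stand-ins for a scheduled resolver and a warehouse table.

```python
from datetime import datetime
from typing import List, Optional, Tuple

Row = Tuple[int, float, datetime]  # (user_id, feature_value, updated_at)


class IncrementalReader:
    """Remembers the newest `updated_at` seen and only returns newer rows."""

    def __init__(self) -> None:
        self.high_water_mark: Optional[datetime] = None

    def read(self, table: List[Row]) -> List[Row]:
        mark = self.high_water_mark
        new_rows = [row for row in table if mark is None or row[2] > mark]
        if new_rows:
            # Advance the mark so the next run skips everything returned here.
            self.high_water_mark = max(row[2] for row in new_rows)
        return new_rows


warehouse: List[Row] = [
    (1, 0.25, datetime(2024, 1, 1, 9, 0)),
    (2, 0.75, datetime(2024, 1, 1, 9, 30)),
]
reader = IncrementalReader()
first_batch = reader.read(warehouse)    # both rows
warehouse.append((1, 0.5, datetime(2024, 1, 1, 10, 15)))
second_batch = reader.read(warehouse)   # only the newly added row
third_batch = reader.read(warehouse)    # nothing new
```

Each scheduled run therefore does work proportional to the new data rather than to the full table.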

Reverse ETL

When your model needs to use features that are canonically stored in a high-latency data source (like a data warehouse), Chalk's Reverse ETL support makes it simple to bring those features online and serve them quickly.

Add a single line of code to an offline resolver, and Chalk constructs a managed reverse ETL pipeline for that data source:

@offline(offline_to_online_etl="5m")

Now data from slow offline data sources is automatically available for low-latency online serving.


Deploy + query

Once you've defined your pipelines, you can rapidly deploy them to production with Chalk's CLI:

chalk apply

This creates a deployment of your project, which is served at a unique preview URL. You can promote this deployment to production, or perform QA workflows on your preview environment to make sure that your Chalk deployment performs as expected.

Once you promote your deployment to production, Chalk makes features available for low-latency online inference and offline training. Significantly, Chalk uses the exact same source code to serve temporally-consistent training sets to data scientists and live feature values to models. This re-use ensures that feature values from online and offline contexts match and dramatically cuts development time.

Online inference

Chalk's online store and feature computation engine make it easy to query features with very low latency, so you can use your feature pipelines to serve online inference use cases.

Integrating Chalk with your production application takes minutes via Chalk's simple REST API:

result = ChalkClient().query(
    input={
        User.full_name: "Katherine Johnson"
    },
    output=[User.fico_score],
    staleness={User.fico_score: "10m"},
)
result.get_feature_value(User.fico_score)

Features computed to serve online requests are also replicated to Chalk's offline store for historical performance tracking and training set generation.

Offline training

Data scientists can use Chalk's Jupyter integration to create datasets and train models. Datasets are stored and tracked so that they can be re-used by other modelers, and so that model provenance is tracked for audit and reproducibility.

X = ChalkClient().offline_query(
    input=labels[[User.uid, timestamp]],
    output=[
        User.returned_transactions_last_60,
        User.user_account_name_match_score,
        User.socure_score,
        User.identity.has_verified_phone,
        User.identity.is_voip_phone,
        User.identity.account_age_days,
        User.identity.email_age,
    ],
)

Chalk datasets are always "temporally consistent." This means that you can provide labels with different past timestamps and get historical features that represent what your application would have retrieved online at those past times. Temporal consistency ensures that your model training doesn't mix "future" and "past" data.
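
Temporal consistency amounts to an "as of" lookup: for each label timestamp, take the most recent feature value observed at or before that time and never a later one. The sketch below shows that rule in plain Python with the standard `bisect` module. It is an illustration of the concept, not Chalk's implementation; `value_as_of` and `credit_score_history` are hypothetical names and the data is made up.

```python
from bisect import bisect_right
from datetime import datetime
from typing import Dict, List, Optional, Tuple

# Hypothetical feature history: entity id -> observations sorted by time.
History = Dict[int, List[Tuple[datetime, float]]]


def value_as_of(history: History, entity_id: int, ts: datetime) -> Optional[float]:
    """Return the latest value observed at or before `ts`, or None if none exists."""
    observations = history.get(entity_id, [])
    times = [observed_at for observed_at, _ in observations]
    index = bisect_right(times, ts)
    return observations[index - 1][1] if index else None


credit_score_history: History = {
    1: [
        (datetime(2024, 1, 1), 640.0),
        (datetime(2024, 3, 1), 700.0),
    ],
}

# Labels carry their own timestamps; each row gets the value known at that time.
labels = [
    (1, datetime(2023, 12, 1)),  # before any observation
    (1, datetime(2024, 2, 1)),   # between the two observations
    (1, datetime(2024, 3, 1)),   # exactly at the second observation
]
training_rows = [
    (uid, ts, value_as_of(credit_score_history, uid, ts)) for uid, ts in labels
]
scores = [row[2] for row in training_rows]
```

The February label receives the January score even though a newer March score exists, which is what keeps "future" data out of the training row.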
