Python SDK for Chalk

Chalk

Chalk enables innovative machine learning teams to focus on building the unique products and models that make their business stand out. Behind the scenes, Chalk handles data infrastructure with a best-in-class developer experience. Here’s how it works:


Develop

Chalk makes it simple to develop feature pipelines for machine learning. Define Python functions using the libraries and tools you're familiar with instead of specialized DSLs. Chalk then orchestrates your functions into pipelines that execute in parallel on a Rust-based engine and coordinates the infrastructure required to compute features.

Features

To get started, define your features with Pydantic-inspired Python classes. You can define schemas, specify relationships, and add metadata to help your team share and re-use work.

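from datetime import date
from typing import Optional

from chalk.features import DataFrame, features, has_many

@features
class User:
    id: int
    full_name: str
    nickname: Optional[str]
    email: Optional[str]
    birthday: date
    credit_score: float
    datawarehouse_feature: float

    # Transaction is another @features class, defined elsewhere in the project.
    transactions: DataFrame[Transaction] = has_many(lambda: Transaction.user_id == User.id)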

Resolvers

Next, tell Chalk how to compute your features. Chalk ingests data from your existing data stores, and lets you use Python to compute features with feature resolvers. Feature resolvers are declared with the decorators @online and @offline, and can depend on the outputs of other feature resolvers.

Resolvers make it easy to rapidly integrate a wide variety of data sources, join them together, and use them in your model.

SQL

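from chalk import online
from chalk.features import Features
from chalk.sql import PostgreSQLSource

pg = PostgreSQLSource()

@online
def get_user(uid: User.id) -> Features[User.full_name, User.email]:
    # Bind :id as a query parameter rather than formatting it into the SQL string.
    return pg.query_string(
        "select email, full_name from users where id=:id",
        args=dict(id=uid)
    ).one()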

REST

import requests

@online
def get_socure_score(uid: User.id) -> Features[User.socure_score]:
    # Use the string key "id"; a bare `id` would refer to Python's builtin function.
    response = requests.get("https://api.socure.com", json={"id": uid})
    return response.json()["socure_score"]
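
To picture how resolvers that depend on one another's outputs can be ordered into a pipeline, here is a minimal sketch in plain Python. It is not Chalk's engine or API: the `RESOLVERS` registry, the `plan` and `compute` helpers, and the feature names are all hypothetical, and the standard-library `graphlib` module stands in for the real scheduler.

```python
from graphlib import TopologicalSorter

# Hypothetical registry: output feature -> (input features, function).
RESOLVERS = {
    "user.email": (["user.id"], lambda uid: f"user{uid}@example.com"),
    "user.email_domain": (["user.email"], lambda email: email.split("@")[1]),
    "user.is_corporate": (["user.email_domain"], lambda d: d != "gmail.com"),
}


def plan(target: str) -> list[str]:
    """Return the resolver outputs needed for `target`, in dependency order."""
    graph: dict[str, set[str]] = {}
    stack = [target]
    while stack:
        feature = stack.pop()
        if feature in graph or feature not in RESOLVERS:
            continue  # already visited, or supplied directly as a query input
        inputs, _ = RESOLVERS[feature]
        graph[feature] = {i for i in inputs if i in RESOLVERS}
        stack.extend(inputs)
    return list(TopologicalSorter(graph).static_order())


def compute(target: str, known: dict[str, object]) -> object:
    """Run each planned resolver once, feeding its output to its dependents."""
    values = dict(known)
    for feature in plan(target):
        inputs, fn = RESOLVERS[feature]
        values[feature] = fn(*(values[i] for i in inputs))
    return values[target]
```

Calling `compute("user.is_corporate", {"user.id": 7})` runs the three toy resolvers in order, starting from the one input the caller supplied. Resolvers with no dependency between them could run in parallel; this sketch runs everything sequentially for clarity.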

Execute

Once you've defined your features and resolvers, Chalk orchestrates them into flexible pipelines that make training and executing models easy.

Chalk has built-in support for feature engineering workflows -- no need to manage Airflow or orchestrate complicated streaming flows. You can execute resolver pipelines with declarative caching, ingest batch data on a schedule, and easily make slow sources available online for low-latency serving.

Caching

Many data sources (like vendor APIs) are too slow for online use cases or charge a high cost per call. Chalk lets you optimize latency and cost by defining declarative caching policies that are integrated throughout the system. You no longer have to manage Redis, Memcached, or DynamoDB, or spend time tuning cache-warming pipelines.

Add a caching policy with one line of code in your feature definition:

@features
class ExternalBankAccount:
-   balance: int
+   balance: int = feature(max_staleness="1d")

Optionally warm feature caches by executing resolvers on a schedule:

@online(cron="1d")
def fn(id: User.id) -> User.credit_score:
  return redshift.query(...).all()

Or override staleness tolerances at query time when you need fresher data for your models:

ChalkClient().query(
    ...,
    output=[User.fraud_score],
    staleness={User.fraud_score: "1m"},
)
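
The staleness rules in this section can be pictured with a small plain-Python sketch. It is an illustration under stated assumptions, not Chalk's implementation: `parse_duration`, `is_fresh`, and `effective_staleness` are hypothetical helpers, and the parser only handles single-unit strings such as "1d" or "10m".

```python
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical unit table for duration strings such as "1d" or "10m".
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_duration(text: str) -> timedelta:
    """Convert a single-unit string like "10m" into a timedelta."""
    amount, unit = int(text[:-1]), text[-1]
    return timedelta(**{_UNITS[unit]: amount})


def is_fresh(computed_at: datetime, now: datetime, max_staleness: str) -> bool:
    """A cached value is usable if its age does not exceed max_staleness."""
    return now - computed_at <= parse_duration(max_staleness)


def effective_staleness(default: str, override: Optional[str] = None) -> str:
    """A query-time override, when given, replaces the feature's default."""
    return override if override is not None else default
```

With a feature default of "1d", a value computed five minutes ago is served from cache. The same value is recomputed if the query overrides the tolerance to "1m".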

Batch ETL ingestion

Chalk also makes it simple to generate training sets from data warehouse sources -- join data from services like S3, Redshift, BQ, Snowflake (or other custom sources) with historical features computed online. Specify a cron schedule on an @offline resolver and Chalk automatically ingests data with support for incremental reads:

@offline(cron="1h")
def fn() -> Features[User.id, User.datawarehouse_feature]:
    return redshift.query(...).incremental()

Chalk makes this data available for point-in-time-correct dataset generation for data science use-cases. Every pipeline has built-in monitoring and alerting to ensure data quality and timeliness.
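
One way to think about the incremental reads mentioned above is a high-water mark: each scheduled run picks up only the rows that changed since the previous run. The sketch below shows that idea in plain Python. The `IncrementalReader` class and the `updated_at` column are hypothetical and say nothing about how Chalk tracks progress internally.

```python
from datetime import datetime


class IncrementalReader:
    """Hypothetical reader that remembers the newest row timestamp it has seen."""

    def __init__(self) -> None:
        self.high_water_mark = datetime.min

    def read_new(self, rows: list[dict]) -> list[dict]:
        """Return only rows updated after the previous run, then advance the mark."""
        fresh = [r for r in rows if r["updated_at"] > self.high_water_mark]
        if fresh:
            self.high_water_mark = max(r["updated_at"] for r in fresh)
        return fresh
```

The first call returns every row; later calls return only rows whose `updated_at` is newer than anything seen before, so each run does work proportional to what changed.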

Reverse ETL

When your model needs to use features that are canonically stored in a high-latency data source (like a data warehouse), Chalk's Reverse ETL support makes it simple to bring those features online and serve them quickly.

Add a single line of code to an offline resolver, and Chalk constructs a managed reverse ETL pipeline for that data source:

@offline(offline_to_online_etl="5m")

Now data from slow offline data sources is automatically available for low-latency online serving.


Deploy + query

Once you've defined your pipelines, you can rapidly deploy them to production with Chalk's CLI:

chalk apply

This creates a deployment of your project, which is served at a unique preview URL. You can promote this deployment to production, or perform QA workflows on your preview environment to make sure that your Chalk deployment performs as expected.

Once you promote your deployment to production, Chalk makes features available for low-latency online inference and offline training. Significantly, Chalk uses the exact same source code to serve temporally-consistent training sets to data scientists and live feature values to models. This re-use ensures that feature values from online and offline contexts match and dramatically cuts development time.

Online inference

Chalk's online store and feature computation engine make it easy to query features with very low latency, so you can use your feature pipelines to serve online inference use-cases.

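Integrating Chalk with your production application takes minutes via Chalk's simple API, shown here through the Python client:

from chalk.client import ChalkClient

# Request one feature, accepting a cached value up to ten minutes old.
result = ChalkClient().query(
    input={
        User.name: "Katherine Johnson"
    },
    output=[User.fico_score],
    staleness={User.fico_score: "10m"},
)
result.get_feature_value(User.fico_score)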

Features computed to serve online requests are also replicated to Chalk's offline store for historical performance tracking and training set generation.

Offline training

Data scientists can use Chalk's Jupyter integration to create datasets and train models. Datasets are stored and tracked so that they can be re-used by other modelers, and so that model provenance is tracked for audit and reproducibility.

X = ChalkClient().offline_query(
    input=labels[[User.uid, timestamp]],
    output=[
        User.returned_transactions_last_60,
        User.user_account_name_match_score,
        User.socure_score,
        User.identity.has_verified_phone,
        User.identity.is_voip_phone,
        User.identity.account_age_days,
        User.identity.email_age,
    ],
)

Chalk datasets are always "temporally consistent." This means that you can provide labels with different past timestamps and get historical features that represent what your application would have retrieved online at those past times. Temporal consistency ensures that your model training doesn't mix "future" and "past" data.
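
The point-in-time lookup behind temporal consistency can be sketched in a few lines of plain Python. This is only an illustration of the idea, not Chalk's offline store: `HISTORY` and `value_as_of` are hypothetical, and the values are made up.

```python
from bisect import bisect_right
from datetime import datetime
from typing import Optional

# Hypothetical observation log for one feature of one user:
# (time the value was observed, value), sorted by time.
HISTORY = [
    (datetime(2024, 1, 1), 610),
    (datetime(2024, 2, 1), 640),
    (datetime(2024, 3, 1), 700),
]


def value_as_of(history, ts: datetime) -> Optional[int]:
    """Return the latest value observed at or before `ts`, never a later one."""
    times = [observed_at for observed_at, _ in history]
    idx = bisect_right(times, ts)
    return history[idx - 1][1] if idx else None
```

A label timestamped mid-February gets the value observed on February 1, even though a newer value exists in the log. A label older than the first observation gets no value at all rather than a "future" one.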
