Python SDK for Chalk

Chalk

Chalk enables innovative machine learning teams to focus on building the unique products and models that make their business stand out. Behind the scenes, Chalk handles data infrastructure with a best-in-class developer experience. Here's how it works:


Develop

Chalk makes it simple to develop feature pipelines for machine learning. Define Python functions using the libraries and tools you're familiar with instead of specialized DSLs. Chalk then orchestrates your functions into pipelines that execute in parallel on a Rust-based engine and coordinates the infrastructure required to compute features.

Features

To get started, define your features with Pydantic-inspired Python classes. You can define schemas, specify relationships, and add metadata to help your team share and re-use work.

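from datetime import date
from typing import Optional

from chalk.features import DataFrame, features, has_many

@features
class User:
    id: int
    full_name: str
    nickname: Optional[str]
    email: Optional[str]
    birthday: date
    credit_score: float
    datawarehouse_feature: float

    # Transaction is assumed to be another @features class defined elsewhere.
    transactions: DataFrame[Transaction] = has_many(lambda: Transaction.user_id == User.id)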

Resolvers

Next, tell Chalk how to compute your features. Chalk ingests data from your existing data stores, and lets you use Python to compute features with feature resolvers. Feature resolvers are declared with the decorators @online and @offline, and can depend on the outputs of other feature resolvers.

Resolvers make it easy to rapidly integrate a wide variety of data sources, join them together, and use them in your model.

SQL

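from chalk import online
from chalk.features import Features
from chalk.sql import PostgreSQLSource

pg = PostgreSQLSource()

@online
def get_user(uid: User.id) -> Features[User.full_name, User.email]:
    # Look up a single user row; :id is bound from args.
    return pg.query_string(
        "select email, full_name from users where id=:id",
        args=dict(id=uid)
    ).one()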

REST

import requests

@online
def get_socure_score(uid: User.id) -> Features[User.socure_score]:
    return (
        requests.get("https://api.socure.com", json={
            "id": uid  # quoted key; a bare `id` would refer to the Python builtin
        }).json()['socure_score']
    )

Execute

Once you've defined your features and resolvers, Chalk orchestrates them into flexible pipelines that make training and executing models easy.
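As a rough mental model of that orchestration (not Chalk's actual engine), each resolver can be viewed as a node that consumes some features and produces others, and a topological sort over those dependencies yields a valid execution order. The sketch below uses only the Python standard library; `plan_resolvers`, the feature strings, and `get_name_match` are hypothetical names chosen for illustration.

```python
from graphlib import TopologicalSorter


def plan_resolvers(resolvers):
    """Return resolver names in an order that respects feature dependencies.

    `resolvers` maps a resolver name to (input_features, output_features).
    Illustrative model only; this is not how Chalk schedules work internally.
    """
    # Map each feature to the resolver that produces it.
    producer = {}
    for name, (_inputs, outputs) in resolvers.items():
        for feature in outputs:
            producer[feature] = name

    # A resolver depends on whichever resolvers produce its inputs.
    graph = {
        name: {producer[f] for f in inputs if f in producer}
        for name, (inputs, _outputs) in resolvers.items()
    }
    return list(TopologicalSorter(graph).static_order())


# Hypothetical resolvers: two read from sources, one combines their outputs.
resolvers = {
    "get_user": (["user.id"], ["user.full_name", "user.email"]),
    "get_socure_score": (["user.id"], ["user.socure_score"]),
    "get_name_match": (
        ["user.full_name", "user.socure_score"],
        ["user.name_match_score"],
    ),
}
order = plan_resolvers(resolvers)
```

Resolvers with no dependency on each other (here `get_user` and `get_socure_score`) are the ones an engine could run in parallel.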

Chalk has built-in support for feature engineering workflows -- no need to manage Airflow or orchestrate complicated streaming flows. You can execute resolver pipelines with declarative caching, ingest batch data on a schedule, and easily make slow sources available online for low-latency serving.

Caching

Many data sources (like vendor APIs) are too slow for online use cases or charge a high cost per call. Chalk lets you optimize latency and cost by defining declarative caching policies that are integrated throughout the system. You no longer have to manage Redis, Memcached, or DynamoDB, or spend time tuning cache-warming pipelines.

Add a caching policy with one line of code in your feature definition:

@features
class ExternalBankAccount:
-   balance: int
+   balance: int = feature(max_staleness="1d")

Optionally warm feature caches by executing resolvers on a schedule:

@online(cron="1d")
def fn(id: User.id) -> User.credit_score:
    return redshift.query(...).all()

Or override staleness tolerances at query time when you need fresher data for your models:

ChalkClient().query(
    ...,
    output=[User.fraud_score],
    staleness={User.fraud_score: "1m"}
)
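To make the staleness semantics concrete, here is a minimal in-memory sketch of a cache that serves a value only when it is fresh enough, with a per-feature default and an optional per-query override. It is a toy model, not Chalk's implementation: `StalenessCache` and `parse_duration` are hypothetical, and the duration format (`s`, `m`, `h`, `d` suffixes) is an assumption based on the strings used above.

```python
import re
from datetime import datetime, timedelta

_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_duration(text):
    """Parse strings such as "10m" or "1d" into a timedelta (assumed format)."""
    match = re.fullmatch(r"(\d+)([smhd])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return timedelta(**{_UNITS[match.group(2)]: int(match.group(1))})


class StalenessCache:
    """Toy feature cache: a value is served only if it is fresh enough."""

    def __init__(self, default_max_staleness):
        self.default = parse_duration(default_max_staleness)
        self._store = {}  # key -> (value, computed_at)

    def put(self, key, value, computed_at):
        self._store[key] = (value, computed_at)

    def get(self, key, now, max_staleness=None):
        """Return the cached value, or None when it is missing or too stale."""
        limit = parse_duration(max_staleness) if max_staleness else self.default
        entry = self._store.get(key)
        if entry is None:
            return None
        value, computed_at = entry
        return value if now - computed_at <= limit else None


cache = StalenessCache(default_max_staleness="1d")
t0 = datetime(2024, 1, 1, 12, 0)
key = ("user", 1, "fraud_score")
cache.put(key, 0.12, computed_at=t0)

later = t0 + timedelta(minutes=5)
hit = cache.get(key, now=later)                       # 5 minutes old, within 1d
miss = cache.get(key, now=later, max_staleness="1m")  # too old for a 1m override
```

In this model a `None` result is the signal to run the underlying resolver again and store the fresh value.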

Batch ETL ingestion

Chalk also makes it simple to generate training sets from data warehouse sources: join data from services like S3, Redshift, BigQuery, and Snowflake (or other custom sources) with historical features computed online. Specify a cron schedule on an @offline resolver and Chalk automatically ingests data with support for incremental reads:

@offline(cron="1h")
def fn() -> Features[User.id, User.datawarehouse_feature]:
    return redshift.query(...).incremental()

Chalk makes this data available for point-in-time-correct dataset generation for data science use-cases. Every pipeline has built-in monitoring and alerting to ensure data quality and timeliness.
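One common way to picture incremental reads is a high-water mark: each scheduled run fetches only rows newer than the latest timestamp seen so far. The sketch below illustrates that idea with an in-memory list standing in for a warehouse table; `read_incremental` and the row layout are hypothetical, and Chalk's own mechanism may differ.

```python
from datetime import datetime


def read_incremental(rows, last_seen):
    """Return rows newer than `last_seen` plus the advanced high-water mark.

    `rows` is a list of (updated_at, payload) tuples standing in for a
    warehouse table; `last_seen` is None on the first run.
    """
    fresh = [r for r in rows if last_seen is None or r[0] > last_seen]
    # Keep the old mark when nothing new arrived.
    new_mark = max((r[0] for r in fresh), default=last_seen)
    return fresh, new_mark


table = [
    (datetime(2024, 1, 1, 9), {"user_id": 1, "datawarehouse_feature": 0.4}),
    (datetime(2024, 1, 1, 10), {"user_id": 2, "datawarehouse_feature": 0.9}),
]
first_batch, mark = read_incremental(table, last_seen=None)

# A new row lands before the next scheduled run.
table.append((datetime(2024, 1, 1, 11), {"user_id": 1, "datawarehouse_feature": 0.5}))
second_batch, mark = read_incremental(table, last_seen=mark)
```

The second run returns only the newly appended row, so the cost of each run scales with the amount of new data rather than the size of the table.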

Reverse ETL

When your model needs to use features that are canonically stored in a high-latency data source (like a data warehouse), Chalk's Reverse ETL support makes it simple to bring those features online and serve them quickly.

Add a single line of code to an offline resolver, and Chalk constructs a managed reverse ETL pipeline for that data source:

@offline(offline_to_online_etl="5m")

Now data from slow offline data sources is automatically available for low-latency online serving.
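Conceptually, a reverse ETL job copies the newest value per entity from the offline source into a fast key-value store. The following sketch models that with plain Python containers; `sync_to_online`, the tuple layout, and the dict-backed "online store" are all assumptions made for illustration, not a description of Chalk's managed pipeline.

```python
from datetime import datetime


def sync_to_online(offline_rows, online_store):
    """Copy the newest offline value per entity into an in-memory online store.

    `offline_rows` holds (entity_id, observed_at, value) tuples and
    `online_store` maps entity_id -> (observed_at, value).
    Returns the number of writes performed.
    """
    writes = 0
    for entity_id, observed_at, value in offline_rows:
        current = online_store.get(entity_id)
        # Only overwrite when the offline row is newer than what is served.
        if current is None or observed_at > current[0]:
            online_store[entity_id] = (observed_at, value)
            writes += 1
    return writes


offline_rows = [
    (1, datetime(2024, 1, 1), 0.40),
    (1, datetime(2024, 1, 2), 0.55),
    (2, datetime(2024, 1, 1), 0.90),
]
online_store = {}
first_sync = sync_to_online(offline_rows, online_store)
second_sync = sync_to_online(offline_rows, online_store)  # nothing new to copy
```

Running the sync on a fixed interval (such as the "5m" above) bounds how far the online copy can lag behind the offline source.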


Deploy + query

Once you've defined your pipelines, you can rapidly deploy them to production with Chalk's CLI:

chalk apply

This creates a deployment of your project, which is served at a unique preview URL. You can promote this deployment to production, or perform QA workflows on your preview environment to make sure that your Chalk deployment performs as expected.

Once you promote your deployment to production, Chalk makes features available for low-latency online inference and offline training. Significantly, Chalk uses the exact same source code to serve temporally consistent training sets to data scientists and live feature values to models. This reuse ensures that feature values from online and offline contexts match, and it dramatically cuts development time.

Online inference

Chalk's online store and feature computation engine make it easy to query features with ultra-low latency, so you can use your feature pipelines to serve online inference use cases.

Integrating Chalk with your production application takes minutes via Chalk's REST API, shown here through the Python client:

from chalk.client import ChalkClient

result = ChalkClient().query(
    input={
        User.full_name: "Katherine Johnson"
    },
    output=[User.fico_score],
    staleness={User.fico_score: "10m"},
)
result.get_feature_value(User.fico_score)

Features computed to serve online requests are also replicated to Chalk's offline store for historical performance tracking and training set generation.

Offline training

Data scientists can use Chalk's Jupyter integration to create datasets and train models. Datasets are stored and tracked so that they can be re-used by other modelers, and so that model provenance is tracked for audit and reproducibility.

X = ChalkClient().offline_query(
    input=labels[[User.uid, timestamp]],
    output=[
        User.returned_transactions_last_60,
        User.user_account_name_match_score,
        User.socure_score,
        User.identity.has_verified_phone,
        User.identity.is_voip_phone,
        User.identity.account_age_days,
        User.identity.email_age,
    ],
)

Chalk datasets are always "temporally consistent." This means that you can provide labels with different past timestamps and get historical features that represent what your application would have retrieved online at those past times. Temporal consistency ensures that your model training doesn't mix "future" and "past" data.
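The idea of temporal consistency can be illustrated with a point-in-time lookup: for each label timestamp, take the most recent feature value observed at or before that time, and never a later one. This is a minimal standard-library sketch; `value_as_of` and the sample history are hypothetical and only meant to show the lookup rule.

```python
from bisect import bisect_right
from datetime import datetime


def value_as_of(history, ts):
    """Return the latest value observed at or before `ts`, else None.

    `history` is a list of (observed_at, value) tuples sorted by time.
    """
    times = [observed_at for observed_at, _ in history]
    idx = bisect_right(times, ts)
    return history[idx - 1][1] if idx else None


credit_score_history = [
    (datetime(2024, 1, 1), 640.0),
    (datetime(2024, 2, 1), 675.0),
    (datetime(2024, 3, 1), 710.0),
]

# Label timestamps, including one that predates every observation.
labels = [datetime(2024, 1, 15), datetime(2024, 2, 20), datetime(2023, 12, 1)]
training_values = [value_as_of(credit_score_history, ts) for ts in labels]
```

The label dated before any observation gets `None` rather than a value from the future, which is the leak that point-in-time lookups are designed to prevent.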



