
Python SDK for Chalk

Chalk

Chalk enables innovative machine learning teams to focus on building the unique products and models that make their business stand out. Behind the scenes, Chalk handles data infrastructure with a best-in-class developer experience. Here's how it works:


Develop

Chalk makes it simple to develop feature pipelines for machine learning. Define Python functions using the libraries and tools you're familiar with instead of specialized DSLs. Chalk then orchestrates your functions into pipelines that execute in parallel on a Rust-based engine and coordinates the infrastructure required to compute features.

Features

To get started, define your features with Pydantic-inspired Python classes. You can define schemas, specify relationships, and add metadata to help your team share and re-use work.

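from datetime import date
from typing import Optional

from chalk.features import DataFrame, features, has_many

@features
class User:
    id: int
    full_name: str
    nickname: Optional[str]
    email: Optional[str]
    birthday: date
    credit_score: float
    datawarehouse_feature: float

    # Transaction is a separate @features class (not shown here).
    transactions: DataFrame[Transaction] = has_many(lambda: Transaction.user_id == User.id)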

Resolvers

Next, tell Chalk how to compute your features. Chalk ingests data from your existing data stores, and lets you use Python to compute features with feature resolvers. Feature resolvers are declared with the decorators @online and @offline, and can depend on the outputs of other feature resolvers.

Resolvers make it easy to rapidly integrate a wide variety of data sources, join them together, and use them in your model.

SQL

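from chalk import online
from chalk.features import Features
from chalk.sql import PostgreSQLSource

# Connection settings are assumed to be supplied by the environment.
pg = PostgreSQLSource()

@online
def get_user(uid: User.id) -> Features[User.full_name, User.email]:
    return pg.query_string(
        "select email, full_name from users where id=:id",
        args=dict(id=uid)
    ).one()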

REST

import requests

@online
def get_socure_score(uid: User.id) -> Features[User.socure_score]:
    return (
        requests.get("https://api.socure.com", json={
            "id": uid  # the key must be the string "id", not Python's builtin `id`
        }).json()['socure_score']
    )
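
To make the idea of resolvers depending on the outputs of other resolvers concrete, here is a minimal, library-free sketch of how a request for one feature could be satisfied by walking resolver dependencies. It is not Chalk's engine or API: `Resolver`, `REGISTRY`, `register`, `resolve`, and the feature names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


@dataclass(frozen=True)
class Resolver:
    """Hypothetical stand-in for a resolver: named inputs, one named output."""
    inputs: Tuple[str, ...]
    output: str
    fn: Callable[..., Any]


# Toy registry keyed by the feature each resolver produces (illustration only).
REGISTRY: Dict[str, Resolver] = {}


def register(inputs: Tuple[str, ...], output: str):
    """Decorator that records which features a function needs and produces."""
    def wrap(fn: Callable[..., Any]) -> Callable[..., Any]:
        REGISTRY[output] = Resolver(inputs, output, fn)
        return fn
    return wrap


@register(inputs=("user.id",), output="user.email")
def get_email(uid: int) -> str:
    return f"user{uid}@example.com"


@register(inputs=("user.email",), output="user.email_domain")
def get_email_domain(email: str) -> str:
    return email.split("@")[1]


def resolve(feature: str, known: Dict[str, Any]) -> Any:
    """Compute `feature`, recursively resolving any inputs not yet known."""
    if feature in known:
        return known[feature]
    resolver = REGISTRY[feature]  # KeyError if no resolver produces the feature
    args = [resolve(name, known) for name in resolver.inputs]
    known[feature] = resolver.fn(*args)
    return known[feature]


print(resolve("user.email_domain", {"user.id": 7}))  # example.com
```

The caller only names the feature it wants and the values it already has; the chain `user.id -> user.email -> user.email_domain` is discovered from the declared inputs and outputs.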

Execute

Once you've defined your features and resolvers, Chalk orchestrates them into flexible pipelines that make training and executing models easy.

Chalk has built-in support for feature engineering workflows -- no need to manage Airflow or orchestrate complicated streaming flows. You can execute resolver pipelines with declarative caching, ingest batch data on a schedule, and easily make slow sources available online for low-latency serving.

Caching

Many data sources (like vendor APIs) are too slow for online use cases or charge a high cost per call. Chalk lets you optimize latency and cost by defining declarative caching policies that are integrated throughout the system. You no longer have to manage Redis, Memcached, or DynamoDB, or spend time tuning cache-warming pipelines.

Add a caching policy with one line of code in your feature definition:

@features
class ExternalBankAccount:
-   balance: int
+   balance: int = feature(max_staleness="1d")

Optionally warm feature caches by executing resolvers on a schedule:

@online(cron="1d")
def fn(id: User.id) -> User.credit_score:
    return redshift.query(...).all()

Or override staleness tolerances at query time when you need fresher data for your models:

ChalkClient().query(
    ...,
    output=[User.fraud_score],
    staleness={User.fraud_score: "1m"}
)
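
The interaction between a default staleness policy and a query-time override can be sketched without any Chalk code. The example below is an illustration under stated assumptions, not Chalk's caching layer: `StalenessCache`, `parse_duration`, and `slow_vendor_call` are hypothetical names, and the duration format (an integer followed by `s`, `m`, `h`, or `d`) is assumed from the strings used above.

```python
import re
from datetime import datetime, timedelta
from typing import Any, Callable, Dict, Optional, Tuple

# Assumed duration format: an integer followed by s, m, h, or d (e.g. "1d").
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_duration(text: str) -> timedelta:
    """Turn a string such as "10m" into a timedelta."""
    match = re.fullmatch(r"(\d+)([smhd])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return timedelta(**{_UNITS[match.group(2)]: int(match.group(1))})


class StalenessCache:
    """Hypothetical cache that reuses a value only while it is fresh enough."""

    def __init__(self, default_max_staleness: str) -> None:
        self.default = parse_duration(default_max_staleness)
        self._store: Dict[str, Tuple[datetime, Any]] = {}

    def get(self, key: str, compute: Callable[[], Any], now: datetime,
            max_staleness: Optional[str] = None) -> Any:
        # A per-call override takes precedence over the default policy.
        limit = parse_duration(max_staleness) if max_staleness else self.default
        hit = self._store.get(key)
        if hit is not None and now - hit[0] <= limit:
            return hit[1]  # fresh enough: skip the expensive call
        value = compute()
        self._store[key] = (now, value)
        return value


calls = []


def slow_vendor_call() -> int:
    """Stand-in for an expensive API call; counts how often it runs."""
    calls.append(1)
    return 100 * len(calls)


cache = StalenessCache("1d")
t0 = datetime(2024, 1, 1, 12, 0)
later = t0 + timedelta(hours=2)
first = cache.get("balance", slow_vendor_call, now=t0)       # computed
second = cache.get("balance", slow_vendor_call, now=later)   # served from cache
third = cache.get("balance", slow_vendor_call, now=later,
                  max_staleness="1m")                         # recomputed
print(first, second, third)  # 100 100 200
```

The two-hour-old value satisfies the one-day default but not the one-minute override, so only the third call triggers a second vendor request.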

Batch ETL ingestion

Chalk also makes it simple to generate training sets from data warehouse sources -- join data from services like S3, Redshift, BQ, Snowflake (or other custom sources) with historical features computed online. Specify a cron schedule on an @offline resolver and Chalk automatically ingests data with support for incremental reads:

@offline(cron="1h")
def fn() -> Features[User.id, User.datawarehouse_feature]:
    return redshift.query(...).incremental()

Chalk makes this data available for point-in-time-correct dataset generation for data science use-cases. Every pipeline has built-in monitoring and alerting to ensure data quality and timeliness.
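
One common way to implement incremental reads is a high-water mark: remember the newest timestamp already ingested and fetch only rows newer than it on the next run. The sketch below illustrates that general technique and makes no claim about how Chalk implements `.incremental()`; `IncrementalReader`, `Row`, and the sample data are hypothetical.

```python
from datetime import datetime
from typing import List, Optional, Tuple

# Hypothetical warehouse row: (user_id, feature_value, updated_at).
Row = Tuple[int, float, datetime]


class IncrementalReader:
    """Returns only rows newer than the last timestamp it has already seen."""

    def __init__(self) -> None:
        self.watermark: Optional[datetime] = None

    def read_new(self, table: List[Row]) -> List[Row]:
        new_rows = [
            row for row in table
            if self.watermark is None or row[2] > self.watermark
        ]
        if new_rows:
            # Advance the watermark to the newest row just ingested.
            self.watermark = max(row[2] for row in new_rows)
        return new_rows


warehouse: List[Row] = [
    (1, 0.25, datetime(2024, 1, 1, 9, 0)),
    (2, 0.75, datetime(2024, 1, 1, 9, 30)),
]
reader = IncrementalReader()
first_batch = reader.read_new(warehouse)    # first run ingests both rows
warehouse.append((1, 0.40, datetime(2024, 1, 1, 10, 15)))
second_batch = reader.read_new(warehouse)   # next run ingests only the new row
print(len(first_batch), len(second_batch))  # 2 1
```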

Reverse ETL

When your model needs to use features that are canonically stored in a high-latency data source (like a data warehouse), Chalk's Reverse ETL support makes it simple to bring those features online and serve them quickly.

Add a single line of code to an offline resolver, and Chalk constructs a managed reverse ETL pipeline for that data source:

@offline(offline_to_online_etl="5m")

Now data from slow offline data sources is automatically available for low-latency online serving.
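
Conceptually, a reverse ETL step copies the most recent value per key from the slow store into a fast key-value store. The following is a rough sketch of that idea only, using plain dictionaries; `sync_to_online`, `OfflineRow`, and `OnlineStore` are hypothetical names and do not describe Chalk's managed pipeline.

```python
from datetime import datetime
from typing import Dict, Iterable, Tuple

# Hypothetical shapes: warehouse rows and an in-memory "online" store.
OfflineRow = Tuple[int, float, datetime]          # (user_id, value, computed_at)
OnlineStore = Dict[int, Tuple[float, datetime]]   # user_id -> (value, computed_at)


def sync_to_online(rows: Iterable[OfflineRow], online: OnlineStore) -> OnlineStore:
    """Upsert the newest value per user into the online store."""
    for user_id, value, computed_at in rows:
        current = online.get(user_id)
        # Never let an older offline row overwrite a newer online value.
        if current is None or computed_at > current[1]:
            online[user_id] = (value, computed_at)
    return online


offline_rows = [
    (1, 0.25, datetime(2024, 1, 1)),
    (1, 0.40, datetime(2024, 1, 3)),
    (2, 0.75, datetime(2024, 1, 2)),
    (1, 0.10, datetime(2023, 12, 30)),  # older than what is already loaded
]
online_store = sync_to_online(offline_rows, {})
print(online_store[1][0], online_store[2][0])  # 0.4 0.75
```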


Deploy + query

Once you've defined your pipelines, you can rapidly deploy them to production with Chalk's CLI:

chalk apply

This creates a deployment of your project, which is served at a unique preview URL. You can promote this deployment to production, or perform QA workflows on your preview environment to make sure that your Chalk deployment performs as expected.

Once you promote your deployment to production, Chalk makes features available for low-latency online inference and offline training. Importantly, Chalk uses the exact same source code to serve temporally consistent training sets to data scientists and live feature values to models. This re-use ensures that feature values from online and offline contexts match, and it dramatically cuts development time.

Online inference

Chalk's online store and feature computation engine make it easy to query features with ultra-low latency, so you can use your feature pipelines to serve online inference use cases.

Integrating Chalk with your production application takes minutes via Chalk's simple REST API, called here through the Python client:

result = ChalkClient().query(
    input={
        User.name: "Katherine Johnson"
    },
    output=[User.fico_score],
    staleness={User.fico_score: "10m"},
)
result.get_feature_value(User.fico_score)

Features computed to serve online requests are also replicated to Chalk's offline store for historical performance tracking and training set generation.

Offline training

Data scientists can use Chalk's Jupyter integration to create datasets and train models. Datasets are stored and tracked so that they can be re-used by other modelers, and so that model provenance is tracked for audit and reproducibility.

X = ChalkClient().offline_query(
    input=labels[[User.uid, timestamp]],
    output=[
        User.returned_transactions_last_60,
        User.user_account_name_match_score,
        User.socure_score,
        User.identity.has_verified_phone,
        User.identity.is_voip_phone,
        User.identity.account_age_days,
        User.identity.email_age,
    ],
)

Chalk datasets are always "temporally consistent." This means that you can provide labels with different past timestamps and get historical features that represent what your application would have retrieved online at those past times. Temporal consistency ensures that your model training doesn't mix "future" and "past" data.
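
The core of temporal consistency is an "as of" lookup: for each label timestamp, use the latest feature value observed at or before that time and never a later one. Here is a minimal sketch of that lookup with the standard library; `value_as_of`, `Observation`, and the sample history are hypothetical and are not part of Chalk's API.

```python
from bisect import bisect_right
from datetime import datetime
from typing import List, Optional, Tuple

# Hypothetical feature history: (observed_at, value), sorted by observed_at.
Observation = Tuple[datetime, float]


def value_as_of(history: List[Observation], ts: datetime) -> Optional[float]:
    """Return the latest value observed at or before `ts`, or None if none exists."""
    times = [observed_at for observed_at, _ in history]
    idx = bisect_right(times, ts)  # number of observations with time <= ts
    return history[idx - 1][1] if idx else None


credit_score_history: List[Observation] = [
    (datetime(2024, 1, 1), 640.0),
    (datetime(2024, 2, 1), 700.0),
]
label_times = [
    datetime(2023, 12, 1),  # before any observation
    datetime(2024, 1, 15),  # between the two observations
    datetime(2024, 2, 10),  # after both observations
]
training_values = [value_as_of(credit_score_history, ts) for ts in label_times]
print(training_values)  # [None, 640.0, 700.0]
```

The label from January 15 gets 640.0 even though 700.0 is known today, which is exactly the "future" leakage that a point-in-time join prevents.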
