
Python SDK for Chalk

Project description

Chalk

Chalk enables innovative machine learning teams to focus on building the unique products and models that make their business stand out. Behind the scenes, Chalk handles data infrastructure with a best-in-class developer experience. Here's how it works:


Develop

Chalk makes it simple to develop feature pipelines for machine learning. Define Python functions using the libraries and tools you're familiar with instead of specialized DSLs. Chalk then orchestrates your functions into pipelines that execute in parallel on a Rust-based engine and coordinates the infrastructure required to compute features.

Features

To get started, define your features with Pydantic-inspired Python classes. You can define schemas, specify relationships, and add metadata to help your team share and re-use work.

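from datetime import date
from typing import Optional

from chalk.features import DataFrame, features, has_many

# Transaction is another @features class, defined elsewhere.
@features
class User:
    id: int
    full_name: str
    nickname: Optional[str]
    email: Optional[str]
    birthday: date
    credit_score: float
    datawarehouse_feature: float

    transactions: DataFrame[Transaction] = has_many(lambda: Transaction.user_id == User.id)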
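To make the idea of a class-based schema concrete, here is a minimal standard-library sketch of how a class decorator can turn type annotations into a feature schema. The `toy_features` decorator and the `__schema__` attribute are hypothetical names invented for this illustration; this is not how Chalk's `@features` is implemented.

```python
from datetime import date
from typing import Optional, get_type_hints


def toy_features(cls):
    """Hypothetical stand-in for a features decorator: record the schema.

    Each annotated field is namespaced as "<class>.<field>" and mapped to its
    declared type, which is roughly the metadata a feature store needs.
    """
    prefix = cls.__name__.lower()
    cls.__schema__ = {
        f"{prefix}.{name}": tp for name, tp in get_type_hints(cls).items()
    }
    return cls


@toy_features
class User:
    id: int
    full_name: str
    nickname: Optional[str]
    birthday: date


schema = User.__schema__
print(sorted(schema))
```

Keeping the schema in plain annotations means editors and type checkers can see the same field names and types that the decorator records.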

Resolvers

Next, tell Chalk how to compute your features. Chalk ingests data from your existing data stores, and lets you use Python to compute features with feature resolvers. Feature resolvers are declared with the decorators @online and @offline, and can depend on the outputs of other feature resolvers.

Resolvers make it easy to rapidly integrate a wide variety of data sources, join them together, and use them in your model.
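Because resolvers can depend on the outputs of other resolvers, something has to work out a valid execution order. The sketch below illustrates that idea with the standard library's `graphlib`. The `RESOLVERS` table, the feature names, and the `resolve` helper are all hypothetical, and the sketch runs sequentially, whereas Chalk's engine is described above as executing in parallel.

```python
from graphlib import TopologicalSorter

# Hypothetical registry: output feature -> (input feature names, function).
RESOLVERS = {
    "user.full_name": (("user.id",), lambda uid: f"User {uid}"),
    "user.email": (("user.id",), lambda uid: f"user{uid}@example.com"),
    "user.email_domain": (("user.email",), lambda email: email.split("@")[1]),
}


def resolve(requested: str, known: dict) -> dict:
    """Compute `requested` by running only the resolvers it depends on."""
    values = dict(known)
    # Collect the sub-graph reachable from the requested feature.
    graph, stack = {}, [requested]
    while stack:
        feature = stack.pop()
        if feature in values or feature in graph:
            continue
        inputs, _ = RESOLVERS[feature]
        graph[feature] = set(inputs)
        stack.extend(inputs)
    # static_order() yields every node after all of its dependencies.
    for feature in TopologicalSorter(graph).static_order():
        if feature in values:
            continue
        inputs, fn = RESOLVERS[feature]
        values[feature] = fn(*(values[name] for name in inputs))
    return values


result = resolve("user.email_domain", {"user.id": 7})
print(result["user.email_domain"])
```

Only the resolvers on the path to the requested feature run, so `user.full_name` is never computed in this example.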

SQL

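from chalk import online
from chalk.features import Features
from chalk.sql import PostgreSQLSource

pg = PostgreSQLSource()

@online
def get_user(uid: User.id) -> Features[User.full_name, User.email]:
    # :id is bound as a named parameter instead of being formatted into the SQL.
    return pg.query_string(
        "select email, full_name from users where id=:id",
        args=dict(id=uid)
    ).one()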
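The `:id` placeholder in the query above is a named bind parameter. As a rough analogy that runs without a database server, the following sketch uses the standard library's `sqlite3` against an in-memory table. The table contents and the `get_user_row` helper are made up for illustration and are not part of Chalk.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows can be converted to dicts by column name
conn.execute("create table users (id integer primary key, full_name text, email text)")
conn.execute("insert into users values (1, 'Katherine Johnson', 'kj@example.com')")


def get_user_row(uid: int) -> dict:
    """Fetch exactly one user row, binding :id instead of formatting the SQL."""
    rows = conn.execute(
        "select email, full_name from users where id=:id", {"id": uid}
    ).fetchall()
    if len(rows) != 1:
        raise LookupError(f"expected one row for id={uid}, got {len(rows)}")
    return dict(rows[0])


row = get_user_row(1)
print(row)
```

Binding the value keeps user input out of the SQL text, and the single-row check plays a role similar to asking for exactly one result.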

REST

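import requests
from chalk import online
from chalk.features import Features

@online
def get_socure_score(uid: User.id) -> Features[User.socure_score]:
    # The JSON key must be the string "id"; a bare `id` is Python's builtin function.
    return (
        requests.get("https://api.socure.com", json={
            "id": uid
        }).json()['socure_score']
    )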

Execute

Once you've defined your features and resolvers, Chalk orchestrates them into flexible pipelines that make training and executing models easy.

Chalk has built-in support for feature engineering workflows -- no need to manage Airflow or orchestrate complicated streaming flows. You can execute resolver pipelines with declarative caching, ingest batch data on a schedule, and easily make slow sources available online for low-latency serving.

Caching

Many data sources (like vendor APIs) are too slow for online use cases and/or charge a high dollar cost-per-call. Chalk lets you optimize latency and cost by defining declarative caching policies which are well-integrated throughout the system. You no longer have to manage Redis, Memcached, or DynamoDB, or spend time tuning cache-warming pipelines.

Add a caching policy with one line of code in your feature definition:

@features
class ExternalBankAccount:
-   balance: int
+   balance: int = feature(max_staleness="1d")

Optionally warm feature caches by executing resolvers on a schedule:

@online(cron="1d")
def fn(id: User.id) -> User.credit_score:
    return redshift.query(...).all()

Or override staleness tolerances at query time when you need fresher data for your models:

chalk.query(
    ...,
    output=[User.fraud_score],
    staleness={User.fraud_score: "1m"}
)
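The interaction between a default staleness policy and a query-time override can be illustrated with a small standard-library sketch. `parse_duration` and `StalenessCache` are hypothetical names, the duration grammar is an assumption limited to forms like "1d" and "10m", and the clock is passed in explicitly so the behavior is deterministic. This is a conceptual model, not Chalk's caching implementation.

```python
import re
from datetime import datetime, timedelta
from typing import Optional

_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_duration(text: str) -> timedelta:
    """Turn strings such as "1d" or "10m" into a timedelta."""
    match = re.fullmatch(r"(\d+)([smhd])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return timedelta(**{_UNITS[match.group(2)]: int(match.group(1))})


class StalenessCache:
    """Serve a cached value only while it is younger than the allowed staleness."""

    def __init__(self, compute, default_max_staleness: str):
        self._compute = compute
        self._default = parse_duration(default_max_staleness)
        self._entries = {}  # key -> (value, computed_at)

    def get(self, key, now: datetime, max_staleness: Optional[str] = None):
        limit = parse_duration(max_staleness) if max_staleness else self._default
        cached = self._entries.get(key)
        if cached is not None and now - cached[1] <= limit:
            return cached[0]
        value = self._compute(key)  # cache miss or entry too old: recompute
        self._entries[key] = (value, now)
        return value


calls = []


def slow_balance(account_id):
    """Pretend vendor call; each call returns a different value."""
    calls.append(account_id)
    return 100 * len(calls)


cache = StalenessCache(slow_balance, "1d")
t0 = datetime(2024, 1, 1, 12, 0)
first = cache.get("acct-1", t0)                        # miss, computes
second = cache.get("acct-1", t0 + timedelta(hours=2))  # hit, within 1d
third = cache.get("acct-1", t0 + timedelta(hours=2), max_staleness="1m")  # recomputes
```

The third lookup recomputes because the cached entry is two hours old, which exceeds the one-minute tolerance requested for that call.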

Batch ETL ingestion

Chalk also makes it simple to generate training sets from data warehouse sources: join data from services like S3, Redshift, BigQuery, and Snowflake (or other custom sources) with historical features computed online. Specify a cron schedule on an @offline resolver and Chalk automatically ingests data with support for incremental reads:

@offline(cron="1h")
def fn() -> Features[User.id, User.datawarehouse_feature]:
    return redshift.query(...).incremental()

Chalk makes this data available for point-in-time-correct dataset generation for data science use-cases. Every pipeline has built-in monitoring and alerting to ensure data quality and timeliness.
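One common way to implement incremental reads is a high-water mark: remember the newest timestamp already ingested and only fetch rows after it. The sketch below shows that technique with an in-memory list standing in for a warehouse table. `WAREHOUSE` and `IncrementalReader` are hypothetical, and whether Chalk tracks progress this way is an assumption.

```python
from datetime import datetime

# Hypothetical warehouse table: (user_id, feature_value, updated_at).
WAREHOUSE = [
    (1, 0.25, datetime(2024, 1, 1, 9, 0)),
    (2, 0.50, datetime(2024, 1, 1, 10, 0)),
]


class IncrementalReader:
    """Remember the newest timestamp seen and only return rows after it."""

    def __init__(self):
        self.high_water_mark = datetime.min

    def read_new_rows(self, table):
        new_rows = [row for row in table if row[2] > self.high_water_mark]
        if new_rows:
            self.high_water_mark = max(row[2] for row in new_rows)
        return new_rows


reader = IncrementalReader()
first_run = reader.read_new_rows(WAREHOUSE)   # both existing rows
WAREHOUSE.append((3, 0.75, datetime(2024, 1, 1, 11, 0)))
second_run = reader.read_new_rows(WAREHOUSE)  # only the newly added row
third_run = reader.read_new_rows(WAREHOUSE)   # nothing new
```

Each scheduled run then processes only the rows added since the previous run instead of rescanning the whole table.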

Reverse ETL

When your model needs to use features that are canonically stored in a high-latency data source (like a data warehouse), Chalk's Reverse ETL support makes it simple to bring those features online and serve them quickly.

Add a single line of code to an offline resolver, and Chalk constructs a managed reverse ETL pipeline for that data source:

@offline(offline_to_online_etl="5m")

Now data from slow offline data sources is automatically available for low-latency online serving.
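Conceptually, a reverse ETL job copies the newest value of each feature from the slow store into a structure that supports fast key lookups. The following is a minimal sketch under that assumption, with a dict standing in for the online store. `OFFLINE_ROWS` and `load_online_store` are hypothetical names, not Chalk APIs.

```python
from datetime import datetime

# Hypothetical offline rows: (user_id, feature_value, observed_at).
OFFLINE_ROWS = [
    (1, 0.25, datetime(2024, 1, 1, 9, 0)),
    (1, 0.40, datetime(2024, 1, 2, 9, 0)),
    (2, 0.50, datetime(2024, 1, 1, 10, 0)),
]


def load_online_store(rows):
    """Keep only the newest value per user so a lookup is a single dict access."""
    latest = {}
    for user_id, value, observed_at in rows:
        current = latest.get(user_id)
        if current is None or observed_at > current[1]:
            latest[user_id] = (value, observed_at)
    return {user_id: value for user_id, (value, _) in latest.items()}


online_store = load_online_store(OFFLINE_ROWS)
print(online_store)
```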


Deploy + query

Once you've defined your pipelines, you can rapidly deploy them to production with Chalk's CLI:

chalk apply

This creates a deployment of your project, which is served at a unique preview URL. You can promote this deployment to production, or perform QA workflows on your preview environment to make sure that your Chalk deployment performs as expected.

Once you promote your deployment to production, Chalk makes features available for low-latency online inference and offline training. Significantly, Chalk uses the exact same source code to serve temporally-consistent training sets to data scientists and live feature values to models. This re-use ensures that feature values from online and offline contexts match and dramatically cuts development time.

Online inference

Chalk's online store and feature computation engine make it easy to query features with ultra-low latency, so you can use your feature pipelines to serve online inference use-cases.

Integrating Chalk with your production application takes minutes via Chalk's simple REST API:

result = ChalkClient().query(
    input={
        User.name: "Katherine Johnson"
    },
    output=[User.fico_score],
    staleness={User.fico_score: "10m"},
)
result.get_feature_value(User.fico_score)

Features computed to serve online requests are also replicated to Chalk's offline store for historical performance tracking and training set generation.

Offline training

Data scientists can use Chalk's Jupyter integration to create datasets and train models. Datasets are stored and tracked so that they can be re-used by other modelers, and so that model provenance is tracked for audit and reproducibility.

X = ChalkClient().offline_query(
    input=labels[[User.uid, timestamp]],
    output=[
        User.returned_transactions_last_60,
        User.user_account_name_match_score,
        User.socure_score,
        User.identity.has_verified_phone,
        User.identity.is_voip_phone,
        User.identity.account_age_days,
        User.identity.email_age,
    ],
)

Chalk datasets are always "temporally consistent." This means that you can provide labels with different past timestamps and get historical features that represent what your application would have retrieved online at those past times. Temporal consistency ensures that your model training doesn't mix "future" and "past" data.
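Temporal consistency amounts to an "as of" lookup: for each label timestamp, take the latest feature value observed at or before that time and never a later one. Here is a small sketch of that lookup using the standard library's `bisect`. The history data and the `value_as_of` helper are hypothetical and only illustrate the concept.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical history of one feature for one user: (observed_at, value), sorted by time.
CREDIT_SCORE_HISTORY = [
    (datetime(2024, 1, 1), 640.0),
    (datetime(2024, 2, 1), 655.0),
    (datetime(2024, 3, 1), 700.0),
]


def value_as_of(history, ts):
    """Return the latest value observed at or before `ts`, or None if there is none."""
    times = [observed_at for observed_at, _ in history]
    index = bisect_right(times, ts)
    return history[index - 1][1] if index else None


labels = [datetime(2023, 12, 15), datetime(2024, 2, 10), datetime(2024, 3, 1)]
training_values = [value_as_of(CREDIT_SCORE_HISTORY, ts) for ts in labels]
print(training_values)
```

The label dated 2024-02-10 receives 655.0 rather than the later 700.0, so no "future" observation leaks into that training row.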



