Python SDK for Chalk

Project description

Chalk

Chalk enables innovative machine learning teams to focus on building the unique products and models that make their business stand out. Behind the scenes, Chalk handles data infrastructure with a best-in-class developer experience. Here's how it works:


Develop

Chalk makes it simple to develop feature pipelines for machine learning. Define Python functions using the libraries and tools you're familiar with instead of specialized DSLs. Chalk then orchestrates your functions into pipelines that execute in parallel on a Rust-based engine and coordinates the infrastructure required to compute features.

Features

To get started, define your features with Pydantic-inspired Python classes. You can define schemas, specify relationships, and add metadata to help your team share and re-use work.

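from datetime import date
from typing import Optional

from chalk.features import DataFrame, features, has_many

@features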
class User:
    id: int
    full_name: str
    nickname: Optional[str]
    email: Optional[str]
    birthday: date
    credit_score: float
    datawarehouse_feature: float

    transactions: DataFrame[Transaction] = has_many(lambda: Transaction.user_id == User.id)

Resolvers

Next, tell Chalk how to compute your features. Chalk ingests data from your existing data stores, and lets you use Python to compute features with feature resolvers. Feature resolvers are declared with the decorators @online and @offline, and can depend on the outputs of other feature resolvers.

Resolvers make it easy to rapidly integrate a wide variety of data sources, join them together, and use them in your model.
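
To make the idea of resolvers depending on one another concrete, here is a minimal standard-library sketch. It is not Chalk's engine or API: the `resolver` decorator, the `RESOLVERS` registry, the `resolve` function, and the feature names are all hypothetical, and the sketch runs sequentially with no cycle detection.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical registry: output feature name -> (input feature names, function).
RESOLVERS: Dict[str, Tuple[List[str], Callable[..., object]]] = {}


def resolver(output: str, inputs: List[str]):
    """Register a function that computes `output` from `inputs`."""
    def decorate(fn):
        RESOLVERS[output] = (inputs, fn)
        return fn
    return decorate


@resolver(output="user.email", inputs=["user.id"])
def get_email(uid: int) -> str:
    # Stand-in for a database lookup.
    return f"user{uid}@example.com"


@resolver(output="user.email_domain", inputs=["user.email"])
def get_email_domain(email: str) -> str:
    # Depends on the output of get_email.
    return email.split("@", 1)[1]


def resolve(feature: str, known: Dict[str, object]) -> object:
    """Compute `feature`, recursively resolving its inputs first."""
    if feature in known:
        return known[feature]
    if feature not in RESOLVERS:
        raise KeyError(f"no resolver produces {feature!r}")
    inputs, fn = RESOLVERS[feature]
    args = [resolve(name, known) for name in inputs]
    known[feature] = fn(*args)  # memoize so each resolver runs at most once
    return known[feature]


domain = resolve("user.email_domain", {"user.id": 7})
print(domain)  # example.com
```

The caller asks only for the feature it wants; the chain of resolvers needed to produce it is discovered from the declared inputs and outputs.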

SQL

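from chalk import online
from chalk.features import Features
from chalk.sql import PostgreSQLSource

pg = PostgreSQLSource()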

@online
def get_user(uid: User.id) -> Features[User.full_name, User.email]:
    return pg.query_string(
        "select email, full_name from users where id=:id",
        args=dict(id=uid)
    ).one()

REST

import requests

@online
def get_socure_score(uid: User.id) -> Features[User.socure_score]:
    return (
        requests.get("https://api.socure.com", json={
            "id": uid
        }).json()['socure_score']
    )

Execute

Once you've defined your features and resolvers, Chalk orchestrates them into flexible pipelines that make training and executing models easy.

Chalk has built-in support for feature engineering workflows -- no need to manage Airflow or orchestrate complicated streaming flows. You can execute resolver pipelines with declarative caching, ingest batch data on a schedule, and easily make slow sources available online for low-latency serving.

Caching

Many data sources (like vendor APIs) are too slow for online use cases, charge a high cost per call, or both. Chalk lets you optimize latency and cost by defining declarative caching policies that are integrated throughout the system. You no longer have to manage Redis, Memcached, or DynamoDB, or spend time tuning cache-warming pipelines.

Add a caching policy with one line of code in your feature definition:

@features
class ExternalBankAccount:
-   balance: int
+   balance: int = feature(max_staleness="1d")

Optionally warm feature caches by executing resolvers on a schedule:

@online(cron="1d")
def fn(id: User.id) -> User.credit_score:
  return redshift.query(...).all()

Or override staleness tolerances at query time when you need fresher data for your models:

ChalkClient().query(
    ...,
    output=[User.fraud_score],
    staleness={User.fraud_score: "1m"}
)
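
The staleness rule described in this section can be pictured with a small standard-library sketch. This is an illustration under stated assumptions, not Chalk's implementation: `StalenessCache` and `parse_duration` are hypothetical names, and only single-unit durations such as "1d" or "10m" are handled. A cached value is reused when its age is within the tolerance, and a per-call tolerance can override the default.

```python
from datetime import datetime, timedelta
from typing import Callable, Dict, Optional, Tuple

_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_duration(text: str) -> timedelta:
    """Turn strings like '1d' or '10m' into a timedelta (single unit only)."""
    value, unit = int(text[:-1]), text[-1]
    return timedelta(**{_UNITS[unit]: value})


class StalenessCache:
    """Hypothetical cache that reuses values no older than a tolerance."""

    def __init__(self, compute: Callable[[int], float], max_staleness: str):
        self._compute = compute
        self._default = parse_duration(max_staleness)
        self._store: Dict[int, Tuple[datetime, float]] = {}
        self.calls = 0  # number of times the slow source was hit

    def get(self, key: int, now: datetime, max_staleness: Optional[str] = None) -> float:
        limit = parse_duration(max_staleness) if max_staleness else self._default
        cached = self._store.get(key)
        if cached is not None and now - cached[0] <= limit:
            return cached[1]  # fresh enough: skip the slow source
        self.calls += 1
        value = self._compute(key)
        self._store[key] = (now, value)
        return value


cache = StalenessCache(compute=lambda uid: uid * 1.5, max_staleness="1d")
t0 = datetime(2024, 1, 1, 12, 0)
cache.get(1, now=t0)                                            # miss: computes
cache.get(1, now=t0 + timedelta(hours=2))                       # hit: 2h old, within 1d
cache.get(1, now=t0 + timedelta(hours=2), max_staleness="1m")   # miss: too old for 1m
print(cache.calls)  # 2
```

Passing `now` explicitly keeps the example deterministic; a real cache would read the clock itself.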

Batch ETL ingestion

Chalk also makes it simple to generate training sets from data warehouse sources: join data from services like S3, Redshift, BigQuery, and Snowflake (or other custom sources) with historical features computed online. Specify a cron schedule on an @offline resolver and Chalk automatically ingests data with support for incremental reads:

@offline(cron="1h")
def fn() -> Features[User.id, User.datawarehouse_feature]:
  return redshift.query(...).incremental()
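
One common way to implement incremental reads is a high-water mark: each run fetches only rows newer than the last timestamp it saw. The sketch below shows that general technique with the standard-library `sqlite3` module. It is not a description of how Chalk's `.incremental()` works internally; the `events` table and the `read_incremental` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, updated_at TEXT)"
)
conn.executemany(
    "INSERT INTO events (user_id, updated_at) VALUES (?, ?)",
    [(1, "2024-01-01T00:00:00"), (2, "2024-01-01T01:00:00")],
)


def read_incremental(conn, watermark):
    """Return rows newer than `watermark` and the advanced watermark."""
    rows = conn.execute(
        "SELECT user_id, updated_at FROM events "
        "WHERE updated_at > ? ORDER BY updated_at",
        (watermark,),
    ).fetchall()
    # ISO-8601 strings sort chronologically, so the last row is the newest.
    new_watermark = rows[-1][1] if rows else watermark
    return rows, new_watermark


first, mark = read_incremental(conn, "")       # initial run reads everything
conn.execute(
    "INSERT INTO events (user_id, updated_at) VALUES (?, ?)",
    (3, "2024-01-01T02:00:00"),
)
second, mark = read_incremental(conn, mark)    # next run reads only the new row
print(len(first), len(second))  # 2 1
```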

Chalk makes this data available for point-in-time-correct dataset generation for data science use-cases. Every pipeline has built-in monitoring and alerting to ensure data quality and timeliness.

Reverse ETL

When your model needs to use features that are canonically stored in a high-latency data source (like a data warehouse), Chalk's Reverse ETL support makes it simple to bring those features online and serve them quickly.

Add a single line of code to an offline resolver, and Chalk constructs a managed reverse ETL pipeline for that data source:

@offline(offline_to_online_etl="5m")

Now data from slow offline data sources is automatically available for low-latency online serving.


Deploy + query

Once you've defined your pipelines, you can rapidly deploy them to production with Chalk's CLI:

chalk apply

This creates a deployment of your project, which is served at a unique preview URL. You can promote this deployment to production, or perform QA workflows on your preview environment to make sure that your Chalk deployment performs as expected.

Once you promote your deployment to production, Chalk makes features available for low-latency online inference and offline training. Significantly, Chalk uses the exact same source code to serve temporally-consistent training sets to data scientists and live feature values to models. This re-use ensures that feature values from online and offline contexts match and dramatically cuts development time.

Online inference

Chalk's online store and feature computation engine make it easy to query features with very low latency, so you can use your feature pipelines to serve online inference use cases.

Integrating Chalk with your production application takes minutes via Chalk's simple REST API or, as shown here, the Python client:

result = ChalkClient().query(
    input={
        User.name: "Katherine Johnson"
    },
    output=[User.fico_score],
    staleness={User.fico_score: "10m"},
)
result.get_feature_value(User.fico_score)

Features computed to serve online requests are also replicated to Chalk's offline store for historical performance tracking and training set generation.

Offline training

Data scientists can use Chalk's Jupyter integration to create datasets and train models. Datasets are stored and tracked so that they can be re-used by other modelers, and so that model provenance is tracked for audit and reproducibility.

X = ChalkClient().offline_query(
    input=labels[[User.uid, timestamp]],
    output=[
        User.returned_transactions_last_60,
        User.user_account_name_match_score,
        User.socure_score,
        User.identity.has_verified_phone,
        User.identity.is_voip_phone,
        User.identity.account_age_days,
        User.identity.email_age,
    ],
)

Chalk datasets are always "temporally consistent." This means that you can provide labels with different past timestamps and get historical features that represent what your application would have retrieved online at those past times. Temporal consistency ensures that your model training doesn't mix "future" and "past" data.
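
Temporal consistency amounts to an "as of" lookup: for each label timestamp, take the latest feature value observed at or before that time and never a later one. The following standard-library sketch illustrates the idea. The `history` log and `value_as_of` helper are hypothetical and say nothing about how Chalk's offline store is built.

```python
from bisect import bisect_right
from datetime import datetime
from typing import Dict, List, Optional, Tuple

# Hypothetical observation log: per user, (observed_at, value) sorted by time.
history: Dict[int, List[Tuple[datetime, float]]] = {
    1: [
        (datetime(2024, 1, 1), 600.0),
        (datetime(2024, 2, 1), 640.0),
        (datetime(2024, 3, 1), 710.0),
    ],
}


def value_as_of(user_id: int, ts: datetime) -> Optional[float]:
    """Latest value observed at or before `ts`, or None if nothing was known yet."""
    observations = history.get(user_id, [])
    times = [observed_at for observed_at, _ in observations]
    idx = bisect_right(times, ts)  # count of observations with time <= ts
    return observations[idx - 1][1] if idx else None


# Labels carry their own timestamps; each gets the value known at that time.
labels = [(1, datetime(2024, 2, 15)), (1, datetime(2023, 12, 31))]
rows = [value_as_of(uid, ts) for uid, ts in labels]
print(rows)  # [640.0, None]
```

The label from mid-February sees the February value rather than the March one, and the label that predates all observations gets no value instead of leaking a future one.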

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

chalkpy-2.123.19.tar.gz (1.7 MB), Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

chalkpy-2.123.19-cp313-cp313-win_amd64.whl (3.6 MB), CPython 3.13, Windows x86-64
chalkpy-2.123.19-cp313-cp313-manylinux_2_28_x86_64.whl (3.9 MB), CPython 3.13, manylinux: glibc 2.28+ x86-64
chalkpy-2.123.19-cp313-cp313-manylinux_2_28_aarch64.whl (3.8 MB), CPython 3.13, manylinux: glibc 2.28+ ARM64
chalkpy-2.123.19-cp313-cp313-macosx_11_0_arm64.whl (3.7 MB), CPython 3.13, macOS 11.0+ ARM64
chalkpy-2.123.19-cp313-cp313-macosx_10_13_x86_64.whl (3.8 MB), CPython 3.13, macOS 10.13+ x86-64
chalkpy-2.123.19-cp312-cp312-win_amd64.whl (3.6 MB), CPython 3.12, Windows x86-64
chalkpy-2.123.19-cp312-cp312-manylinux_2_28_x86_64.whl (3.9 MB), CPython 3.12, manylinux: glibc 2.28+ x86-64
chalkpy-2.123.19-cp312-cp312-manylinux_2_28_aarch64.whl (3.8 MB), CPython 3.12, manylinux: glibc 2.28+ ARM64
chalkpy-2.123.19-cp312-cp312-macosx_11_0_arm64.whl (3.7 MB), CPython 3.12, macOS 11.0+ ARM64
chalkpy-2.123.19-cp312-cp312-macosx_10_13_x86_64.whl (3.8 MB), CPython 3.12, macOS 10.13+ x86-64
chalkpy-2.123.19-cp311-cp311-win_amd64.whl (3.6 MB), CPython 3.11, Windows x86-64
chalkpy-2.123.19-cp311-cp311-manylinux_2_28_x86_64.whl (3.9 MB), CPython 3.11, manylinux: glibc 2.28+ x86-64
chalkpy-2.123.19-cp311-cp311-manylinux_2_28_aarch64.whl (3.8 MB), CPython 3.11, manylinux: glibc 2.28+ ARM64
chalkpy-2.123.19-cp311-cp311-macosx_11_0_arm64.whl (3.7 MB), CPython 3.11, macOS 11.0+ ARM64
chalkpy-2.123.19-cp311-cp311-macosx_10_13_x86_64.whl (3.8 MB), CPython 3.11, macOS 10.13+ x86-64
chalkpy-2.123.19-cp310-cp310-win_amd64.whl (3.6 MB), CPython 3.10, Windows x86-64
chalkpy-2.123.19-cp310-cp310-manylinux_2_28_x86_64.whl (3.9 MB), CPython 3.10, manylinux: glibc 2.28+ x86-64
chalkpy-2.123.19-cp310-cp310-manylinux_2_28_aarch64.whl (3.8 MB), CPython 3.10, manylinux: glibc 2.28+ ARM64
chalkpy-2.123.19-cp310-cp310-macosx_11_0_arm64.whl (3.7 MB), CPython 3.10, macOS 11.0+ ARM64
chalkpy-2.123.19-cp310-cp310-macosx_10_13_x86_64.whl (3.8 MB), CPython 3.10, macOS 10.13+ x86-64
