Python SDK for Chalk

Project description

Chalk

Chalk enables innovative machine learning teams to focus on building the unique products and models that make their business stand out. Behind the scenes, Chalk seamlessly handles data infrastructure with a best-in-class developer experience. Here's how it works:


Develop

Chalk makes it simple to develop feature pipelines for machine learning. Define Python functions using the libraries and tools you're familiar with instead of specialized DSLs. Chalk then orchestrates your functions into pipelines that execute in parallel on a Rust-based engine and coordinates the infrastructure required to compute features.

Features

To get started, define your features with Pydantic-inspired Python classes. You can define schemas, specify relationships, and add metadata to help your team share and re-use work.

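from datetime import date
from typing import Optional

from chalk.features import DataFrame, features, has_many

@features
class User:
    id: int
    full_name: str
    nickname: Optional[str]
    email: Optional[str]
    birthday: date
    credit_score: float
    datawarehouse_feature: float

    # Transaction is another @features class, defined elsewhere.
    transactions: DataFrame[Transaction] = has_many(lambda: Transaction.user_id == User.id)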

Resolvers

Next, tell Chalk how to compute your features. Chalk ingests data from your existing data stores, and lets you use Python to compute features with feature resolvers. Feature resolvers are declared with the decorators @online and @offline, and can depend on the outputs of other feature resolvers.

Resolvers make it easy to rapidly integrate a wide variety of data sources, join them together, and use them in your model.
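
To make the idea of resolvers depending on other resolvers' outputs concrete, here is a minimal, library-free sketch. It is not Chalk's engine or API: the RESOLVERS registry, the plan function, and the feature names are hypothetical. It only shows that declared inputs and outputs are enough to derive an execution order.

```python
from graphlib import TopologicalSorter

# Hypothetical registry: resolver name -> (input features, output features).
RESOLVERS = {
    "get_user": ({"user.id"}, {"user.full_name", "user.email"}),
    "get_email_domain": ({"user.email"}, {"user.email_domain"}),
    "get_risk": ({"user.email_domain", "user.full_name"}, {"user.risk"}),
}


def plan(resolvers):
    """Return resolver names in an order that satisfies their inputs."""
    # Map each feature to the resolver that produces it.
    producer = {
        feature: name
        for name, (_, outputs) in resolvers.items()
        for feature in outputs
    }
    # A resolver depends on whichever resolvers produce its inputs.
    # Inputs with no producer (such as "user.id") are supplied by the caller.
    graph = {
        name: {producer[f] for f in inputs if f in producer}
        for name, (inputs, _) in resolvers.items()
    }
    return list(TopologicalSorter(graph).static_order())


order = plan(RESOLVERS)
print(order)
```

Resolvers that do not depend on each other could run in parallel. This sketch only produces one valid sequential order.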

SQL

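from chalk import online
from chalk.features import Features
from chalk.sql import PostgreSQLSource

pg = PostgreSQLSource()

@online
def get_user(uid: User.id) -> Features[User.full_name, User.email]:
    # Look up the user's row by primary key.
    return pg.query_string(
        "select email, full_name from users where id=:id",
        args=dict(id=uid)
    ).one()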

REST

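import requests
from chalk import online
from chalk.features import Features

@online
def get_socure_score(uid: User.id) -> Features[User.socure_score]:
    return (
        # The JSON key must be the string "id", not the builtin `id`.
        requests.get("https://api.socure.com", json={
            "id": uid
        }).json()['socure_score']
    )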

Execute

Once you've defined your features and resolvers, Chalk orchestrates them into flexible pipelines that make training and executing models easy.

Chalk has built-in support for feature engineering workflows -- no need to manage Airflow or orchestrate complicated streaming flows. You can execute resolver pipelines with declarative caching, ingest batch data on a schedule, and easily make slow sources available online for low-latency serving.

Caching

Many data sources (like vendor APIs) are too slow for online use cases and/or charge a high dollar cost-per-call. Chalk lets you optimize latency and cost by defining declarative caching policies which are well-integrated throughout the system. You no longer have to manage Redis, Memcached, DynamoDB, or spend time tuning cache-warming pipelines.

Add a caching policy with one line of code in your feature definition:

@features
class ExternalBankAccount:
-   balance: int
+   balance: int = feature(max_staleness="1d")

Optionally warm feature caches by executing resolvers on a schedule:

@online(cron="1d")
def fn(id: User.id) -> User.credit_score:
  return redshift.query(...).all()

Or override staleness tolerances at query time when you need fresher data for your models:

chalk.query(
    ...,
    output=[User.fraud_score],
    staleness={User.fraud_score: "1m"}
)
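
The staleness semantics described in this section can be pictured with a small toy cache. This is only a sketch under stated assumptions, not Chalk's implementation: StalenessCache, parse_duration, and slow_vendor_call are hypothetical names, and the duration parser accepts only single-unit strings such as "1m" or "1d".

```python
import re
import time

_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_duration(text):
    """Convert a string such as "10m" or "1d" into seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


class StalenessCache:
    """Toy cache that recomputes a value once it is older than allowed."""

    def __init__(self, compute, max_staleness, clock=time.time):
        self._compute = compute
        self._default = parse_duration(max_staleness)
        self._clock = clock
        self._entries = {}  # key -> (value, time it was computed)

    def get(self, key, max_staleness=None):
        # A per-call tolerance overrides the default policy.
        if max_staleness is None:
            limit = self._default
        else:
            limit = parse_duration(max_staleness)
        now = self._clock()
        cached = self._entries.get(key)
        if cached is not None and now - cached[1] <= limit:
            return cached[0]
        value = self._compute(key)
        self._entries[key] = (value, now)
        return value


# Usage with a fake clock so the example does not need to sleep.
now = [0.0]
lookups = []


def slow_vendor_call(key):
    lookups.append(key)
    return f"score-for-{key}"


cache = StalenessCache(slow_vendor_call, "1d", clock=lambda: now[0])
cache.get("user-1")                      # computed
now[0] += 3600                           # one hour later
cache.get("user-1")                      # served from the cache
cache.get("user-1", max_staleness="1m")  # too old for this call, recomputed
```

The default tolerance plays the role of the policy on the feature definition, and the per-call argument plays the role of the query-time override.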

Batch ETL ingestion

Chalk also makes it simple to generate training sets from data warehouse sources -- join data from services like S3, Redshift, BigQuery, Snowflake (or other custom sources) with historical features computed online. Specify a cron schedule on an @offline resolver and Chalk automatically ingests data with support for incremental reads:

@offline(cron="1h")
def fn() -> Features[User.id, User.datawarehouse_feature]:
  return redshift.query(...).incremental()

Chalk makes this data available for point-in-time-correct dataset generation for data science use-cases. Every pipeline has built-in monitoring and alerting to ensure data quality and timeliness.
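
One common way to picture the incremental reads mentioned above is a high-water mark: each scheduled run reads only rows newer than the latest timestamp seen by the previous run. The sketch below illustrates that general idea with plain Python. The read_incremental function and the sample table are hypothetical, and nothing here describes how Chalk tracks its own state.

```python
from datetime import datetime


def read_incremental(rows, high_water_mark):
    """Return rows updated after the mark, plus the new mark.

    `rows` is an iterable of (updated_at, payload) tuples.
    """
    fresh = [row for row in rows if row[0] > high_water_mark]
    new_mark = max((row[0] for row in fresh), default=high_water_mark)
    return fresh, new_mark


table = [
    (datetime(2024, 1, 1, 9), {"user_id": 1, "value": 0.2}),
    (datetime(2024, 1, 1, 10), {"user_id": 2, "value": 0.7}),
]

# First scheduled run: everything is new.
batch, mark = read_incremental(table, datetime.min)

# A later run only sees rows that arrived since the previous mark.
table.append((datetime(2024, 1, 1, 11), {"user_id": 3, "value": 0.4}))
next_batch, mark = read_incremental(table, mark)
```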

Reverse ETL

When your model needs to use features that are canonically stored in a high-latency data source (like a data warehouse), Chalk's Reverse ETL support makes it simple to bring those features online and serve them quickly.

Add a single line of code to an offline resolver, and Chalk constructs a managed reverse ETL pipeline for that data source:

@offline(offline_to_online_etl="5m")

Now data from slow offline data sources is automatically available for low-latency online serving.


Deploy + query

Once you've defined your pipelines, you can rapidly deploy them to production with Chalk's CLI:

chalk apply

This creates a deployment of your project, which is served at a unique preview URL. You can promote this deployment to production, or perform QA workflows on your preview environment to make sure that your Chalk deployment performs as expected.

Once you promote your deployment to production, Chalk makes features available for low-latency online inference and offline training. Significantly, Chalk uses the exact same source code to serve temporally-consistent training sets to data scientists and live feature values to models. This re-use ensures that feature values from online and offline contexts match and dramatically cuts development time.

Online inference

Chalk's online store & feature computation engine make it easy to query features with ultra-low latency, so you can use your feature pipelines to serve online inference use-cases.

Integrating Chalk with your production application takes minutes via Chalk's simple REST API:

from chalk.client import ChalkClient

result = ChalkClient().query(
    input={
        User.name: "Katherine Johnson"
    },
    output=[User.fico_score],
    staleness={User.fico_score: "10m"},
)
result.get_feature_value(User.fico_score)

Features computed to serve online requests are also replicated to Chalk's offline store for historical performance tracking and training set generation.

Offline training

Data scientists can use Chalk's Jupyter integration to create datasets and train models. Datasets are stored and tracked so that they can be re-used by other modelers, and so that model provenance is tracked for audit and reproducibility.

X = ChalkClient().offline_query(
    input=labels[[User.uid, timestamp]],
    output=[
        User.returned_transactions_last_60,
        User.user_account_name_match_score,
        User.socure_score,
        User.identity.has_verified_phone,
        User.identity.is_voip_phone,
        User.identity.account_age_days,
        User.identity.email_age,
    ],
)

Chalk datasets are always "temporally consistent." This means that you can provide labels with different past timestamps and get historical features that represent what your application would have retrieved online at those past times. Temporal consistency ensures that your model training doesn't mix "future" and "past" data.
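
Temporal consistency boils down to an "as-of" lookup: for each label timestamp, take the most recent feature value observed at or before that time and never a later one. The sketch below shows that lookup with the standard library. The value_as_of helper, the integer timestamps, and the sample data are hypothetical and are not Chalk's implementation.

```python
from bisect import bisect_right


def value_as_of(history, timestamp):
    """Return the latest value observed at or before `timestamp`.

    `history` is a list of (observed_at, value) pairs sorted by time.
    Returns None when nothing had been observed yet.
    """
    times = [observed_at for observed_at, _ in history]
    index = bisect_right(times, timestamp)
    return history[index - 1][1] if index else None


# Hypothetical history of one user's score, keyed by integer timestamps.
score_history = [(10, 0.1), (20, 0.5), (30, 0.9)]

# Each label carries the time at which the prediction would have been made.
labels = [(15, "ok"), (30, "fraud"), (5, "ok")]
training_rows = [(value_as_of(score_history, ts), label) for ts, label in labels]
```

The label at time 15 is paired with the value observed at time 10, not the later value from time 20, which is what keeps "future" data out of the training row.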

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

chalkpy-2.123.1.tar.gz (1.6 MB): Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.
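
As a rough guide to the file name format, a wheel name is a dash-separated sequence: distribution, version, an optional build tag, Python tag, ABI tag, and platform tag. The helper below is an illustrative sketch (parse_wheel_name is a hypothetical name, not part of any packaging library) that splits a name into those parts.

```python
def parse_wheel_name(filename):
    """Split a wheel file name into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build_tag = None
    elif len(parts) == 6:
        name, version, build_tag, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected number of fields in {filename!r}")
    return {
        "distribution": name,
        "version": version,
        "build": build_tag,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


info = parse_wheel_name("chalkpy-2.123.1-cp313-cp313-manylinux_2_28_x86_64.whl")
```

For real tooling, an established packaging library is a better choice than hand-rolled parsing. The sketch is only meant to make the naming scheme readable.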

  • chalkpy-2.123.1-cp313-cp313-win_amd64.whl (3.5 MB): CPython 3.13, Windows x86-64
  • chalkpy-2.123.1-cp313-cp313-manylinux_2_28_x86_64.whl (3.8 MB): CPython 3.13, manylinux: glibc 2.28+ x86-64
  • chalkpy-2.123.1-cp313-cp313-manylinux_2_28_aarch64.whl (3.8 MB): CPython 3.13, manylinux: glibc 2.28+ ARM64
  • chalkpy-2.123.1-cp313-cp313-macosx_11_0_arm64.whl (3.6 MB): CPython 3.13, macOS 11.0+ ARM64
  • chalkpy-2.123.1-cp313-cp313-macosx_10_13_x86_64.whl (3.7 MB): CPython 3.13, macOS 10.13+ x86-64
  • chalkpy-2.123.1-cp312-cp312-win_amd64.whl (3.5 MB): CPython 3.12, Windows x86-64
  • chalkpy-2.123.1-cp312-cp312-manylinux_2_28_x86_64.whl (3.8 MB): CPython 3.12, manylinux: glibc 2.28+ x86-64
  • chalkpy-2.123.1-cp312-cp312-manylinux_2_28_aarch64.whl (3.8 MB): CPython 3.12, manylinux: glibc 2.28+ ARM64
  • chalkpy-2.123.1-cp312-cp312-macosx_11_0_arm64.whl (3.6 MB): CPython 3.12, macOS 11.0+ ARM64
  • chalkpy-2.123.1-cp312-cp312-macosx_10_13_x86_64.whl (3.7 MB): CPython 3.12, macOS 10.13+ x86-64
  • chalkpy-2.123.1-cp311-cp311-win_amd64.whl (3.5 MB): CPython 3.11, Windows x86-64
  • chalkpy-2.123.1-cp311-cp311-manylinux_2_28_x86_64.whl (3.8 MB): CPython 3.11, manylinux: glibc 2.28+ x86-64
  • chalkpy-2.123.1-cp311-cp311-manylinux_2_28_aarch64.whl (3.8 MB): CPython 3.11, manylinux: glibc 2.28+ ARM64
  • chalkpy-2.123.1-cp311-cp311-macosx_11_0_arm64.whl (3.6 MB): CPython 3.11, macOS 11.0+ ARM64
  • chalkpy-2.123.1-cp311-cp311-macosx_10_13_x86_64.whl (3.7 MB): CPython 3.11, macOS 10.13+ x86-64
  • chalkpy-2.123.1-cp310-cp310-win_amd64.whl (3.5 MB): CPython 3.10, Windows x86-64
  • chalkpy-2.123.1-cp310-cp310-manylinux_2_28_x86_64.whl (3.8 MB): CPython 3.10, manylinux: glibc 2.28+ x86-64
  • chalkpy-2.123.1-cp310-cp310-manylinux_2_28_aarch64.whl (3.8 MB): CPython 3.10, manylinux: glibc 2.28+ ARM64
  • chalkpy-2.123.1-cp310-cp310-macosx_11_0_arm64.whl (3.6 MB): CPython 3.10, macOS 11.0+ ARM64
  • chalkpy-2.123.1-cp310-cp310-macosx_10_13_x86_64.whl (3.7 MB): CPython 3.10, macOS 10.13+ x86-64

File details

Details for the file chalkpy-2.123.1.tar.gz.

File metadata

  • Download URL: chalkpy-2.123.1.tar.gz
  • Upload date:
  • Size: 1.6 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for chalkpy-2.123.1.tar.gz
Algorithm Hash digest
SHA256 5d8eeb45fea2c4fe0093b51d357dde1cafa4580b0d95bd2fde64fe0ec4e64dc0
MD5 7de2bc50c0c6d13c44fa999058bb982c
BLAKE2b-256 9162baaae13d01fac61b17f511f9607eff689ac770bc603fd07b7f79f3212111

See more details on using hashes here.
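
A downloaded file can be checked against its published SHA256 digest with the standard library. This is a minimal sketch: sha256_of and matches_published_hash are hypothetical helper names, and the path passed in is assumed to point at a file you have already downloaded.

```python
import hashlib
import hmac


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's digest with the expected hex string."""
    return hmac.compare_digest(sha256_of(path), expected_hex.lower())
```

For example, matches_published_hash("chalkpy-2.123.1.tar.gz", "5d8eeb45fea2c4fe0093b51d357dde1cafa4580b0d95bd2fde64fe0ec4e64dc0") should return True for an intact copy of the source distribution listed above.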

File details

Details for the file chalkpy-2.123.1-cp313-cp313-win_amd64.whl.

File metadata

  • Download URL: chalkpy-2.123.1-cp313-cp313-win_amd64.whl
  • Upload date:
  • Size: 3.5 MB
  • Tags: CPython 3.13, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for chalkpy-2.123.1-cp313-cp313-win_amd64.whl
Algorithm Hash digest
SHA256 b921d1a7dbe73ded8ee3218f2ac2894c8c693a9252f8b8ca81a50f5bb8734a2c
MD5 b9727b3026812f551c26177af7d05077
BLAKE2b-256 fbcc1e079d5997a64ca2fea8fba55daa2abb1842d108ce9e697145e29d00ee4b

File details

Details for the file chalkpy-2.123.1-cp313-cp313-manylinux_2_28_x86_64.whl.

File hashes

Hashes for chalkpy-2.123.1-cp313-cp313-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 129fa551072e80e27d91a81fc939ca2358634073c625cc1f98c6b6d45dd3f3f2
MD5 a8a597e6244e8eccaca5e769cf1c106c
BLAKE2b-256 c1d305a53640288af42deee347f4ea08c55235ea45385617872aef8690a21bb7

File details

Details for the file chalkpy-2.123.1-cp313-cp313-manylinux_2_28_aarch64.whl.

File hashes

Hashes for chalkpy-2.123.1-cp313-cp313-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 ff326dd6bbe9187ab36fe8fc08b6cdd4946579b725a419da80978b94df8a7e4e
MD5 c84a847df01787051db18c822b38b0d4
BLAKE2b-256 0fd1aa64a1dd48456423cdef910f7347fe0d8d1702f89653d404e9a5e0f553ba

File details

Details for the file chalkpy-2.123.1-cp313-cp313-macosx_11_0_arm64.whl.

File hashes

Hashes for chalkpy-2.123.1-cp313-cp313-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 6c7fe51535ba9e5b7d899b37409f732909caf31a4f82c550bb3fade56d9fcf25
MD5 ad5ae9c89a8ae713eb5536d262decbb0
BLAKE2b-256 b43f2b805ee0d021633edbdcd5563ea8ef9f82fc05b788e5e1a9c5b79543f581

File details

Details for the file chalkpy-2.123.1-cp313-cp313-macosx_10_13_x86_64.whl.

File hashes

Hashes for chalkpy-2.123.1-cp313-cp313-macosx_10_13_x86_64.whl
Algorithm Hash digest
SHA256 eabde96e7f051bee10723fb98e2d5c973903fb7b2e051b1ea1784c20a2af1095
MD5 f3cbdb12292b50ca8c4b5dc3ea1b22db
BLAKE2b-256 4b3df31d15f14233e79b7e7ff431e24062cd6282b2b7141b174e732b32f44da8

File details

Details for the file chalkpy-2.123.1-cp312-cp312-win_amd64.whl.

File metadata

  • Download URL: chalkpy-2.123.1-cp312-cp312-win_amd64.whl
  • Upload date:
  • Size: 3.5 MB
  • Tags: CPython 3.12, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for chalkpy-2.123.1-cp312-cp312-win_amd64.whl
Algorithm Hash digest
SHA256 7f71528e7e15c8ae131fd2298b5dd7d3a443dbadb7cb5dc0e95eea3517afa5ef
MD5 5408407c11257313a54274d112a85202
BLAKE2b-256 cef9a9367b2764919dc96ff7b3be986cb941bc103bcc95b5b4c4346de3a60c7e

File details

Details for the file chalkpy-2.123.1-cp312-cp312-manylinux_2_28_x86_64.whl.

File hashes

Hashes for chalkpy-2.123.1-cp312-cp312-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 7c9283a16513792ed1a109074478c7b30315a69addb0cfb698cb4e6fc0f173ee
MD5 af55442eeaad7d6581e18336326d7667
BLAKE2b-256 5c016e1ddd5e49c2c29ff522b4fd2028f4ba913cf7ddcdfe6a15c8a559955e3f

File details

Details for the file chalkpy-2.123.1-cp312-cp312-manylinux_2_28_aarch64.whl.

File hashes

Hashes for chalkpy-2.123.1-cp312-cp312-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 ffb6b35a375755a538d66d773ab48922797937d5d6ce4dde406322cbfd99fc8c
MD5 f78161f9a12d2a43104e98ee16e0d434
BLAKE2b-256 3107e41228e65474087fe71cdefc7bafa30cc8c348b89de5674b83fe0f4114e7

File details

Details for the file chalkpy-2.123.1-cp312-cp312-macosx_11_0_arm64.whl.

File hashes

Hashes for chalkpy-2.123.1-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 6f5324bfbe7509f5a50ee2ecb64d71ac21c6edc7e5b0999c35241991d6e22d4a
MD5 992726d3a0b4c59e0434fcb3bbc94c54
BLAKE2b-256 d758a6707518700d4c78d6878e085cacf879586e970d423f72b552a106385988

File details

Details for the file chalkpy-2.123.1-cp312-cp312-macosx_10_13_x86_64.whl.

File hashes

Hashes for chalkpy-2.123.1-cp312-cp312-macosx_10_13_x86_64.whl
Algorithm Hash digest
SHA256 833c396cba2bb176ee8e4809f88106b305adc8e678d2f1e29c648e73a5faedda
MD5 259e5335628f791ece88074309e955bc
BLAKE2b-256 646593b49c27f68e4d47380a757247202bf142ed7bc95c328c19c418d98dbc79

File details

Details for the file chalkpy-2.123.1-cp311-cp311-win_amd64.whl.

File metadata

  • Download URL: chalkpy-2.123.1-cp311-cp311-win_amd64.whl
  • Upload date:
  • Size: 3.5 MB
  • Tags: CPython 3.11, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for chalkpy-2.123.1-cp311-cp311-win_amd64.whl
Algorithm Hash digest
SHA256 2299fbe41270df4469354e47435e5de1f5dfee1bcd8100179553110e4ce3ae63
MD5 0cde844a103d81def6a139eef9e4153c
BLAKE2b-256 50741faf0db9f1b0f4dcd26788d1ab288f9374c399613ead26d7ff285440f281

File details

Details for the file chalkpy-2.123.1-cp311-cp311-manylinux_2_28_x86_64.whl.

File hashes

Hashes for chalkpy-2.123.1-cp311-cp311-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 c7fb49b4d933afb59b0b09f1eb8d51378d13a6f8d10b6efb7e00e560c6bbc2c0
MD5 06bf8228da760da7c999c7fadea90aef
BLAKE2b-256 7f8519cfa2b8c781df1f10d4a6649f2aba74d0899de39141ae9a175a56333ad0

File details

Details for the file chalkpy-2.123.1-cp311-cp311-manylinux_2_28_aarch64.whl.

File hashes

Hashes for chalkpy-2.123.1-cp311-cp311-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 a49e40dc9d1cacca00675e96acae35bfe0aa62c0c1a054c37a85f7582ab7c523
MD5 f368b9ce77a00fca9e0a37f58ca97a8a
BLAKE2b-256 e7e3545af5204a3fe8ef3ee41ac449e1a8a37183113132677b7e54b3915cc4d7

File details

Details for the file chalkpy-2.123.1-cp311-cp311-macosx_11_0_arm64.whl.

File hashes

Hashes for chalkpy-2.123.1-cp311-cp311-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 51ffea92e6c8dfedae20134e0893d4620272d4d55b6383918346ec7b8d943caa
MD5 af1bb25188b2102ad52216ad4f7adcf1
BLAKE2b-256 e51071262ef4aed3a490d4a6e33b03ffd1e3b5d117d30f8edb4773339f6342b3

File details

Details for the file chalkpy-2.123.1-cp311-cp311-macosx_10_13_x86_64.whl.

File hashes

Hashes for chalkpy-2.123.1-cp311-cp311-macosx_10_13_x86_64.whl
Algorithm Hash digest
SHA256 d1f1d23f0a560aa4116554343afbb6b514c1a918511090b65824e93aee92585b
MD5 5f0a134637c02d84972f6ebb0d5fc94c
BLAKE2b-256 78cbdbc31eade06ca7f9c2f7d77e79d4c3ce577fc4b48814c1c897698e195157

File details

Details for the file chalkpy-2.123.1-cp310-cp310-win_amd64.whl.

File metadata

  • Download URL: chalkpy-2.123.1-cp310-cp310-win_amd64.whl
  • Upload date:
  • Size: 3.5 MB
  • Tags: CPython 3.10, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for chalkpy-2.123.1-cp310-cp310-win_amd64.whl
Algorithm Hash digest
SHA256 7f02dc5e117be7f7db9b98ddca56c7003adae94cabc37ab98576f854f68a55e2
MD5 80bb7201417cecff72cdf2ac6abf2f06
BLAKE2b-256 766f0613144f7bdd695f3cb5b21c5ba2355f2c03ca8896556801a751d1f926a2

File details

Details for the file chalkpy-2.123.1-cp310-cp310-manylinux_2_28_x86_64.whl.

File hashes

Hashes for chalkpy-2.123.1-cp310-cp310-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 d8484c41ab28a023e2e7f3f8b91f750d285a1ec7b5fe82e9b389b26f87cbf59c
MD5 eb840e483249215689d9f617331edf58
BLAKE2b-256 e93416641625bc43c1a8da896515e0005216bf89e7e27f97ed9879131d1f8558

File details

Details for the file chalkpy-2.123.1-cp310-cp310-manylinux_2_28_aarch64.whl.

File hashes

Hashes for chalkpy-2.123.1-cp310-cp310-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 05d03d3807bc07423b32a58fc26ef624d4f8ad8eede35ec4343ccb6c320c58fc
MD5 076de41a2ad85af347a6dfa8938dc2a9
BLAKE2b-256 834ed75d8139365473ba3e571e82af26e23223c422a5e9dda0134e09878b2126

File details

Details for the file chalkpy-2.123.1-cp310-cp310-macosx_11_0_arm64.whl.

File hashes

Hashes for chalkpy-2.123.1-cp310-cp310-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 1e4f7ae56f806d57b9075b0de9db077024b55152b68bda81ffb511473176ca84
MD5 0e104f5a6b649f326cb8943139db730f
BLAKE2b-256 846587772b8c42299fbaadc943be7b05c6f9a9701b18ec75c1315d3cd1bc12a1

File details

Details for the file chalkpy-2.123.1-cp310-cp310-macosx_10_13_x86_64.whl.

File hashes

Hashes for chalkpy-2.123.1-cp310-cp310-macosx_10_13_x86_64.whl
Algorithm Hash digest
SHA256 437a48d2cc0b8cd353b9537ec903657f22eab3fd7e398bf1cc03e4d5c6f027df
MD5 9483a78fc3a6151509db7dcb18b4f51a
BLAKE2b-256 330ffcb609da5c9984c9b3f0eae8070dc05022e194eabf03af0ac5eb9272fbab
