Python SDK for Chalk

Project description

Chalk

Chalk enables innovative machine learning teams to focus on building the unique products and models that make their business stand out. Behind the scenes, Chalk handles data infrastructure with a best-in-class developer experience. Here’s how it works:


Develop

Chalk makes it simple to develop feature pipelines for machine learning. Define Python functions using the libraries and tools you're familiar with instead of specialized DSLs. Chalk then orchestrates your functions into pipelines that execute in parallel on a Rust-based engine and coordinates the infrastructure required to compute features.

Features

To get started, define your features with Pydantic-inspired Python classes. You can define schemas, specify relationships, and add metadata to help your team share and re-use work.

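from datetime import date
from typing import Optional

from chalk.features import DataFrame, features, has_many

@features
class User:
    id: int
    full_name: str
    nickname: Optional[str]
    email: Optional[str]
    birthday: date
    credit_score: float
    datawarehouse_feature: float

    # Transaction is another @features class, defined elsewhere.
    transactions: DataFrame[Transaction] = has_many(lambda: Transaction.user_id == User.id)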

Resolvers

Next, tell Chalk how to compute your features. Chalk ingests data from your existing data stores, and lets you use Python to compute features with feature resolvers. Feature resolvers are declared with the decorators @online and @offline, and can depend on the outputs of other feature resolvers.

Resolvers make it easy to rapidly integrate a wide variety of data sources, join them together, and use them in your model.
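
To make the idea of resolvers depending on other resolvers concrete, here is a minimal sketch in plain Python. It is not Chalk's engine: the `resolver` decorator, `REGISTRY`, and `resolve` are hypothetical names invented for this illustration, and the sketch runs sequentially with no cycle detection.

```python
from typing import Callable, Dict, Tuple

# Hypothetical registry: output feature name -> (input feature names, function).
REGISTRY: Dict[str, Tuple[Tuple[str, ...], Callable]] = {}


def resolver(output: str, inputs: Tuple[str, ...] = ()):
    """Register a function as the way to compute `output` from `inputs`."""
    def wrap(fn: Callable) -> Callable:
        REGISTRY[output] = (inputs, fn)
        return fn
    return wrap


def resolve(feature: str, known: Dict[str, object]) -> object:
    """Compute `feature`, recursively resolving any inputs not already known."""
    if feature in known:
        return known[feature]
    if feature not in REGISTRY:
        raise KeyError(f"no resolver produces {feature!r}")
    inputs, fn = REGISTRY[feature]
    args = [resolve(name, known) for name in inputs]
    known[feature] = fn(*args)  # memoize so shared inputs run once per request
    return known[feature]


@resolver("user.full_name", inputs=("user.id",))
def get_full_name(uid: int) -> str:
    return {1: "Katherine Johnson"}[uid]  # stand-in for a database lookup


@resolver("user.name_length", inputs=("user.full_name",))
def get_name_length(full_name: str) -> int:
    return len(full_name)


# Only user.id is supplied; the intermediate full_name is resolved on demand.
result = resolve("user.name_length", {"user.id": 1})
```

The caller asks only for the output it wants, and the chain of functions needed to produce it is worked out from the declared inputs and outputs.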

SQL

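from chalk import online
from chalk.features import Features
from chalk.sql import PostgreSQLSource

pg = PostgreSQLSource()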

@online
def get_user(uid: User.id) -> Features[User.full_name, User.email]:
    return pg.query_string(
        "select email, full_name from users where id=:id",
        args=dict(id=uid)
    ).one()
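
The named-parameter pattern in the query above can be tried locally without a Postgres instance. The sketch below uses Python's built-in `sqlite3` module as a stand-in; the table contents are made up, and it assumes `.one()` is meant to return exactly one row.

```python
import sqlite3

# In-memory database standing in for the Postgres source in the example above.
conn = sqlite3.connect(":memory:")
conn.execute("create table users (id integer primary key, full_name text, email text)")
conn.execute("insert into users values (1, 'Katherine Johnson', 'kj@example.com')")


def get_user(uid: int) -> dict:
    """Fetch exactly one user row, binding `uid` to the :id placeholder."""
    rows = conn.execute(
        "select email, full_name from users where id=:id",
        {"id": uid},
    ).fetchall()
    if len(rows) != 1:
        raise LookupError(f"expected one row for id={uid}, got {len(rows)}")
    email, full_name = rows[0]
    return {"email": email, "full_name": full_name}


user = get_user(1)
```

Binding `uid` through a placeholder rather than string formatting keeps the query safe from SQL injection.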

REST

import requests

@online
def get_socure_score(uid: User.id) -> Features[User.socure_score]:
    return (
        requests.get("https://api.socure.com", json={
            "id": uid
        }).json()['socure_score']
    )

Execute

Once you've defined your features and resolvers, Chalk orchestrates them into flexible pipelines that make training and executing models easy.

Chalk has built-in support for feature engineering workflows -- no need to manage Airflow or orchestrate complicated streaming flows. You can execute resolver pipelines with declarative caching, ingest batch data on a schedule, and easily make slow sources available online for low-latency serving.

Caching

Many data sources (like vendor APIs) are too slow for online use cases, charge a high cost per call, or both. Chalk lets you optimize latency and cost by defining declarative caching policies that are integrated throughout the system. You no longer have to manage Redis, Memcached, or DynamoDB, or spend time tuning cache-warming pipelines.

Add a caching policy with one line of code in your feature definition:

@features
class ExternalBankAccount:
-   balance: int
+   balance: int = feature(max_staleness="1d")

Optionally warm feature caches by executing resolvers on a schedule:

@online(cron="1d")
def fn(id: User.id) -> User.credit_score:
  return redshift.query(...).all()

Or override staleness tolerances at query time when you need fresher data for your models:

chalk.query(
    ...,
    outputs=[User.fraud_score],
    max_staleness={User.fraud_score: "1m"}
)
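
The semantics of a default staleness with a per-query override can be sketched in plain Python. This is an illustration only, not how Chalk implements caching: `StalenessCache`, `parse_duration`, and the unit suffixes other than those shown above are assumptions, and the clock is injected so the behavior is deterministic.

```python
import time
from typing import Callable, Dict, Optional, Tuple

# Assumed unit suffixes; the examples above only show "1m" and "1d".
_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_duration(text: str) -> int:
    """Convert a string such as "10m" into seconds."""
    return int(text[:-1]) * _UNITS[text[-1]]


class StalenessCache:
    """Caches one value per key and recomputes when it is older than allowed."""

    def __init__(self, compute: Callable[[str], float], max_staleness: str,
                 clock: Callable[[], float] = time.time):
        self._compute = compute
        self._default = parse_duration(max_staleness)
        self._clock = clock
        self._store: Dict[str, Tuple[float, float]] = {}  # key -> (value, stored_at)

    def get(self, key: str, max_staleness: Optional[str] = None) -> float:
        limit = self._default if max_staleness is None else parse_duration(max_staleness)
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[1] <= limit:
            return hit[0]  # fresh enough for this caller
        value = self._compute(key)
        self._store[key] = (value, now)
        return value


calls = []

def slow_vendor_score(key: str) -> float:
    calls.append(key)  # record each (expensive) upstream call
    return float(len(calls))


now = [0.0]  # fake clock, in seconds
cache = StalenessCache(slow_vendor_score, max_staleness="1d", clock=lambda: now[0])
first = cache.get("user-1")        # miss: calls the vendor
now[0] = 3600.0
second = cache.get("user-1")       # one hour old, within "1d": served from cache
third = cache.get("user-1", "1m")  # caller tolerates only one minute: recomputed
```

The same cached value serves callers with loose tolerances, while a caller that needs fresher data pays for a recomputation.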

Batch ETL ingestion

Chalk also makes it simple to generate training sets from data warehouse sources: join data from services like S3, Redshift, BigQuery, and Snowflake (or other custom sources) with historical features computed online. Specify a cron schedule on an @offline resolver, and Chalk automatically ingests data with support for incremental reads:

@offline(cron="1h")
def fn() -> Features[User.id, User.datawarehouse_feature]:
  return redshift.query(...).incremental()
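
The text does not describe how `.incremental()` tracks progress, so the following is a generic sketch of one common approach, a high-water mark, using the built-in `sqlite3` module. The `IncrementalReader` class and the `events` table are hypothetical.

```python
import sqlite3

# In-memory table standing in for a warehouse table with a monotonically increasing id.
conn = sqlite3.connect(":memory:")
conn.execute("create table events (id integer primary key, user_id integer, amount real)")


class IncrementalReader:
    """Remembers the largest id seen so far and only fetches rows beyond it."""

    def __init__(self, connection: sqlite3.Connection):
        self._conn = connection
        self._high_water_mark = 0

    def read_new(self) -> list:
        rows = self._conn.execute(
            "select id, user_id, amount from events where id > :mark order by id",
            {"mark": self._high_water_mark},
        ).fetchall()
        if rows:
            self._high_water_mark = rows[-1][0]  # advance past the last row read
        return rows


reader = IncrementalReader(conn)
conn.executemany("insert into events (user_id, amount) values (?, ?)", [(1, 9.5), (2, 20.0)])
first_batch = reader.read_new()   # both rows
conn.execute("insert into events (user_id, amount) values (?, ?)", (1, 3.25))
second_batch = reader.read_new()  # only the newly inserted row
third_batch = reader.read_new()   # nothing new
```

Each scheduled run then processes only rows it has not seen before, which keeps hourly ingestion cheap even as the table grows.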

Chalk makes this data available for point-in-time-correct dataset generation for data science use-cases. Every pipeline has built-in monitoring and alerting to ensure data quality and timeliness.

Reverse ETL

When your model needs to use features that are canonically stored in a high-latency data source (like a data warehouse), Chalk's Reverse ETL support makes it simple to bring those features online and serve them quickly.

Add a single line of code to an offline resolver, and Chalk constructs a managed reverse ETL pipeline for that data source:

@offline(offline_to_online_etl="5m")

Now data from slow offline data sources is automatically available for low-latency online serving.


Deploy + query

Once you've defined your pipelines, you can rapidly deploy them to production with Chalk's CLI:

chalk apply

This creates a deployment of your project, which is served at a unique preview URL. You can promote this deployment to production, or perform QA workflows on your preview environment to make sure that your Chalk deployment performs as expected.

Once you promote your deployment to production, Chalk makes features available for low-latency online inference and offline training. Significantly, Chalk uses the exact same source code to serve temporally-consistent training sets to data scientists and live feature values to models. This re-use ensures that feature values from online and offline contexts match and dramatically cuts development time.

Online inference

Chalk's online store and feature computation engine make it easy to query features with very low latency, so you can use your feature pipelines to serve online inference use cases.

Integrating Chalk with your production application takes minutes via Chalk's simple REST API:

result = ChalkClient().query(
    input={
        User.name: "Katherine Johnson"
    },
    output=[User.fico_score],
    staleness={User.fico_score: "10m"},
)
result.get_feature_value(User.fico_score)

Features computed to serve online requests are also replicated to Chalk's offline store for historical performance tracking and training set generation.

Offline training

Data scientists can use Chalk's Jupyter integration to create datasets and train models. Datasets are stored and tracked so that they can be re-used by other modelers, and so that model provenance is tracked for audit and reproducibility.

X = ChalkClient().offline_query(
    input=labels[[User.uid, timestamp]],
    output=[
        User.returned_transactions_last_60,
        User.user_account_name_match_score,
        User.socure_score,
        User.identity.has_verified_phone,
        User.identity.is_voip_phone,
        User.identity.account_age_days,
        User.identity.email_age,
    ],
)

Chalk datasets are always "temporally consistent." This means that you can provide labels with different past timestamps and get historical features that represent what your application would have retrieved online at those past times. Temporal consistency ensures that your model training doesn't mix "future" and "past" data.
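
A small sketch of what a temporally consistent lookup means in practice, in plain Python. The `History` type, `value_as_of` helper, and the sample values are hypothetical and say nothing about how Chalk stores or retrieves historical features.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

# Hypothetical history of one feature for one user: (observed_at, value), sorted by time.
History = List[Tuple[int, float]]


def value_as_of(history: History, ts: int) -> Optional[float]:
    """Return the latest value observed at or before `ts`, or None if none exists."""
    times = [observed_at for observed_at, _ in history]
    idx = bisect_right(times, ts)
    return history[idx - 1][1] if idx else None


credit_score: History = [(100, 640.0), (200, 655.0), (300, 700.0)]
label_timestamps = [50, 100, 250, 999]

# Each label only sees values that existed at its own timestamp.
training_values = [value_as_of(credit_score, ts) for ts in label_timestamps]
```

The label at time 250 receives the value recorded at 200, not the later value from 300, which is what keeps "future" data out of the training row.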

Download files

Source Distribution

  • chalkpy-2.113.7.tar.gz (1.3 MB): Source

Built Distributions

  • chalkpy-2.113.7-cp313-cp313-win_amd64.whl (3.3 MB): CPython 3.13, Windows x86-64
  • chalkpy-2.113.7-cp313-cp313-manylinux_2_28_x86_64.whl (3.6 MB): CPython 3.13, manylinux: glibc 2.28+ x86-64
  • chalkpy-2.113.7-cp313-cp313-manylinux_2_28_aarch64.whl (3.5 MB): CPython 3.13, manylinux: glibc 2.28+ ARM64
  • chalkpy-2.113.7-cp313-cp313-macosx_11_0_arm64.whl (3.4 MB): CPython 3.13, macOS 11.0+ ARM64
  • chalkpy-2.113.7-cp313-cp313-macosx_10_13_x86_64.whl (3.4 MB): CPython 3.13, macOS 10.13+ x86-64
  • chalkpy-2.113.7-cp312-cp312-win_amd64.whl (3.3 MB): CPython 3.12, Windows x86-64
  • chalkpy-2.113.7-cp312-cp312-manylinux_2_28_x86_64.whl (3.6 MB): CPython 3.12, manylinux: glibc 2.28+ x86-64
  • chalkpy-2.113.7-cp312-cp312-manylinux_2_28_aarch64.whl (3.5 MB): CPython 3.12, manylinux: glibc 2.28+ ARM64
  • chalkpy-2.113.7-cp312-cp312-macosx_11_0_arm64.whl (3.4 MB): CPython 3.12, macOS 11.0+ ARM64
  • chalkpy-2.113.7-cp312-cp312-macosx_10_13_x86_64.whl (3.4 MB): CPython 3.12, macOS 10.13+ x86-64
  • chalkpy-2.113.7-cp311-cp311-win_amd64.whl (3.3 MB): CPython 3.11, Windows x86-64
  • chalkpy-2.113.7-cp311-cp311-manylinux_2_28_x86_64.whl (3.6 MB): CPython 3.11, manylinux: glibc 2.28+ x86-64
  • chalkpy-2.113.7-cp311-cp311-manylinux_2_28_aarch64.whl (3.5 MB): CPython 3.11, manylinux: glibc 2.28+ ARM64
  • chalkpy-2.113.7-cp311-cp311-macosx_11_0_arm64.whl (3.4 MB): CPython 3.11, macOS 11.0+ ARM64
  • chalkpy-2.113.7-cp311-cp311-macosx_10_13_x86_64.whl (3.4 MB): CPython 3.11, macOS 10.13+ x86-64
  • chalkpy-2.113.7-cp310-cp310-win_amd64.whl (3.3 MB): CPython 3.10, Windows x86-64
  • chalkpy-2.113.7-cp310-cp310-manylinux_2_28_x86_64.whl (3.6 MB): CPython 3.10, manylinux: glibc 2.28+ x86-64
  • chalkpy-2.113.7-cp310-cp310-manylinux_2_28_aarch64.whl (3.5 MB): CPython 3.10, manylinux: glibc 2.28+ ARM64
  • chalkpy-2.113.7-cp310-cp310-macosx_11_0_arm64.whl (3.4 MB): CPython 3.10, macOS 11.0+ ARM64
  • chalkpy-2.113.7-cp310-cp310-macosx_10_13_x86_64.whl (3.4 MB): CPython 3.10, macOS 10.13+ x86-64

File details

Details for the file chalkpy-2.113.7.tar.gz.

File metadata

  • Download URL: chalkpy-2.113.7.tar.gz
  • Size: 1.3 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for chalkpy-2.113.7.tar.gz
Algorithm Hash digest
SHA256 93e8850ac20d5d10073ed1c147c67a52656f01cb851750fcb7ccb0b246dd7899
MD5 9045eef8da4ff2c56776af1889c0938b
BLAKE2b-256 d94a8a3f9b18557d3e9edf9f2100ed9c1eefb77be3b8d0bb753ccd88404c1a8f
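
A downloaded file can be checked against a published SHA256 digest with Python's standard library. The sketch below is a generic illustration; the `sha256_of` and `matches` helpers are hypothetical names, and the demo verifies a small temporary file rather than a real chalkpy download.

```python
import hashlib
import hmac
import os
import tempfile


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: str, expected_hex: str) -> bool:
    """Compare the file's digest with the published one."""
    return hmac.compare_digest(sha256_of(path), expected_hex.lower())


# Demo on a throwaway file containing b"abc", whose SHA-256 is a well-known test vector.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"abc")
demo_ok = matches(
    tmp.name,
    "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad",
)
os.remove(tmp.name)
```

For a real download, pass the path of the file and the SHA256 value listed for that exact filename.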

File details

Details for the file chalkpy-2.113.7-cp313-cp313-win_amd64.whl.

File metadata

  • Download URL: chalkpy-2.113.7-cp313-cp313-win_amd64.whl
  • Size: 3.3 MB
  • Tags: CPython 3.13, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for chalkpy-2.113.7-cp313-cp313-win_amd64.whl
Algorithm Hash digest
SHA256 54830020c03c0c2d935b032af29889f0bdd162a5a1fbf0913d3402b3e08d29f2
MD5 d69ff7c6ee3d941b72ffb261cae4b054
BLAKE2b-256 19019e96da26ed9498ba89658b770b801e18f85bcb535418f069e8af6feb9466

File details

Details for the file chalkpy-2.113.7-cp313-cp313-manylinux_2_28_x86_64.whl.

File hashes

Hashes for chalkpy-2.113.7-cp313-cp313-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 5914dff3852f9c1fbbe31d407efa846d4dbe08f99fd38e774536398151cd2ac0
MD5 39b8c9b5523a9ddec7a7c43ce146186e
BLAKE2b-256 b8658fff0b778c6ab26a5beec544b8b0a1a2fa0c2f61af398d668e8c3302d505

File details

Details for the file chalkpy-2.113.7-cp313-cp313-manylinux_2_28_aarch64.whl.

File hashes

Hashes for chalkpy-2.113.7-cp313-cp313-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 70442bc52bf7b44602cd1d6452a298e9147da974e939d1d90878627b513c03d3
MD5 d9ab86a52f5b5435ab96a67f3bd64cb6
BLAKE2b-256 27d66ff3c2f49bf8e388a29a995b4d446933224a821919d7b790d618688f5d5e

File details

Details for the file chalkpy-2.113.7-cp313-cp313-macosx_11_0_arm64.whl.

File hashes

Hashes for chalkpy-2.113.7-cp313-cp313-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 001ccb86433755e387e3dad5b77e7a6997d02dfe73f23ca9ba35f458de8f47bd
MD5 9617af07654f0235cc85879321de24a1
BLAKE2b-256 c412cdca9540f2c5dbc125090f4d227af8bc38ae02d12b83a9b99e23ecfb3cf3

File details

Details for the file chalkpy-2.113.7-cp313-cp313-macosx_10_13_x86_64.whl.

File hashes

Hashes for chalkpy-2.113.7-cp313-cp313-macosx_10_13_x86_64.whl
Algorithm Hash digest
SHA256 3fcccf21fbc5d1c0a5b7d15a120e676a7fd670c24eab23fe7708f559485cd14c
MD5 91dc0c41c76b263858b36ce9eaa8758c
BLAKE2b-256 0c2acc0bfa126bfa092df1d82e82028c26e6c7cfd8aee94bb01ebade27f42b5f

File details

Details for the file chalkpy-2.113.7-cp312-cp312-win_amd64.whl.

File metadata

  • Download URL: chalkpy-2.113.7-cp312-cp312-win_amd64.whl
  • Size: 3.3 MB
  • Tags: CPython 3.12, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for chalkpy-2.113.7-cp312-cp312-win_amd64.whl
Algorithm Hash digest
SHA256 ebb0b2225dfb5edc84c8d825275e262f4c88ae12fbe6e21d3b860f3d2cb5a569
MD5 da6f2690f2a309b45acea94f69531ed0
BLAKE2b-256 0f0944922fd04eb8f0edead6241dd6684bba2db7ee94df084ec219432c74170c

File details

Details for the file chalkpy-2.113.7-cp312-cp312-manylinux_2_28_x86_64.whl.

File hashes

Hashes for chalkpy-2.113.7-cp312-cp312-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 493296c471f666eb0fe5d67713b0ed2fd0ff32da6c6e6fce18266c6dc30ef83d
MD5 b4c3d56c1dd3218c0dd6e2c31523512c
BLAKE2b-256 7a872a24cf73b726b34be2633201a2866c1120863198bb580eef1813375455a0

File details

Details for the file chalkpy-2.113.7-cp312-cp312-manylinux_2_28_aarch64.whl.

File hashes

Hashes for chalkpy-2.113.7-cp312-cp312-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 ce85679d626e27826871e20f88f0c1561bbb30fc20b328bbaf82e65e436c8c90
MD5 2851fe301ef3dcf37a3d01510ce17b03
BLAKE2b-256 f4ab4b97188a0029eb3c85a477e213d3fd965abaa34bf7ead224a760cd4e8df9

File details

Details for the file chalkpy-2.113.7-cp312-cp312-macosx_11_0_arm64.whl.

File hashes

Hashes for chalkpy-2.113.7-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 0124a4eba55cee93cbae8424ac98fa5f4266fe366d31a9a88ca45a283ffbe2e3
MD5 e33d06e9659f362197284d3ec69f2910
BLAKE2b-256 2fc1b6629421f7f41e6e6fc76238eca30873e3f07acbce068fe865a1018c679a

File details

Details for the file chalkpy-2.113.7-cp312-cp312-macosx_10_13_x86_64.whl.

File hashes

Hashes for chalkpy-2.113.7-cp312-cp312-macosx_10_13_x86_64.whl
Algorithm Hash digest
SHA256 052b772bd05af9bb0b0e061b0ec2df01083f1b8abe4a1b27b98fa39e086d45a9
MD5 6fe8214c073eac959b531dadc20d767e
BLAKE2b-256 ef580c361df16d79f6a2ecb613df50d8acac241a0c6260f58972c3c704d596fa

File details

Details for the file chalkpy-2.113.7-cp311-cp311-win_amd64.whl.

File metadata

  • Download URL: chalkpy-2.113.7-cp311-cp311-win_amd64.whl
  • Size: 3.3 MB
  • Tags: CPython 3.11, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for chalkpy-2.113.7-cp311-cp311-win_amd64.whl
Algorithm Hash digest
SHA256 9d0b1730eb2dc53b9ae3dc28608502fad7f4fc0c469e86eb2a0dea248da999cf
MD5 30b07b7ed0b3722a488efc271215bee4
BLAKE2b-256 46450176d92c73b45880c66a0b0238cedfac5103c33f654e974d8be6ab5e0ff5

File details

Details for the file chalkpy-2.113.7-cp311-cp311-manylinux_2_28_x86_64.whl.

File hashes

Hashes for chalkpy-2.113.7-cp311-cp311-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 4e6f4daa4f0e9bb2232d3270c14ecfb6981a2be5282c4f0c6cd6386749c9b531
MD5 a3fb272413338aa36763134ef0934c3e
BLAKE2b-256 cc0612aec90b943bfffdab7400b9e2aefc504f37e0163cea9922b59fa7dba1e9

File details

Details for the file chalkpy-2.113.7-cp311-cp311-manylinux_2_28_aarch64.whl.

File hashes

Hashes for chalkpy-2.113.7-cp311-cp311-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 b2d868a7d35cbd4691bb337f11a77d218b2c01113fa860d6b0cf58d4e95669c8
MD5 3003616f3ba89ada5eea945b6988fa24
BLAKE2b-256 6c30fa63871db358c6fce6dc3e788998c4f92df632cffa5ecd8c3309ef6520bb

File details

Details for the file chalkpy-2.113.7-cp311-cp311-macosx_11_0_arm64.whl.

File hashes

Hashes for chalkpy-2.113.7-cp311-cp311-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 26260bf2467cfe97b16db5b3371c3f74d96c4ab6c7023b4ae9067cfa2fbb6a1f
MD5 b6cc39c389e3041d48be01751a9c8072
BLAKE2b-256 b7bffe5c7db56de10dacab6b98ca317e34176de40bd173e798fe53eada980db6

File details

Details for the file chalkpy-2.113.7-cp311-cp311-macosx_10_13_x86_64.whl.

File hashes

Hashes for chalkpy-2.113.7-cp311-cp311-macosx_10_13_x86_64.whl
Algorithm Hash digest
SHA256 9c863d73eb6789099aa58f3503200e19099fd5bbfc4bba633593891c6112978b
MD5 a6695113b50ab33be75f7bee423ccc1c
BLAKE2b-256 f65e0594f1d1105fe09747802d76102e6021d5329ff46c00be4c2ec1ed2ecb34

File details

Details for the file chalkpy-2.113.7-cp310-cp310-win_amd64.whl.

File metadata

  • Download URL: chalkpy-2.113.7-cp310-cp310-win_amd64.whl
  • Size: 3.3 MB
  • Tags: CPython 3.10, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for chalkpy-2.113.7-cp310-cp310-win_amd64.whl
Algorithm Hash digest
SHA256 1594708c3bb09c2db9c96fb6fc86019a60afe67d368607e38d7c3952916f38f9
MD5 b9ba76bc8920bd5a2811330c3b1b1eec
BLAKE2b-256 591832e1aeb1b55dcb307cb3aab9b6da2bfa9065e6b506b73bc9661a254c77d3

File details

Details for the file chalkpy-2.113.7-cp310-cp310-manylinux_2_28_x86_64.whl.

File hashes

Hashes for chalkpy-2.113.7-cp310-cp310-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 1c095a7a864e74375cff74e708b402139f98db233dd37ac42c61d1684c6efc22
MD5 574aef62ffbf172913999357200c6ceb
BLAKE2b-256 24ce5fb0b12153c02a25ef0cffe6911a4658746e21653e5046d02b6fa4801316

File details

Details for the file chalkpy-2.113.7-cp310-cp310-manylinux_2_28_aarch64.whl.

File hashes

Hashes for chalkpy-2.113.7-cp310-cp310-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 a781809323f67498ae118507470882ef9652546be927424256b72fce99ca7693
MD5 dab592738405eff73070f552b2ab0859
BLAKE2b-256 f3d1059ddf53ced35b34bded8eb2d7e9127c5e495fb89f7371eddeced753a5ec

File details

Details for the file chalkpy-2.113.7-cp310-cp310-macosx_11_0_arm64.whl.

File hashes

Hashes for chalkpy-2.113.7-cp310-cp310-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 e823fc41a8b18a0cb3ce6d0147d703e3ce5f567d5c79c4e1085ef13698b442a4
MD5 2abb9135d0920455b82a9b89f6c3401e
BLAKE2b-256 528a567687960d50f451123e6f4506d3f176b926306c48815fb7f15059849039

File details

Details for the file chalkpy-2.113.7-cp310-cp310-macosx_10_13_x86_64.whl.

File hashes

Hashes for chalkpy-2.113.7-cp310-cp310-macosx_10_13_x86_64.whl
Algorithm Hash digest
SHA256 fb8e768c612f5b1e19475696ba12ed388194b39d4110592ca2716c3066d5e966
MD5 073e5e63ad22a8d8708f436a90c4ff76
BLAKE2b-256 5d976e39ca286a1de7762248eff30362e1b75f501b0285ba96a81ac5138e3507
