Python SDK for Chalk

Chalk

Chalk enables innovative machine learning teams to focus on building the unique products and models that make their business stand out. Behind the scenes, Chalk seamlessly handles data infrastructure with a best-in-class developer experience. Here's how it works:


Develop

Chalk makes it simple to develop feature pipelines for machine learning. Define Python functions using the libraries and tools you're familiar with instead of specialized DSLs. Chalk then orchestrates your functions into pipelines that execute in parallel on a Rust-based engine and coordinates the infrastructure required to compute features.

Features

To get started, define your features with Pydantic-inspired Python classes. You can define schemas, specify relationships, and add metadata to help your team share and re-use work.

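from datetime import date
from typing import Optional

from chalk.features import DataFrame, features, has_many

@features
class User:
    id: int
    full_name: str
    nickname: Optional[str]
    email: Optional[str]
    birthday: date
    credit_score: float
    datawarehouse_feature: float

    # One-to-many join: every Transaction whose user_id matches this user's id
    transactions: DataFrame[Transaction] = has_many(lambda: Transaction.user_id == User.id)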

Resolvers

Next, tell Chalk how to compute your features. Chalk ingests data from your existing data stores, and lets you use Python to compute features with feature resolvers. Feature resolvers are declared with the decorators @online and @offline, and can depend on the outputs of other feature resolvers.

Resolvers make it easy to rapidly integrate a wide variety of data sources, join them together, and use them in your model.

SQL

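from chalk.features import Features, online
from chalk.sql import PostgreSQLSource

pg = PostgreSQLSource()

@online
def get_user(uid: User.id) -> Features[User.full_name, User.email]:
    # Column names in the result are matched to the declared output features
    return pg.query_string(
        "select email, full_name from users where id=:id",
        args=dict(id=uid)
    ).one()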

REST

import requests

@online
def get_socure_score(uid: User.id) -> Features[User.socure_score]:
    return (
        requests.get("https://api.socure.com", json={
            "id": uid
        }).json()['socure_score']
    )

Execute

Once you've defined your features and resolvers, Chalk orchestrates them into flexible pipelines that make training and executing models easy.
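
To make the idea of resolvers depending on one another concrete, here is a minimal, self-contained sketch of dependency-driven resolution. It is not Chalk's engine: the `RESOLVERS` registry, the `resolver` decorator, the `resolve` function, and the `user.name_length` feature are all hypothetical names invented for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy registry: output feature name -> (input feature names, function).
# This is NOT Chalk's engine, only an illustration of dependency-driven resolution.
RESOLVERS: Dict[str, Tuple[List[str], Callable[..., object]]] = {}


def resolver(output: str, inputs: List[str]):
    """Register a function that computes `output` from `inputs`."""
    def decorate(fn):
        RESOLVERS[output] = (inputs, fn)
        return fn
    return decorate


@resolver("user.full_name", inputs=["user.id"])
def get_full_name(uid: int) -> str:
    # Stand-in for a database lookup
    return {1: "Katherine Johnson"}[uid]


@resolver("user.name_length", inputs=["user.full_name"])
def get_name_length(full_name: str) -> int:
    # Depends on the output of get_full_name
    return len(full_name)


def resolve(feature: str, known: Dict[str, object]) -> object:
    """Compute `feature`, recursively resolving any inputs that are not yet known."""
    if feature in known:
        return known[feature]
    if feature not in RESOLVERS:
        raise KeyError(f"no resolver produces {feature!r}")
    inputs, fn = RESOLVERS[feature]
    args = [resolve(name, known) for name in inputs]
    known[feature] = fn(*args)  # memoize so each resolver runs at most once
    return known[feature]


result = resolve("user.name_length", {"user.id": 1})
```

Requesting `user.name_length` with only `user.id` known causes `get_full_name` to run first, then `get_name_length`. A real engine would also plan, parallelize, and cache these calls.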

Chalk has built-in support for feature engineering workflows -- no need to manage Airflow or orchestrate complicated streaming flows. You can execute resolver pipelines with declarative caching, ingest batch data on a schedule, and easily make slow sources available online for low-latency serving.

Caching

Many data sources (like vendor APIs) are too slow for online use cases and/or charge a high dollar cost-per-call. Chalk lets you optimize latency and cost by defining declarative caching policies which are well-integrated throughout the system. You no longer have to manage Redis, Memcached, DynamoDB, or spend time tuning cache-warming pipelines.

Add a caching policy with one line of code in your feature definition:

@features
class ExternalBankAccount:
-   balance: int
+   balance: int = feature(max_staleness="1d")

Optionally warm feature caches by executing resolvers on a schedule:

@online(cron="1d")
def fn(id: User.id) -> User.credit_score:
  return redshift.query(...).all()

Or override staleness tolerances at query time when you need fresher data for your models:

ChalkClient().query(
    ...,
    output=[User.fraud_score],
    staleness={User.fraud_score: "1m"}
)
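
The staleness semantics described above can be sketched in plain Python. This is a conceptual illustration only, not Chalk's implementation: `parse_duration`, `StalenessCache`, and `slow_vendor_call` are hypothetical names, and the duration format (an integer followed by `s`, `m`, `h`, or `d`) is an assumption based on the strings shown in this document.

```python
import re
from datetime import datetime, timedelta
from typing import Callable, Dict, Optional, Tuple

_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_duration(text: str) -> timedelta:
    """Parse strings such as "1d" or "10m" (assumed format: integer + unit)."""
    match = re.fullmatch(r"(\d+)([smhd])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return timedelta(**{_UNITS[match.group(2)]: int(match.group(1))})


class StalenessCache:
    """Toy cache: reuse a value only while it is younger than max_staleness."""

    def __init__(self, default_max_staleness: str) -> None:
        self.default = parse_duration(default_max_staleness)
        self._store: Dict[str, Tuple[datetime, object]] = {}

    def get(
        self,
        key: str,
        compute: Callable[[], object],
        now: datetime,
        max_staleness: Optional[str] = None,
    ) -> object:
        # A per-call max_staleness overrides the default policy
        limit = parse_duration(max_staleness) if max_staleness else self.default
        cached = self._store.get(key)
        if cached is not None and now - cached[0] <= limit:
            return cached[1]  # fresh enough: skip the expensive call
        value = compute()
        self._store[key] = (now, value)
        return value


calls = []


def slow_vendor_call() -> int:
    # Stand-in for a slow or costly vendor API; returns a new value per call
    calls.append(1)
    return 700 + len(calls)


cache = StalenessCache("1d")
t0 = datetime(2024, 1, 1, 12, 0)
first = cache.get("user:1", slow_vendor_call, now=t0)
second = cache.get("user:1", slow_vendor_call, now=t0 + timedelta(hours=2))
third = cache.get("user:1", slow_vendor_call, now=t0 + timedelta(hours=2), max_staleness="1m")
```

The second read is served from the cache because two hours is within the one-day default. The third read passes a one-minute tolerance, so the value is recomputed.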

Batch ETL ingestion

Chalk also makes it simple to generate training sets from data warehouse sources: join data from services like S3, Redshift, BigQuery, Snowflake (or other custom sources) with historical features computed online. Specify a cron schedule on an @offline resolver and Chalk automatically ingests data with support for incremental reads:

@offline(cron="1h")
def fn() -> Features[User.id, User.datawarehouse_feature]:
  return redshift.query(...).incremental()

Chalk makes this data available for point-in-time-correct dataset generation for data science use-cases. Every pipeline has built-in monitoring and alerting to ensure data quality and timeliness.
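
One common way to implement incremental reads is a high-water mark: remember the newest timestamp already ingested and fetch only rows beyond it on the next run. The sketch below illustrates that general technique under stated assumptions; it is not a description of how Chalk's `.incremental()` works internally. `IncrementalReader` and the `updated_at` column are hypothetical.

```python
from datetime import datetime
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]


class IncrementalReader:
    """Toy reader that returns only rows newer than the last seen timestamp."""

    def __init__(self) -> None:
        self.watermark: Optional[datetime] = None

    def read_new(self, table: List[Row]) -> List[Row]:
        # Keep rows strictly newer than the watermark (all rows on the first run)
        new_rows = [
            row for row in table
            if self.watermark is None or row["updated_at"] > self.watermark
        ]
        if new_rows:
            self.watermark = max(row["updated_at"] for row in new_rows)
        return new_rows


warehouse = [
    {"user_id": 1, "value": 0.2, "updated_at": datetime(2024, 1, 1, 9)},
    {"user_id": 2, "value": 0.5, "updated_at": datetime(2024, 1, 1, 10)},
]
reader = IncrementalReader()
first_run = reader.read_new(warehouse)

# A new row lands in the warehouse before the next scheduled run
warehouse.append({"user_id": 1, "value": 0.9, "updated_at": datetime(2024, 1, 1, 11)})
second_run = reader.read_new(warehouse)
third_run = reader.read_new(warehouse)
```

The first run ingests both existing rows, the second run ingests only the newly appended row, and the third run finds nothing new.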

Reverse ETL

When your model needs to use features that are canonically stored in a high-latency data source (like a data warehouse), Chalk's Reverse ETL support makes it simple to bring those features online and serve them quickly.

Add a single line of code to an offline resolver, and Chalk constructs a managed reverse ETL pipeline for that data source:

@offline(offline_to_online_etl="5m")

Now data from slow offline data sources is automatically available for low-latency online serving.
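
Conceptually, a reverse ETL job copies the newest value per entity from the slow store into a fast key-value store. The sketch below shows that idea with plain dictionaries. It is an illustration only, not Chalk's managed pipeline, and `load_latest`, `offline_rows`, and `online_store` are hypothetical names.

```python
from datetime import datetime
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def load_latest(offline_rows: List[Row], online_store: Dict[int, Tuple[datetime, Any]]) -> None:
    """Upsert the newest value per user_id into the online store."""
    for row in offline_rows:
        current = online_store.get(row["user_id"])
        # Only overwrite when the incoming row is at least as new as the stored one
        if current is None or row["ts"] >= current[0]:
            online_store[row["user_id"]] = (row["ts"], row["value"])


offline_rows = [
    {"user_id": 1, "ts": datetime(2024, 1, 3), "value": 0.9},
    {"user_id": 1, "ts": datetime(2024, 1, 1), "value": 0.2},
    {"user_id": 2, "ts": datetime(2024, 1, 2), "value": 0.5},
]
online_store: Dict[int, Tuple[datetime, Any]] = {}
load_latest(offline_rows, online_store)

# Online serving is now a constant-time dictionary lookup
latest_for_user_1 = online_store[1][1]
```

Because each write is guarded by a timestamp comparison, the result does not depend on the order in which rows arrive.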


Deploy + query

Once you've defined your pipelines, you can rapidly deploy them to production with Chalk's CLI:

chalk apply

This creates a deployment of your project, which is served at a unique preview URL. You can promote this deployment to production, or perform QA workflows on your preview environment to make sure that your Chalk deployment performs as expected.

Once you promote your deployment to production, Chalk makes features available for low-latency online inference and offline training. Significantly, Chalk uses the exact same source code to serve temporally consistent training sets to data scientists and live feature values to models. This re-use ensures that feature values from online and offline contexts match and dramatically cuts development time.

Online inference

Chalk's online store and feature computation engine make it easy to query features with ultra-low latency, so you can use your feature pipelines to serve online inference use-cases.

Integrating Chalk with your production application takes minutes via Chalk's simple REST API, called here through the Python client:

result = ChalkClient().query(
    input={
        User.name: "Katherine Johnson"
    },
    output=[User.fico_score],
    staleness={User.fico_score: "10m"},
)
result.get_feature_value(User.fico_score)

Features computed to serve online requests are also replicated to Chalk's offline store for historical performance tracking and training set generation.

Offline training

Data scientists can use Chalk's Jupyter integration to create datasets and train models. Datasets are stored and tracked so that they can be re-used by other modelers, and so that model provenance is tracked for audit and reproducibility.

X = ChalkClient().offline_query(
    input=labels[[User.uid, timestamp]],
    output=[
        User.returned_transactions_last_60,
        User.user_account_name_match_score,
        User.socure_score,
        User.identity.has_verified_phone,
        User.identity.is_voip_phone,
        User.identity.account_age_days,
        User.identity.email_age,
    ],
)

Chalk datasets are always "temporally consistent." This means that you can provide labels with different past timestamps and get historical features that represent what your application would have retrieved online at those past times. Temporal consistency ensures that your model training doesn't mix "future" and "past" data.
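
Temporal consistency amounts to an "as-of" lookup: for each label timestamp, take the most recent feature value observed at or before that time, and never a later one. The sketch below shows that lookup with the standard library. It is a conceptual illustration, not Chalk's implementation; `value_as_of` and the sample history are hypothetical.

```python
from bisect import bisect_right
from datetime import datetime
from typing import List, Optional, Tuple


def value_as_of(history: List[Tuple[datetime, float]], ts: datetime) -> Optional[float]:
    """Return the latest value observed at or before `ts`.

    `history` must be sorted by timestamp in ascending order.
    """
    times = [observed_at for observed_at, _ in history]
    idx = bisect_right(times, ts)  # number of observations with time <= ts
    return history[idx - 1][1] if idx > 0 else None


# Observed credit scores for one user over time
history = [
    (datetime(2024, 1, 1), 640.0),
    (datetime(2024, 2, 1), 700.0),
    (datetime(2024, 3, 1), 720.0),
]

# Label timestamps: one before any observation, two in between, one after all
labels = [
    datetime(2023, 12, 15),
    datetime(2024, 1, 20),
    datetime(2024, 2, 1),
    datetime(2024, 4, 1),
]
training_values = [value_as_of(history, ts) for ts in labels]
```

The label dated 2024-01-20 receives 640.0 rather than the later 700.0, so no "future" data reaches the training row. The label dated before any observation receives `None`.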
