Python SDK for Chalk

Chalk

Chalk enables innovative machine learning teams to focus on building the unique products and models that make their business stand out. Behind the scenes, Chalk seamlessly handles data infrastructure with a best-in-class developer experience. Here's how it works:


Develop

Chalk makes it simple to develop feature pipelines for machine learning. Define Python functions using the libraries and tools you're familiar with instead of specialized DSLs. Chalk then orchestrates your functions into pipelines that execute in parallel on a Rust-based engine and coordinates the infrastructure required to compute features.

Features

To get started, define your features with Pydantic-inspired Python classes. You can define schemas, specify relationships, and add metadata to help your team share and re-use work.

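from datetime import date
from typing import Optional

from chalk.features import DataFrame, features, has_many

@features
class User:
    id: int
    full_name: str
    nickname: Optional[str]
    email: Optional[str]
    birthday: date
    credit_score: float
    datawarehouse_feature: float

    # Transaction is another @features class, defined elsewhere.
    transactions: DataFrame[Transaction] = has_many(lambda: Transaction.user_id == User.id)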

Resolvers

Next, tell Chalk how to compute your features. Chalk ingests data from your existing data stores, and lets you use Python to compute features with feature resolvers. Feature resolvers are declared with the decorators @online and @offline, and can depend on the outputs of other feature resolvers.

Resolvers make it easy to rapidly integrate a wide variety of data sources, join them together, and use them in your model.
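To make the idea of resolvers depending on each other's outputs concrete, here is a minimal sketch in plain Python. It is not Chalk's API or engine: the `resolver` decorator, the `RESOLVERS` registry, and the `user.name_length` feature are hypothetical names invented for illustration. It only shows how a request for one feature can pull in the resolvers that produce its inputs.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical registry: output feature name -> (input feature names, function).
# This is NOT Chalk's API; it only illustrates dependency-driven execution.
RESOLVERS: Dict[str, Tuple[List[str], Callable[..., object]]] = {}


def resolver(output: str, inputs: List[str]):
    """Register a function that computes `output` from `inputs`."""
    def decorate(fn):
        RESOLVERS[output] = (inputs, fn)
        return fn
    return decorate


@resolver("user.full_name", inputs=["user.id"])
def get_full_name(uid: int) -> str:
    # Stand-in for a database lookup.
    return {1: "Katherine Johnson"}[uid]


@resolver("user.name_length", inputs=["user.full_name"])
def get_name_length(full_name: str) -> int:
    # Depends on the output of get_full_name.
    return len(full_name)


def resolve(feature: str, known: Dict[str, object]) -> object:
    """Compute `feature`, recursively resolving any inputs not yet known."""
    if feature in known:
        return known[feature]
    if feature not in RESOLVERS:
        raise KeyError(f"no resolver produces {feature!r}")
    inputs, fn = RESOLVERS[feature]
    args = [resolve(name, known) for name in inputs]
    known[feature] = fn(*args)
    return known[feature]


values: Dict[str, object] = {"user.id": 1}
name_length = resolve("user.name_length", values)
```

Asking for `user.name_length` with only `user.id` known runs `get_full_name` first and then `get_name_length`. A real engine would also plan independent branches in parallel, which this sketch does not attempt.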

SQL

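from chalk import online
from chalk.features import Features
from chalk.sql import PostgreSQLSource

pg = PostgreSQLSource()

@online
def get_user(uid: User.id) -> Features[User.full_name, User.email]:
    # Look up a single user row by primary key.
    return pg.query_string(
        "select email, full_name from users where id=:id",
        args=dict(id=uid)
    ).one()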
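To experiment with the named-parameter query above without a Postgres instance or Chalk, the same `:id` placeholder style can be tried with the standard-library `sqlite3` module. This is a stand-in under stated assumptions, not how Chalk executes SQL: the in-memory table, its sample row, and the plain `get_user` function are made up for illustration.

```python
import sqlite3

# In-memory database with a hypothetical users table and one sample row.
conn = sqlite3.connect(":memory:")
conn.execute("create table users (id integer primary key, full_name text, email text)")
conn.execute("insert into users values (1, 'Katherine Johnson', 'kj@example.com')")


def get_user(uid: int) -> dict:
    """Fetch exactly one user by id using a named parameter."""
    row = conn.execute(
        "select email, full_name from users where id=:id",
        {"id": uid},
    ).fetchone()
    if row is None:
        raise LookupError(f"no user with id {uid}")
    return {"email": row[0], "full_name": row[1]}


user = get_user(1)
```

Binding `uid` through a parameter, rather than formatting it into the SQL string, keeps the query safe from injection.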

REST

import requests

@online
def get_socure_score(uid: User.id) -> Features[User.socure_score]:
    return (
        # The key must be the string "id", not the builtin function id.
        requests.get("https://api.socure.com", json={
            "id": uid
        }).json()['socure_score']
    )

Execute

Once you've defined your features and resolvers, Chalk orchestrates them into flexible pipelines that make training and executing models easy.

Chalk has built-in support for feature engineering workflows, so there is no need to manage Airflow or orchestrate complicated streaming flows. You can execute resolver pipelines with declarative caching, ingest batch data on a schedule, and easily make slow sources available online for low-latency serving.

Caching

Many data sources (like vendor APIs) are too slow for online use cases or charge a high cost per call. Chalk lets you optimize latency and cost by defining declarative caching policies that are integrated throughout the system. You no longer have to manage Redis, Memcached, or DynamoDB, or spend time tuning cache-warming pipelines.

Add a caching policy with one line of code in your feature definition:

@features
class ExternalBankAccount:
-   balance: int
+   balance: int = feature(max_staleness="1d")

Optionally warm feature caches by executing resolvers on a schedule:

@online(cron="1d")
def fn(id: User.id) -> User.credit_score:
    return redshift.query(...).all()

Or override staleness tolerances at query time when you need fresher data for your models:

ChalkClient().query(
    ...,
    output=[User.fraud_score],
    staleness={User.fraud_score: "1m"}
)
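The interaction between a default staleness policy and a per-query override can be sketched in plain Python. This is a toy model under stated assumptions, not Chalk's cache: `StalenessCache`, `slow_balance`, and the explicit `now` argument are hypothetical, and durations are passed as `timedelta` values instead of strings like "1d".

```python
from datetime import datetime, timedelta
from typing import Callable, Dict, Optional, Tuple


class StalenessCache:
    """Toy cache that reuses a value only while it is younger than max_staleness."""

    def __init__(self, compute: Callable[[int], float], max_staleness: timedelta):
        self.compute = compute
        self.max_staleness = max_staleness
        # key -> (time the value was computed, value)
        self._store: Dict[int, Tuple[datetime, float]] = {}

    def get(self, key: int, now: datetime,
            max_staleness: Optional[timedelta] = None) -> float:
        # A per-query tolerance, when given, replaces the default policy.
        limit = self.max_staleness if max_staleness is None else max_staleness
        hit = self._store.get(key)
        if hit is not None and now - hit[0] <= limit:
            return hit[1]
        value = self.compute(key)
        self._store[key] = (now, value)
        return value


calls = []


def slow_balance(account_id: int) -> float:
    # Stand-in for a slow or expensive vendor call; counts invocations.
    calls.append(account_id)
    return 100.0 * len(calls)


cache = StalenessCache(slow_balance, max_staleness=timedelta(days=1))
t0 = datetime(2024, 1, 1, 12, 0)
first = cache.get(7, now=t0)                        # miss: computes
second = cache.get(7, now=t0 + timedelta(hours=2))  # hit: within one day
third = cache.get(7, now=t0 + timedelta(hours=2),
                  max_staleness=timedelta(minutes=1))  # override: recomputes
```

The second read is served from the cache because the value is two hours old and the default tolerance is one day. The third read tightens the tolerance to one minute, so the same cached value is treated as stale and the source is called again.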

Batch ETL ingestion

Chalk also makes it simple to generate training sets from data warehouse sources: join data from services like S3, Redshift, BigQuery, Snowflake (or other custom sources) with historical features computed online. Specify a cron schedule on an @offline resolver and Chalk automatically ingests data with support for incremental reads:

@offline(cron="1h")
def fn() -> Features[User.id, User.datawarehouse_feature]:
    return redshift.query(...).incremental()

Chalk makes this data available for point-in-time-correct dataset generation for data science use-cases. Every pipeline has built-in monitoring and alerting to ensure data quality and timeliness.
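One common way to implement incremental reads is a high-water mark: each run remembers the newest timestamp it has seen and the next run only fetches rows past it. The sketch below shows that pattern in plain Python. Whether Chalk's `.incremental()` works this way internally is an assumption; `incremental_read`, the `Row` layout, and the sample data are hypothetical.

```python
from datetime import datetime
from typing import List, Optional, Tuple

# (user_id, feature_value, updated_at): a hypothetical warehouse row layout.
Row = Tuple[int, float, datetime]


def incremental_read(
    rows: List[Row], watermark: Optional[datetime]
) -> Tuple[List[Row], Optional[datetime]]:
    """Return rows newer than `watermark`, plus the advanced watermark."""
    fresh = [r for r in rows if watermark is None or r[2] > watermark]
    if fresh:
        watermark = max(r[2] for r in fresh)
    return fresh, watermark


warehouse: List[Row] = [
    (1, 0.2, datetime(2024, 1, 1)),
    (2, 0.5, datetime(2024, 1, 2)),
]

# First scheduled run: no watermark yet, so every row is ingested.
first_batch, mark = incremental_read(warehouse, None)

# A new row lands in the warehouse before the next run.
warehouse.append((1, 0.9, datetime(2024, 1, 3)))

# Second run: only the row past the stored watermark is ingested.
second_batch, mark = incremental_read(warehouse, mark)
```

The watermark only advances when new rows arrive, so a run that finds nothing leaves the state unchanged.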

Reverse ETL

When your model needs to use features that are canonically stored in a high-latency data source (like a data warehouse), Chalk's Reverse ETL support makes it simple to bring those features online and serve them quickly.

Add a single line of code to an offline resolver, and Chalk constructs a managed reverse ETL pipeline for that data source:

@offline(offline_to_online_etl="5m")

Now data from slow offline data sources is automatically available for low-latency online serving.


Deploy + query

Once you've defined your pipelines, you can rapidly deploy them to production with Chalk's CLI:

chalk apply

This creates a deployment of your project, which is served at a unique preview URL. You can promote this deployment to production, or perform QA workflows on your preview environment to make sure that your Chalk deployment performs as expected.

Once you promote your deployment to production, Chalk makes features available for low-latency online inference and offline training. Significantly, Chalk uses the exact same source code to serve temporally-consistent training sets to data scientists and live feature values to models. This re-use ensures that feature values from online and offline contexts match and dramatically cuts development time.

Online inference

Chalk's online store and feature computation engine make it easy to query features with ultra-low latency, so you can use your feature pipelines to serve online inference use cases.

Integrating Chalk with your production application takes minutes via Chalk's simple REST API:

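from chalk.client import ChalkClient

result = ChalkClient().query(
    input={
        User.full_name: "Katherine Johnson"
    },
    output=[User.fico_score],
    # Accept a cached value only if it is at most 10 minutes old.
    staleness={User.fico_score: "10m"},
)
result.get_feature_value(User.fico_score)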

Features computed to serve online requests are also replicated to Chalk's offline store for historical performance tracking and training set generation.

Offline training

Data scientists can use Chalk's Jupyter integration to create datasets and train models. Datasets are stored and tracked so that they can be re-used by other modelers, and so that model provenance is tracked for audit and reproducibility.

X = ChalkClient().offline_query(
    input=labels[[User.id, timestamp]],
    output=[
        User.returned_transactions_last_60,
        User.user_account_name_match_score,
        User.socure_score,
        User.identity.has_verified_phone,
        User.identity.is_voip_phone,
        User.identity.account_age_days,
        User.identity.email_age,
    ],
)

Chalk datasets are always "temporally consistent." This means that you can provide labels with different past timestamps and get historical features that represent what your application would have retrieved online at those past times. Temporal consistency ensures that your model training doesn't mix "future" and "past" data.
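The temporal-consistency rule can be illustrated with a small point-in-time lookup: for each label timestamp, take the latest feature value observed at or before that time and never a later one. This is a conceptual sketch in plain Python, not Chalk's offline store; `value_as_of` and the sample credit score history are hypothetical.

```python
from bisect import bisect_right
from datetime import datetime
from typing import List, Optional, Tuple


def value_as_of(
    history: List[Tuple[datetime, float]], ts: datetime
) -> Optional[float]:
    """Return the latest value observed at or before `ts`.

    `history` must be sorted by timestamp. Returns None when nothing had
    been observed yet, so later ("future") values never leak backwards.
    """
    times = [t for t, _ in history]
    i = bisect_right(times, ts)
    return history[i - 1][1] if i > 0 else None


# Observed values of one feature for one user, oldest first.
credit_score_history = [
    (datetime(2024, 1, 1), 640.0),
    (datetime(2024, 3, 1), 700.0),
    (datetime(2024, 6, 1), 720.0),
]

# Label timestamps from different points in the past.
label_times = [
    datetime(2023, 12, 1),
    datetime(2024, 2, 15),
    datetime(2024, 3, 1),
    datetime(2024, 7, 4),
]

training_values = [value_as_of(credit_score_history, ts) for ts in label_times]
```

The label from December 2023 gets no value because the feature had not been observed yet, and the February label sees 640.0 even though 700.0 was recorded two weeks later.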
