Python SDK for Chalk

Chalk

Chalk enables innovative machine learning teams to focus on building the unique products and models that make their business stand out. Behind the scenes, Chalk seamlessly handles data infrastructure with a best-in-class developer experience. Here's how it works:


Develop

Chalk makes it simple to develop feature pipelines for machine learning. Define Python functions using the libraries and tools you're familiar with instead of specialized DSLs. Chalk then orchestrates your functions into pipelines that execute in parallel on a Rust-based engine and coordinates the infrastructure required to compute features.

Features

To get started, define your features with Pydantic-inspired Python classes. You can define schemas, specify relationships, and add metadata to help your team share and re-use work.

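from datetime import date
from typing import Optional

from chalk.features import DataFrame, features, has_many

# Transaction is assumed to be another @features class defined elsewhere.
@features
class User:
    id: int
    full_name: str
    nickname: Optional[str]
    email: Optional[str]
    birthday: date
    credit_score: float
    datawarehouse_feature: float

    # One user has many transactions, joined on Transaction.user_id.
    transactions: DataFrame[Transaction] = has_many(lambda: Transaction.user_id == User.id)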

Resolvers

Next, tell Chalk how to compute your features. Chalk ingests data from your existing data stores, and lets you use Python to compute features with feature resolvers. Feature resolvers are declared with the decorators @online and @offline, and can depend on the outputs of other feature resolvers.

Resolvers make it easy to rapidly integrate a wide variety of data sources, join them together, and use them in your model.

SQL

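from chalk import online
from chalk.features import Features
from chalk.sql import PostgreSQLSource

pg = PostgreSQLSource()

@online
def get_user(uid: User.id) -> Features[User.full_name, User.email]:
    # Look up a single user row; :id is bound from args.
    return pg.query_string(
        "select email, full_name from users where id=:id",
        args=dict(id=uid)
    ).one()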

REST

import requests

@online
def get_socure_score(uid: User.id) -> Features[User.socure_score]:
    # The payload key must be the string "id", not the builtin id function.
    return (
        requests.get("https://api.socure.com", json={
            "id": uid
        }).json()['socure_score']
    )

Execute

Once you've defined your features and resolvers, Chalk orchestrates them into flexible pipelines that make training and executing models easy.
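As a rough illustration of what orchestrating resolvers into a pipeline involves, here is a minimal sketch in plain Python. It is not Chalk's engine: the `RESOLVERS` registry, the feature names, and the `compute` helper are all hypothetical, and the sketch runs resolvers sequentially in dependency order rather than in parallel.

```python
from graphlib import TopologicalSorter

# Hypothetical resolver registry: output feature -> (input features, function).
# These names are illustrative and are not part of Chalk's API.
RESOLVERS = {
    "user.full_name": (["user.id"], lambda uid: f"user-{uid}"),
    "user.name_length": (["user.full_name"], lambda name: len(name)),
    "user.is_long_name": (["user.name_length"], lambda n: n > 6),
}


def compute(outputs, known):
    """Run every registered resolver after its inputs, then pick the outputs."""
    values = dict(known)
    # Map each feature to the set of features it depends on.
    graph = {feat: set(deps) for feat, (deps, _) in RESOLVERS.items()}
    for feature in TopologicalSorter(graph).static_order():
        if feature in values:
            continue  # supplied by the caller or already computed
        deps, fn = RESOLVERS[feature]
        values[feature] = fn(*(values[d] for d in deps))
    return {name: values[name] for name in outputs}


result = compute(["user.is_long_name"], {"user.id": 42})
print(result)
```

For simplicity the sketch evaluates every registered resolver; a real planner would presumably prune the graph to only what the requested outputs need.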

Chalk has built-in support for feature engineering workflows, so there is no need to manage Airflow or orchestrate complicated streaming flows. You can execute resolver pipelines with declarative caching, ingest batch data on a schedule, and easily make slow sources available online for low-latency serving.

Caching

Many data sources (like vendor APIs) are too slow for online use cases, charge a high cost per call, or both. Chalk lets you optimize latency and cost by defining declarative caching policies that are integrated throughout the system. You no longer have to manage Redis, Memcached, or DynamoDB, or spend time tuning cache-warming pipelines.

Add a caching policy with one line of code in your feature definition:

@features
class ExternalBankAccount:
-   balance: int
+   balance: int = feature(max_staleness="1d")

Optionally warm feature caches by executing resolvers on a schedule:

@online(cron="1d")
def fn(id: User.id) -> User.credit_score:
    return redshift.query(...).all()

Or override staleness tolerances at query time when you need fresher data for your models:

chalk.query(
    ...,
    output=[User.fraud_score],
    staleness={User.fraud_score: "1m"},
)
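To make the idea of a staleness tolerance concrete, the following is a minimal sketch of a staleness-bounded cache in plain Python. It is only an illustration of the concept, not how Chalk implements caching: `StalenessCache`, `slow_balance`, and the injected clock are hypothetical names, and tolerances are given in seconds rather than duration strings like "1d".

```python
import time


class StalenessCache:
    """Cache that reuses a value only while it is younger than max_staleness."""

    def __init__(self, resolver, max_staleness, clock=time.monotonic):
        self.resolver = resolver            # slow function to call on a miss
        self.max_staleness = max_staleness  # default tolerance, in seconds
        self.clock = clock                  # injectable so tests control time
        self._store = {}                    # key -> (value, computed_at)

    def get(self, key, max_staleness=None):
        # A per-call tolerance overrides the default, like a query-time override.
        limit = self.max_staleness if max_staleness is None else max_staleness
        hit = self._store.get(key)
        if hit is not None and self.clock() - hit[1] <= limit:
            return hit[0]
        value = self.resolver(key)
        self._store[key] = (value, self.clock())
        return value


calls = []


def slow_balance(account_id):
    calls.append(account_id)  # record each trip to the slow source
    return 100 + len(calls)


now = [0.0]  # fake clock, in seconds
cache = StalenessCache(slow_balance, max_staleness=86_400, clock=lambda: now[0])

first = cache.get("acct-1")                    # miss: calls the resolver
now[0] = 3_600                                 # one hour later
second = cache.get("acct-1")                   # hit: still within one day
third = cache.get("acct-1", max_staleness=60)  # stricter tolerance: recomputes
```

The default tolerance plays the role of the policy on the feature definition, while the optional argument to `get` mirrors tightening the tolerance for a single query.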

Batch ETL ingestion

Chalk also makes it simple to generate training sets from data warehouse sources: join data from services like S3, Redshift, BigQuery, Snowflake (or other custom sources) with historical features computed online. Specify a cron schedule on an @offline resolver and Chalk automatically ingests data with support for incremental reads:

@offline(cron="1h")
def fn() -> Features[User.id, User.datawarehouse_feature]:
    return redshift.query(...).incremental()

Chalk makes this data available for point-in-time-correct dataset generation for data science use-cases. Every pipeline has built-in monitoring and alerting to ensure data quality and timeliness.
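One common way to implement incremental reads is a high-water mark: remember the newest timestamp already ingested and fetch only rows after it. The sketch below shows that general technique in plain Python. Whether Chalk's `.incremental()` works this way is an assumption, and `IncrementalReader` and the `WAREHOUSE` rows are hypothetical.

```python
from datetime import datetime

# Hypothetical warehouse rows: (user_id, feature_value, updated_at).
WAREHOUSE = [
    (1, 0.10, datetime(2024, 1, 1, 9)),
    (2, 0.20, datetime(2024, 1, 1, 10)),
    (1, 0.15, datetime(2024, 1, 1, 11)),
]


class IncrementalReader:
    """Return only rows newer than the newest row seen in previous runs."""

    def __init__(self):
        self.high_water_mark = datetime.min

    def read(self, rows):
        fresh = [r for r in rows if r[2] > self.high_water_mark]
        if fresh:
            # Advance the mark so the next run skips these rows.
            self.high_water_mark = max(r[2] for r in fresh)
        return fresh


reader = IncrementalReader()
first_run = reader.read(WAREHOUSE)   # all three rows
WAREHOUSE.append((3, 0.30, datetime(2024, 1, 1, 12)))
second_run = reader.read(WAREHOUSE)  # only the newly appended row
```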

Reverse ETL

When your model needs to use features that are canonically stored in a high-latency data source (like a data warehouse), Chalk's Reverse ETL support makes it simple to bring those features online and serve them quickly.

Add a single line of code to an offline resolver, and Chalk constructs a managed reverse ETL pipeline for that data source:

@offline(offline_to_online_etl="5m")

Now data from slow offline data sources is automatically available for low-latency online serving.
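Conceptually, a reverse ETL job copies the most recent offline value for each entity into a fast key-value store. Here is a minimal sketch of that step under simplifying assumptions: the `reverse_etl` function is hypothetical, the online store is a plain dict, and timestamps are integers.

```python
def reverse_etl(offline_rows, online_store):
    """Copy the newest value per entity from offline rows into an online store."""
    latest = {}
    for entity_id, value, ts in offline_rows:
        # Keep only the most recent observation for each entity.
        if entity_id not in latest or ts > latest[entity_id][1]:
            latest[entity_id] = (value, ts)
    for entity_id, (value, _) in latest.items():
        online_store[entity_id] = value
    return online_store


offline_rows = [(1, 0.10, 1), (2, 0.20, 2), (1, 0.15, 3)]  # (id, value, ts)
online = reverse_etl(offline_rows, {})
```

A managed pipeline would run a step like this on the configured interval, so online reads never have to touch the warehouse.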


Deploy + query

Once you've defined your pipelines, you can rapidly deploy them to production with Chalk's CLI:

chalk apply

This creates a deployment of your project, which is served at a unique preview URL. You can promote this deployment to production, or perform QA workflows on your preview environment to make sure that your Chalk deployment performs as expected.

Once you promote your deployment to production, Chalk makes features available for low-latency online inference and offline training. Significantly, Chalk uses the exact same source code to serve temporally-consistent training sets to data scientists and live feature values to models. This re-use ensures that feature values from online and offline contexts match and dramatically cuts development time.

Online inference

Chalk's online store and feature computation engine make it easy to query features with ultra-low latency, so you can use your feature pipelines to serve online inference use cases.

Integrating Chalk with your production application takes minutes via Chalk's simple REST API:

result = ChalkClient().query(
    input={
        User.full_name: "Katherine Johnson"
    },
    output=[User.fico_score],
    staleness={User.fico_score: "10m"},
)
result.get_feature_value(User.fico_score)

Features computed to serve online requests are also replicated to Chalk's offline store for historical performance tracking and training set generation.

Offline training

Data scientists can use Chalk's Jupyter integration to create datasets and train models. Datasets are stored and tracked so that they can be re-used by other modelers, and so that model provenance is tracked for audit and reproducibility.

X = ChalkClient().offline_query(
    input=labels[[User.id, timestamp]],
    output=[
        User.returned_transactions_last_60,
        User.user_account_name_match_score,
        User.socure_score,
        User.identity.has_verified_phone,
        User.identity.is_voip_phone,
        User.identity.account_age_days,
        User.identity.email_age,
    ],
)

Chalk datasets are always "temporally consistent." This means that you can provide labels with different past timestamps and get historical features that represent what your application would have retrieved online at those past times. Temporal consistency ensures that your model training doesn't mix "future" and "past" data.
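Temporal consistency amounts to an "as of" lookup: for each label timestamp, take the latest feature value observed at or before that time and never a later one. The following is a minimal sketch of that lookup in plain Python. `HISTORY`, `value_as_of`, and the integer timestamps are hypothetical and say nothing about how Chalk stores or queries its offline data.

```python
from bisect import bisect_right

# Hypothetical history of one feature for one user: (observed_at, value),
# sorted by observed_at. Timestamps are plain integers for brevity.
HISTORY = [(10, 640), (20, 655), (30, 700)]


def value_as_of(history, ts):
    """Return the latest value observed at or before ts, or None if none."""
    times = [t for t, _ in history]
    idx = bisect_right(times, ts)  # number of observations with time <= ts
    return history[idx - 1][1] if idx else None


labels = [(5, 0), (20, 1), (25, 0), (99, 1)]  # (label_timestamp, label)
training_rows = [(value_as_of(HISTORY, ts), y) for ts, y in labels]
```

The label at time 25 is paired with the value observed at time 20, not the later value from time 30, which is what keeps "future" data out of training rows.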

Download files

Source Distribution

chalkpy-2.113.3.tar.gz (1.3 MB)

Built Distributions

chalkpy-2.113.3-cp313-cp313-win_amd64.whl (3.3 MB): CPython 3.13, Windows x86-64
chalkpy-2.113.3-cp313-cp313-manylinux_2_28_x86_64.whl (3.6 MB): CPython 3.13, manylinux: glibc 2.28+ x86-64
chalkpy-2.113.3-cp313-cp313-manylinux_2_28_aarch64.whl (3.5 MB): CPython 3.13, manylinux: glibc 2.28+ ARM64
chalkpy-2.113.3-cp313-cp313-macosx_11_0_arm64.whl (3.4 MB): CPython 3.13, macOS 11.0+ ARM64
chalkpy-2.113.3-cp313-cp313-macosx_10_13_x86_64.whl (3.4 MB): CPython 3.13, macOS 10.13+ x86-64
chalkpy-2.113.3-cp312-cp312-win_amd64.whl (3.3 MB): CPython 3.12, Windows x86-64
chalkpy-2.113.3-cp312-cp312-manylinux_2_28_x86_64.whl (3.6 MB): CPython 3.12, manylinux: glibc 2.28+ x86-64
chalkpy-2.113.3-cp312-cp312-manylinux_2_28_aarch64.whl (3.5 MB): CPython 3.12, manylinux: glibc 2.28+ ARM64
chalkpy-2.113.3-cp312-cp312-macosx_11_0_arm64.whl (3.4 MB): CPython 3.12, macOS 11.0+ ARM64
chalkpy-2.113.3-cp312-cp312-macosx_10_13_x86_64.whl (3.4 MB): CPython 3.12, macOS 10.13+ x86-64
chalkpy-2.113.3-cp311-cp311-win_amd64.whl (3.3 MB): CPython 3.11, Windows x86-64
chalkpy-2.113.3-cp311-cp311-manylinux_2_28_x86_64.whl (3.6 MB): CPython 3.11, manylinux: glibc 2.28+ x86-64
chalkpy-2.113.3-cp311-cp311-manylinux_2_28_aarch64.whl (3.5 MB): CPython 3.11, manylinux: glibc 2.28+ ARM64
chalkpy-2.113.3-cp311-cp311-macosx_11_0_arm64.whl (3.4 MB): CPython 3.11, macOS 11.0+ ARM64
chalkpy-2.113.3-cp311-cp311-macosx_10_13_x86_64.whl (3.4 MB): CPython 3.11, macOS 10.13+ x86-64
chalkpy-2.113.3-cp310-cp310-win_amd64.whl (3.3 MB): CPython 3.10, Windows x86-64
chalkpy-2.113.3-cp310-cp310-manylinux_2_28_x86_64.whl (3.6 MB): CPython 3.10, manylinux: glibc 2.28+ x86-64
chalkpy-2.113.3-cp310-cp310-manylinux_2_28_aarch64.whl (3.5 MB): CPython 3.10, manylinux: glibc 2.28+ ARM64
chalkpy-2.113.3-cp310-cp310-macosx_11_0_arm64.whl (3.4 MB): CPython 3.10, macOS 11.0+ ARM64
chalkpy-2.113.3-cp310-cp310-macosx_10_13_x86_64.whl (3.4 MB): CPython 3.10, macOS 10.13+ x86-64
