Python SDK for Chalk

Chalk

Chalk enables innovative machine learning teams to focus on building the unique products and models that make their business stand out. Behind the scenes, Chalk seamlessly handles data infrastructure with a best-in-class developer experience. Here’s how it works:


Develop

Chalk makes it simple to develop feature pipelines for machine learning. Define Python functions using the libraries and tools you're familiar with instead of specialized DSLs. Chalk then orchestrates your functions into pipelines that execute in parallel on a Rust-based engine and coordinates the infrastructure required to compute features.

Features

To get started, define your features with Pydantic-inspired Python classes. You can define schemas, specify relationships, and add metadata to help your team share and re-use work.

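from datetime import date
from typing import Optional

from chalk.features import DataFrame, features, has_many

@features
class User:
    id: int
    full_name: str
    nickname: Optional[str]
    email: Optional[str]
    birthday: date
    credit_score: float
    datawarehouse_feature: float

    # Transaction is another @features class, defined elsewhere.
    transactions: DataFrame[Transaction] = has_many(lambda: Transaction.user_id == User.id)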

Resolvers

Next, tell Chalk how to compute your features. Chalk ingests data from your existing data stores, and lets you use Python to compute features with feature resolvers. Feature resolvers are declared with the decorators @online and @offline, and can depend on the outputs of other feature resolvers.

Resolvers make it easy to rapidly integrate a wide variety of data sources, join them together, and use them in your model.

SQL

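from chalk import online
from chalk.features import Features
from chalk.sql import PostgreSQLSource

pg = PostgreSQLSource()

@online
def get_user(uid: User.id) -> Features[User.full_name, User.email]:
    # Bind the user id as a named query parameter.
    return pg.query_string(
        "select email, full_name from users where id=:id",
        args=dict(id=uid)
    ).one()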

REST

import requests

@online
def get_socure_score(uid: User.id) -> Features[User.socure_score]:
    return (
        requests.get("https://api.socure.com", json={
            "id": uid
        }).json()['socure_score']
    )
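
To make the idea of resolvers depending on the outputs of other resolvers more concrete, here is a minimal, library-free sketch of how a set of resolvers could be ordered and run from their declared inputs and outputs. It is not Chalk's engine and does not use the Chalk SDK: the `RESOLVERS` table, the dotted feature names, and the `plan` and `execute` helpers are all hypothetical, and the ordering comes from the standard-library `graphlib` module.

```python
from graphlib import TopologicalSorter

# Hypothetical resolvers: name -> (input features, output features, function).
RESOLVERS = {
    "get_user": (
        ["user.id"],
        ["user.full_name", "user.email"],
        lambda v: {"user.full_name": "Ada Lovelace", "user.email": "ada@example.com"},
    ),
    "email_domain": (
        ["user.email"],
        ["user.email_domain"],
        lambda v: {"user.email_domain": v["user.email"].split("@")[1]},
    ),
}


def plan(resolvers):
    """Order resolvers so each one runs after the resolvers that produce its inputs."""
    producer = {out: name for name, (_, outs, _) in resolvers.items() for out in outs}
    graph = {
        name: {producer[i] for i in ins if i in producer}
        for name, (ins, _, _) in resolvers.items()
    }
    return list(TopologicalSorter(graph).static_order())


def execute(resolvers, known):
    """Run resolvers in dependency order, accumulating feature values."""
    values = dict(known)
    for name in plan(resolvers):
        _, _, fn = resolvers[name]
        values.update(fn(values))
    return values


result = execute(RESOLVERS, {"user.id": 1})
```

Because the order is derived from the declared inputs and outputs, adding a new resolver only requires stating what it consumes and produces.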

Execute

Once you've defined your features and resolvers, Chalk orchestrates them into flexible pipelines that make training and executing models easy.

Chalk has built-in support for feature engineering workflows -- no need to manage Airflow or orchestrate complicated streaming flows. You can execute resolver pipelines with declarative caching, ingest batch data on a schedule, and easily make slow sources available online for low-latency serving.

Caching

Many data sources (like vendor APIs) are too slow for online use cases and/or charge a high dollar cost-per-call. Chalk lets you optimize latency and cost by defining declarative caching policies which are well-integrated throughout the system. You no longer have to manage Redis, Memcached, DynamoDB, or spend time tuning cache-warming pipelines.

Add a caching policy with one line of code in your feature definition:

@features
class ExternalBankAccount:
-   balance: int
+   balance: int = feature(max_staleness="1d")

Optionally warm feature caches by executing resolvers on a schedule:

@online(cron="1d")
def fn(id: User.id) -> User.credit_score:
  return redshift.query(...).all()

Or override staleness tolerances at query time when you need fresher data for your models:

ChalkClient().query(
    ...,
    output=[User.fraud_score],
    staleness={User.fraud_score: "1m"}
)
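
As a rough mental model for a staleness policy with a per-query override, the sketch below caches resolver results and recomputes them only when the cached value is older than the allowed staleness. This is an illustration in plain Python, not how Chalk implements caching: `StalenessCache`, `to_seconds`, and `slow_balance` are hypothetical names, and the duration parser only assumes simple strings such as "1m", "1h", and "1d".

```python
import time

# Hypothetical helper: convert durations such as "1m", "1h", "1d" to seconds.
UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def to_seconds(duration):
    return int(duration[:-1]) * UNITS[duration[-1]]


class StalenessCache:
    """Serve a cached value unless it is older than the allowed staleness."""

    def __init__(self, resolver, max_staleness, clock=time.time):
        self.resolver = resolver
        self.default = to_seconds(max_staleness)
        self.clock = clock
        self.store = {}  # key -> (value, computed_at)

    def get(self, key, max_staleness=None):
        # A per-call max_staleness overrides the default policy.
        limit = self.default if max_staleness is None else to_seconds(max_staleness)
        now = self.clock()
        hit = self.store.get(key)
        if hit is not None and now - hit[1] <= limit:
            return hit[0]
        value = self.resolver(key)
        self.store[key] = (value, now)
        return value


calls = []


def slow_balance(account_id):
    # Stand-in for a slow or expensive vendor call.
    calls.append(account_id)
    return 100 + len(calls)


now = [0.0]  # fake clock so the example is deterministic
cache = StalenessCache(slow_balance, "1d", clock=lambda: now[0])
first = cache.get("acct-1")                       # miss: resolver runs
now[0] = 3600.0                                   # one hour later
second = cache.get("acct-1")                      # within 1d: served from cache
third = cache.get("acct-1", max_staleness="1m")   # stricter override: recomputed
```

The injected `clock` is only there to keep the example deterministic; a real cache would also need eviction and concurrency handling.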

Batch ETL ingestion

Chalk also makes it simple to generate training sets from data warehouse sources -- join data from services like S3, Redshift, BQ, Snowflake (or other custom sources) with historical features computed online. Specify a cron schedule on an @offline resolver and Chalk automatically ingests data with support for incremental reads:

@offline(cron="1h")
def fn() -> Features[User.id, User.datawarehouse_feature]:
  return redshift.query(...).incremental()

Chalk makes this data available for point-in-time-correct dataset generation for data science use-cases. Every pipeline has built-in monitoring and alerting to ensure data quality and timeliness.
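
One common way to implement incremental reads is a high-water mark: each run remembers the newest timestamp it has seen and the next run only returns rows past it. The sketch below shows that idea in plain Python under stated assumptions; it is not Chalk's ingestion code, and `IncrementalReader`, the `updated_at` column, and the in-memory `table` are hypothetical.

```python
from datetime import datetime


class IncrementalReader:
    """Return only the rows that are newer than the last row seen on a previous run."""

    def __init__(self, timestamp_column="updated_at"):
        self.timestamp_column = timestamp_column
        self.cursor = None  # high-water mark; None means nothing has been read yet

    def read(self, rows):
        col = self.timestamp_column
        fresh = [r for r in rows if self.cursor is None or r[col] > self.cursor]
        if fresh:
            self.cursor = max(r[col] for r in fresh)
        return fresh


# Hypothetical warehouse table, represented as a list of dicts.
table = [
    {"user_id": 1, "datawarehouse_feature": 0.2, "updated_at": datetime(2024, 1, 1, 9)},
    {"user_id": 2, "datawarehouse_feature": 0.7, "updated_at": datetime(2024, 1, 1, 10)},
]

reader = IncrementalReader()
first_run = reader.read(table)   # both rows are new
table.append(
    {"user_id": 3, "datawarehouse_feature": 0.5, "updated_at": datetime(2024, 1, 1, 11)}
)
second_run = reader.read(table)  # only the row added since the first run
```

A strict `>` comparison assumes timestamps are unique per row; with ties, a real pipeline would need a tiebreaker or a small overlap window.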

Reverse ETL

When your model needs to use features that are canonically stored in a high-latency data source (like a data warehouse), Chalk's Reverse ETL support makes it simple to bring those features online and serve them quickly.

Add a single line of code to an offline resolver, and Chalk constructs a managed reverse ETL pipeline for that data source:

@offline(offline_to_online_etl="5m")

Now data from slow offline data sources is automatically available for low-latency online serving.
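
Conceptually, a reverse ETL step copies the most recent value for each entity from a historical store into a key-value store that can be read quickly. The following sketch shows that shape with plain dictionaries; it makes no claim about how Chalk's managed pipeline works, and `latest_per_key`, `sync_to_online`, and the row layout are hypothetical.

```python
def latest_per_key(offline_rows):
    """Pick the most recent row for each user from a list of historical rows."""
    latest = {}
    for row in offline_rows:
        key = row["user_id"]
        if key not in latest or row["ts"] > latest[key]["ts"]:
            latest[key] = row
    return latest


def sync_to_online(offline_rows, online_store):
    """Write the latest value per user into a dict standing in for an online store."""
    for key, row in latest_per_key(offline_rows).items():
        online_store[key] = row["value"]
    return online_store


# Hypothetical offline history: several observations per user.
offline_rows = [
    {"user_id": 1, "ts": 1, "value": 0.10},
    {"user_id": 1, "ts": 3, "value": 0.30},
    {"user_id": 2, "ts": 2, "value": 0.90},
    {"user_id": 1, "ts": 2, "value": 0.20},
]

online_store = sync_to_online(offline_rows, {})
```

Running this on a schedule (for example every five minutes) keeps the online copy close to the warehouse without querying the warehouse at request time.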


Deploy + query

Once you've defined your pipelines, you can rapidly deploy them to production with Chalk's CLI:

chalk apply

This creates a deployment of your project, which is served at a unique preview URL. You can promote this deployment to production, or perform QA workflows on your preview environment to make sure that your Chalk deployment performs as expected.

Once you promote your deployment to production, Chalk makes features available for low-latency online inference and offline training. Significantly, Chalk uses the exact same source code to serve temporally-consistent training sets to data scientists and live feature values to models. This re-use ensures that feature values from online and offline contexts match and dramatically cuts development time.

Online inference

Chalk's online store and feature computation engine make it easy to query features with ultra-low latency, so you can use your feature pipelines to serve online inference use-cases.

Integrating Chalk with your production application takes minutes via Chalk's simple REST API:

result = ChalkClient().query(
    input={
        User.full_name: "Katherine Johnson"
    },
    output=[User.fico_score],
    staleness={User.fico_score: "10m"},
)
result.get_feature_value(User.fico_score)

Features computed to serve online requests are also replicated to Chalk's offline store for historical performance tracking and training set generation.

Offline training

Data scientists can use Chalk's Jupyter integration to create datasets and train models. Datasets are stored and tracked so that they can be re-used by other modelers, and so that model provenance is tracked for audit and reproducibility.

X = ChalkClient().offline_query(
    input=labels[[User.uid, timestamp]],
    output=[
        User.returned_transactions_last_60,
        User.user_account_name_match_score,
        User.socure_score,
        User.identity.has_verified_phone,
        User.identity.is_voip_phone,
        User.identity.account_age_days,
        User.identity.email_age,
    ],
)

Chalk datasets are always "temporally consistent." This means that you can provide labels with different past timestamps and get historical features that represent what your application would have retrieved online at those past times. Temporal consistency ensures that your model training doesn't mix "future" and "past" data.
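
Temporal consistency can be pictured as an "as of" lookup: for each label timestamp, take the latest feature value observed at or before that time and never a later one. The sketch below illustrates this with the standard-library `bisect` module. It is a simplified model rather than Chalk's dataset generation, and `as_of`, `score_history`, and the integer timestamps are hypothetical.

```python
from bisect import bisect_right


def as_of(history, ts):
    """Return the latest value observed at or before ts.

    history is a list of (observed_at, value) pairs sorted by observed_at.
    Returns None when nothing had been observed yet at ts.
    """
    times = [t for t, _ in history]
    i = bisect_right(times, ts)
    return history[i - 1][1] if i else None


# Hypothetical feature history per user id: (observed_at, credit_score).
score_history = {1: [(10, 600), (20, 650), (30, 700)]}

# Labels carry their own timestamps, which may differ per row.
labels = [(1, 15), (1, 30), (1, 5)]

training_rows = [(uid, ts, as_of(score_history[uid], ts)) for uid, ts in labels]
```

The label at time 15 sees the value recorded at time 10, not the later updates, which is what keeps "future" data out of the training set.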
