Python SDK for Chalk

Chalk

Chalk enables innovative machine learning teams to focus on building the unique products and models that make their business stand out. Behind the scenes, Chalk handles data infrastructure with a best-in-class developer experience. Here's how it works:


Develop

Chalk makes it simple to develop feature pipelines for machine learning. Define Python functions using the libraries and tools you're familiar with instead of specialized DSLs. Chalk then orchestrates your functions into pipelines that execute in parallel on a Rust-based engine and coordinates the infrastructure required to compute features.

Features

To get started, define your features with Pydantic-inspired Python classes. You can define schemas, specify relationships, and add metadata to help your team share and re-use work.

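from datetime import date
from typing import Optional

from chalk.features import DataFrame, features, has_many

@features
class User:
    id: int
    full_name: str
    nickname: Optional[str]
    email: Optional[str]
    birthday: date
    credit_score: float
    datawarehouse_feature: float

    # One-to-many join: every Transaction whose user_id matches this user's id
    transactions: DataFrame[Transaction] = has_many(lambda: Transaction.user_id == User.id)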

Resolvers

Next, tell Chalk how to compute your features. Chalk ingests data from your existing data stores, and lets you use Python to compute features with feature resolvers. Feature resolvers are declared with the decorators @online and @offline, and can depend on the outputs of other feature resolvers.

Resolvers make it easy to rapidly integrate a wide variety of data sources, join them together, and use them in your model.
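
To make the idea of resolvers depending on each other's outputs concrete, here is a minimal standard-library sketch of how declared inputs and outputs can be turned into an execution order. It is an illustration only, not Chalk's engine: the `RESOLVERS` registry, the `plan` helper, and the resolver names `get_email_domain` and `get_domain_risk` are all hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical registry: resolver name -> (input features, output features).
RESOLVERS = {
    "get_user": ({"User.id"}, {"User.full_name", "User.email"}),
    "get_email_domain": ({"User.email"}, {"User.email_domain"}),
    "get_domain_risk": ({"User.email_domain"}, {"User.domain_risk"}),
}


def plan(resolvers):
    """Return resolver names ordered so that every input is computed first."""
    # Map each feature to the resolver that produces it.
    producer = {out: name for name, (_, outs) in resolvers.items() for out in outs}
    # A resolver depends on whichever resolvers produce its inputs.
    # Inputs with no producer (such as User.id) are supplied by the caller.
    graph = {
        name: {producer[f] for f in ins if f in producer}
        for name, (ins, _) in resolvers.items()
    }
    return list(TopologicalSorter(graph).static_order())


order = plan(RESOLVERS)
print(order)
```

Resolvers that do not depend on one another have no ordering constraint in such a graph, which is what would allow an engine to run them in parallel.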

SQL

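from chalk import online
from chalk.features import Features
from chalk.sql import PostgreSQLSource

pg = PostgreSQLSource()

@online
def get_user(uid: User.id) -> Features[User.full_name, User.email]:
    # :id is a bound parameter filled in from args
    return pg.query_string(
        "select email, full_name from users where id=:id",
        args=dict(id=uid)
    ).one()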

REST

import requests

@online
def get_socure_score(uid: User.id) -> Features[User.socure_score]:
    return (
        # The key must be the string "id"; a bare id would be Python's builtin function
        requests.get("https://api.socure.com", json={
            "id": uid
        }).json()['socure_score']
    )

Execute

Once you've defined your features and resolvers, Chalk orchestrates them into flexible pipelines that make training and executing models easy.

Chalk has built-in support for feature engineering workflows -- no need to manage Airflow or orchestrate complicated streaming flows. You can execute resolver pipelines with declarative caching, ingest batch data on a schedule, and easily make slow sources available online for low-latency serving.

Caching

Many data sources (like vendor APIs) are too slow for online use cases or charge a high cost per call. Chalk lets you optimize latency and cost by defining declarative caching policies that are integrated throughout the system. You no longer have to manage Redis, Memcached, or DynamoDB, or spend time tuning cache-warming pipelines.

Add a caching policy with one line of code in your feature definition:

@features
class ExternalBankAccount:
-   balance: int
+   balance: int = feature(max_staleness="1d")

Optionally warm feature caches by executing resolvers on a schedule:

@online(cron="1d")
def fn(id: User.id) -> User.credit_score:
  return redshift.query(...).all()

Or override staleness tolerances at query time when you need fresher data for your models:

chalk.query(
    ...,
    output=[User.fraud_score],
    staleness={User.fraud_score: "1m"}
)
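
The staleness rules above can be pictured with a small standard-library sketch: a cached value is served only while it is younger than the allowed staleness, and a query may pass a tighter limit. This is a toy model under stated assumptions, not Chalk's cache: `parse_duration`, `StalenessCache`, and the accepted duration format (an integer followed by `s`, `m`, `h`, or `d`) are hypothetical.

```python
import re
from datetime import datetime, timedelta

_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_duration(text):
    """Parse strings such as "1d" or "10m" into a timedelta (assumed format)."""
    match = re.fullmatch(r"(\d+)([smhd])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return timedelta(**{_UNITS[match.group(2)]: int(match.group(1))})


class StalenessCache:
    """Toy cache: a value is served only while it is younger than the limit."""

    def __init__(self, default_staleness):
        self.default = parse_duration(default_staleness)
        self._store = {}  # key -> (value, computed_at)

    def put(self, key, value, now):
        self._store[key] = (value, now)

    def get(self, key, now, max_staleness=None):
        """Return the cached value, or None if it is missing or too old."""
        limit = parse_duration(max_staleness) if max_staleness else self.default
        entry = self._store.get(key)
        if entry is None or now - entry[1] > limit:
            return None  # caller would recompute via the resolver
        return entry[0]


t0 = datetime(2024, 1, 1, 12, 0)
cache = StalenessCache(default_staleness="1d")
cache.put("balance:42", 1000, now=t0)

# Two hours later the value is still within the default one-day tolerance...
fresh = cache.get("balance:42", now=t0 + timedelta(hours=2))
# ...but a query that tolerates only one minute of staleness gets a miss.
strict = cache.get("balance:42", now=t0 + timedelta(hours=2), max_staleness="1m")
```

Here `fresh` holds the cached balance while `strict` is `None`, which mirrors the way a per-query override forces recomputation for that request only.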

Batch ETL ingestion

Chalk also makes it simple to generate training sets from data warehouse sources -- join data from services like S3, Redshift, BQ, Snowflake (or other custom sources) with historical features computed online. Specify a cron schedule on an @offline resolver and Chalk automatically ingests data with support for incremental reads:

@offline(cron="1h")
def fn() -> Features[User.id, User.datawarehouse_feature]:
  return redshift.query(...).incremental()

Chalk makes this data available for point-in-time-correct dataset generation for data science use-cases. Every pipeline has built-in monitoring and alerting to ensure data quality and timeliness.
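
As a rough illustration of what an incremental read means, the sketch below keeps a high-water mark and returns only rows updated since the previous run. It is a simplified model, not a description of how Chalk's `.incremental()` is implemented: the `IncrementalReader` class, the `updated_at` column, and the sample rows are hypothetical.

```python
from datetime import datetime


class IncrementalReader:
    """Return only rows newer than the newest row seen on earlier runs."""

    def __init__(self):
        self.high_water_mark = datetime.min

    def read(self, rows):
        new_rows = [r for r in rows if r["updated_at"] > self.high_water_mark]
        if new_rows:
            # Advance the mark so the next run skips everything returned here.
            self.high_water_mark = max(r["updated_at"] for r in new_rows)
        return new_rows


# Hypothetical warehouse table contents.
table = [
    {"user_id": 1, "value": 0.2, "updated_at": datetime(2024, 1, 1, 9)},
    {"user_id": 2, "value": 0.7, "updated_at": datetime(2024, 1, 1, 10)},
]

reader = IncrementalReader()
first_run = reader.read(table)   # both rows are new

table.append({"user_id": 3, "value": 0.5, "updated_at": datetime(2024, 1, 1, 11)})
second_run = reader.read(table)  # only the row added since the first run
```

A scheduled job built this way processes each row once, so hourly runs stay cheap even as the underlying table grows.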

Reverse ETL

When your model needs to use features that are canonically stored in a high-latency data source (like a data warehouse), Chalk's Reverse ETL support makes it simple to bring those features online and serve them quickly.

Add a single line of code to an offline resolver, and Chalk constructs a managed reverse ETL pipeline for that data source:

@offline(offline_to_online_etl="5m")

Now data from slow offline data sources is automatically available for low-latency online serving.


Deploy + query

Once you've defined your pipelines, you can rapidly deploy them to production with Chalk's CLI:

chalk apply

This creates a deployment of your project, which is served at a unique preview URL. You can promote this deployment to production, or perform QA workflows on your preview environment to make sure that your Chalk deployment performs as expected.

Once you promote your deployment to production, Chalk makes features available for low-latency online inference and offline training. Significantly, Chalk uses the exact same source code to serve temporally-consistent training sets to data scientists and live feature values to models. This re-use ensures that feature values from online and offline contexts match and dramatically cuts development time.

Online inference

Chalk's online store and feature computation engine make it easy to query features with ultra-low latency, so you can use your feature pipelines to serve online inference use cases.

Integrating Chalk with your production application takes minutes via Chalk's simple REST API:

result = ChalkClient().query(
    input={
        User.name: "Katherine Johnson"
    },
    output=[User.fico_score],
    staleness={User.fico_score: "10m"},
)
result.get_feature_value(User.fico_score)

Features computed to serve online requests are also replicated to Chalk's offline store for historical performance tracking and training set generation.

Offline training

Data scientists can use Chalk's Jupyter integration to create datasets and train models. Datasets are stored and tracked so that they can be re-used by other modelers, and so that model provenance is tracked for audit and reproducibility.

X = ChalkClient().offline_query(
    input=labels[[User.uid, timestamp]],
    output=[
        User.returned_transactions_last_60,
        User.user_account_name_match_score,
        User.socure_score,
        User.identity.has_verified_phone,
        User.identity.is_voip_phone,
        User.identity.account_age_days,
        User.identity.email_age,
    ],
)

Chalk datasets are always "temporally consistent." This means that you can provide labels with different past timestamps and get historical features that represent what your application would have retrieved online at those past times. Temporal consistency ensures that your model training doesn't mix "future" and "past" data.
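
Temporal consistency can be sketched as an "as of" lookup: for each label timestamp, take the most recent feature value observed at or before that time and never a later one. The example below is a minimal standard-library illustration, not Chalk's offline store: the `HISTORY` log, the `value_as_of` helper, and the sample values are hypothetical.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical observation log for one feature of one user,
# as (observed_at, value) pairs sorted by time.
HISTORY = [
    (datetime(2024, 1, 1), 640),
    (datetime(2024, 2, 1), 655),
    (datetime(2024, 3, 1), 700),
]


def value_as_of(history, ts):
    """Return the latest value observed at or before ts, or None if there is none."""
    times = [observed_at for observed_at, _ in history]
    idx = bisect_right(times, ts)  # number of observations with time <= ts
    return history[idx - 1][1] if idx else None


labels = [datetime(2023, 12, 15), datetime(2024, 2, 10), datetime(2024, 3, 1)]
training_rows = [(ts, value_as_of(HISTORY, ts)) for ts in labels]
```

The label dated before any observation gets no value, and the label dated 2024-02-10 gets 655 rather than the later 700, so no "future" data leaks into the training row.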
