Python SDK for Chalk

Chalk

Chalk enables innovative machine learning teams to focus on building the unique products and models that make their business stand out. Behind the scenes, Chalk seamlessly handles data infrastructure with a best-in-class developer experience. Here’s how it works:


Develop

Chalk makes it simple to develop feature pipelines for machine learning. Define Python functions using the libraries and tools you're familiar with instead of specialized DSLs. Chalk then orchestrates your functions into pipelines that execute in parallel on a Rust-based engine and coordinates the infrastructure required to compute features.

Features

To get started, define your features with Pydantic-inspired Python classes. You can define schemas, specify relationships, and add metadata to help your team share and re-use work.

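from datetime import date
from typing import Optional

from chalk.features import DataFrame, features, has_many

@features
class User:
    id: int
    full_name: str
    nickname: Optional[str]
    email: Optional[str]
    birthday: date
    credit_score: float
    datawarehouse_feature: float

    # Transaction is another @features class, defined elsewhere.
    transactions: DataFrame[Transaction] = has_many(lambda: Transaction.user_id == User.id)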
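
To make the has_many relationship concrete, here is a plain-Python sketch of which rows the join condition Transaction.user_id == User.id selects. It is an illustration only, not how Chalk evaluates joins; the Transaction fields and the transactions_for helper are hypothetical and do not appear in the original.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Transaction:
    # Hypothetical fields; the original does not define Transaction.
    id: int
    user_id: int
    amount: float


def transactions_for(user_id: int, transactions: List[Transaction]) -> List[Transaction]:
    """Return the rows a join on Transaction.user_id == User.id would select."""
    return [t for t in transactions if t.user_id == user_id]


all_transactions = [
    Transaction(id=1, user_id=10, amount=25.0),
    Transaction(id=2, user_id=11, amount=40.0),
    Transaction(id=3, user_id=10, amount=5.5),
]

user_10 = transactions_for(10, all_transactions)
print([t.id for t in user_10])  # [1, 3]
```

Each user sees only the transactions whose user_id matches its own id, which is the set exposed through User.transactions.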

Resolvers

Next, tell Chalk how to compute your features. Chalk ingests data from your existing data stores, and lets you use Python to compute features with feature resolvers. Feature resolvers are declared with the decorators @online and @offline, and can depend on the outputs of other feature resolvers.

Resolvers make it easy to rapidly integrate a wide variety of data sources, join them together, and use them in your model.

SQL

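from chalk import online
from chalk.features import Features
from chalk.sql import PostgreSQLSource

pg = PostgreSQLSource()

@online
def get_user(uid: User.id) -> Features[User.full_name, User.email]:
    return pg.query_string(
        "select email, full_name from users where id=:id",
        args=dict(id=uid)
    ).one()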

REST

import requests

@online
def get_socure_score(uid: User.id) -> Features[User.socure_score]:
    return (
        requests.get("https://api.socure.com", json={
            "id": uid  # the key is the string "id", not the builtin id function
        }).json()['socure_score']
    )
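
The idea that resolvers can depend on the outputs of other resolvers can be sketched in plain Python. The registry, the resolver decorator, the resolve function, and the feature names below are all hypothetical, and this is a conceptual illustration rather than Chalk's engine: to compute a feature, look up the function that produces it and recursively compute that function's inputs first.

```python
from typing import Callable, Dict, Tuple

# Maps an output feature name to (input feature names, function).
RESOLVERS: Dict[str, Tuple[Tuple[str, ...], Callable[..., object]]] = {}


def resolver(output: str, inputs: Tuple[str, ...]):
    """Register a function as the way to compute `output` from `inputs`."""
    def register(fn):
        RESOLVERS[output] = (inputs, fn)
        return fn
    return register


@resolver(output="user.email", inputs=("user.id",))
def get_email(uid):
    return f"user{uid}@example.com"


@resolver(output="user.email_domain", inputs=("user.email",))
def get_email_domain(email):
    return email.split("@")[1]


def resolve(feature: str, known: Dict[str, object]) -> object:
    """Compute `feature`, recursively resolving any inputs not yet known."""
    if feature not in known:
        if feature not in RESOLVERS:
            raise KeyError(f"no resolver or input value for {feature!r}")
        inputs, fn = RESOLVERS[feature]
        known[feature] = fn(*(resolve(name, known) for name in inputs))
    return known[feature]


print(resolve("user.email_domain", {"user.id": 7}))  # example.com
```

Only user.id is supplied by the caller; user.email is computed on the way to user.email_domain because the second function declares it as an input.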

Execute

Once you've defined your features and resolvers, Chalk orchestrates them into flexible pipelines that make training and executing models easy.

Chalk has built-in support for feature engineering workflows, so there is no need to manage Airflow or orchestrate complicated streaming flows. You can execute resolver pipelines with declarative caching, ingest batch data on a schedule, and easily make slow sources available online for low-latency serving.

Caching

Many data sources (like vendor APIs) are too slow for online use cases and/or charge a high dollar cost-per-call. Chalk lets you optimize latency and cost by defining declarative caching policies which are well-integrated throughout the system. You no longer have to manage Redis, Memcached, DynamoDB, or spend time tuning cache-warming pipelines.

Add a caching policy with one line of code in your feature definition:

@features
class ExternalBankAccount:
-   balance: int
+   balance: int = feature(max_staleness="1d")

Optionally warm feature caches by executing resolvers on a schedule:

@online(cron="1d")
def fn(id: User.id) -> User.credit_score:
  return redshift.query(...).all()

Or override staleness tolerances at query time when you need fresher data for your models:

chalk.query(
    ...,
    output=[User.fraud_score],
    staleness={User.fraud_score: "1m"}
)
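
The caching behavior described above (a default staleness tolerance per feature, with an optional tighter tolerance per query) can be sketched in plain Python. This is a conceptual illustration, not Chalk's cache: the StalenessCache class, the parse_duration helper, and the reading of the suffixes s, m, h, and d as seconds, minutes, hours, and days are all assumptions.

```python
import time
from typing import Callable, Dict, Optional, Tuple

# Assumed meaning of duration suffixes.
_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_duration(text: str) -> int:
    """Convert a duration such as "10m" or "1d" to seconds."""
    return int(text[:-1]) * _UNITS[text[-1]]


class StalenessCache:
    """Serve a cached value unless it is older than the allowed staleness."""

    def __init__(self, compute: Callable[[str], object],
                 default_max_staleness: str,
                 clock: Callable[[], float] = time.time):
        self.compute = compute
        self.default = default_max_staleness
        self.clock = clock
        self._store: Dict[str, Tuple[float, object]] = {}

    def get(self, key: str, max_staleness: Optional[str] = None) -> object:
        limit = parse_duration(max_staleness or self.default)
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] <= limit:
            return hit[1]  # fresh enough: skip the slow call
        value = self.compute(key)
        self._store[key] = (now, value)
        return value


calls = []
now = [0.0]  # fake clock so the example is deterministic


def slow_balance(account_id: str) -> int:
    calls.append(account_id)
    return 100 + len(calls)


cache = StalenessCache(slow_balance, default_max_staleness="1d", clock=lambda: now[0])
cache.get("acct-1")                      # computed
now[0] = 3600.0
cache.get("acct-1")                      # cached: 1h old, within 1d
cache.get("acct-1", max_staleness="1m")  # recomputed: 1h old exceeds 1m
print(len(calls))  # 2
```

The per-call max_staleness argument plays the role of the query-time override: the stored value is unchanged, but the caller decides how old a value it will accept.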

Batch ETL ingestion

Chalk also makes it simple to generate training sets from data warehouse sources: join data from services like S3, Redshift, BigQuery, Snowflake (or other custom sources) with historical features computed online. Specify a cron schedule on an @offline resolver and Chalk automatically ingests data with support for incremental reads:

@offline(cron="1h")
def fn() -> Features[User.id, User.datawarehouse_feature]:
  return redshift.query(...).incremental()

Chalk makes this data available for point-in-time-correct dataset generation for data science use-cases. Every pipeline has built-in monitoring and alerting to ensure data quality and timeliness.
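
One common way to implement incremental reads is a high-water mark: remember the largest timestamp already ingested and fetch only rows newer than it on the next run. The sketch below shows that pattern with the standard-library sqlite3 module. It is not Chalk's implementation; the user_scores table, its columns, and the read_incremental helper are hypothetical.

```python
import sqlite3
from typing import Dict, List, Tuple


def read_incremental(conn: sqlite3.Connection,
                     state: Dict[str, int]) -> List[Tuple[int, float, int]]:
    """Return rows newer than the stored high-water mark, then advance it."""
    last_seen = state.get("max_updated_at", 0)
    rows = conn.execute(
        "SELECT id, score, updated_at FROM user_scores "
        "WHERE updated_at > :last_seen ORDER BY updated_at",
        {"last_seen": last_seen},
    ).fetchall()
    if rows:
        state["max_updated_at"] = rows[-1][2]
    return rows


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_scores (id INTEGER, score REAL, updated_at INTEGER)")
conn.executemany(
    "INSERT INTO user_scores VALUES (?, ?, ?)",
    [(1, 0.2, 100), (2, 0.9, 200)],
)

state: Dict[str, int] = {}
first = read_incremental(conn, state)    # both existing rows
conn.execute("INSERT INTO user_scores VALUES (3, 0.5, 300)")
second = read_incremental(conn, state)   # only the row added since the last run
print(len(first), len(second))  # 2 1
```

Each scheduled run then touches only new data, which keeps hourly ingestion cheap even when the underlying table is large.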

Reverse ETL

When your model needs to use features that are canonically stored in a high-latency data source (like a data warehouse), Chalk's Reverse ETL support makes it simple to bring those features online and serve them quickly.

Add a single line of code to an offline resolver, and Chalk constructs a managed reverse ETL pipeline for that data source:

@offline(offline_to_online_etl="5m")

Now data from slow offline data sources is automatically available for low-latency online serving.


Deploy + query

Once you've defined your pipelines, you can rapidly deploy them to production with Chalk's CLI:

chalk apply

This creates a deployment of your project, which is served at a unique preview URL. You can promote this deployment to production, or perform QA workflows on your preview environment to make sure that your Chalk deployment performs as expected.

Once you promote your deployment to production, Chalk makes features available for low-latency online inference and offline training. Significantly, Chalk uses the exact same source code to serve temporally-consistent training sets to data scientists and live feature values to models. This re-use ensures that feature values from online and offline contexts match and dramatically cuts development time.

Online inference

Chalk's online store and feature computation engine make it easy to query features with ultra-low latency, so you can use your feature pipelines to serve online inference use-cases.

Integrating Chalk with your production application takes minutes via Chalk's simple REST API:

from chalk.client import ChalkClient

result = ChalkClient().query(
    input={
        User.name: "Katherine Johnson"
    },
    output=[User.fico_score],
    staleness={User.fico_score: "10m"},
)
result.get_feature_value(User.fico_score)

Features computed to serve online requests are also replicated to Chalk's offline store for historical performance tracking and training set generation.

Offline training

Data scientists can use Chalk's Jupyter integration to create datasets and train models. Datasets are stored and tracked so that they can be re-used by other modelers, and so that model provenance is tracked for audit and reproducibility.

X = ChalkClient().offline_query(
    input=labels[[User.uid, timestamp]],
    output=[
        User.returned_transactions_last_60,
        User.user_account_name_match_score,
        User.socure_score,
        User.identity.has_verified_phone,
        User.identity.is_voip_phone,
        User.identity.account_age_days,
        User.identity.email_age,
    ],
)

Chalk datasets are always "temporally consistent." This means that you can provide labels with different past timestamps and get historical features that represent what your application would have retrieved online at those past times. Temporal consistency ensures that your model training doesn't mix "future" and "past" data.
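
Temporal consistency amounts to an "as of" lookup: for each label timestamp, take the most recent feature value observed at or before that time and never a later one. The sketch below shows this in plain Python using bisect. The value_as_of helper and the sample history are hypothetical and only illustrate the idea; they are not Chalk's implementation.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple


def value_as_of(history: List[Tuple[int, float]], ts: int) -> Optional[float]:
    """Return the latest value observed at or before `ts`, or None if there is none.

    `history` holds (observed_at, value) pairs sorted by observed_at.
    """
    times = [observed_at for observed_at, _ in history]
    idx = bisect_right(times, ts)
    return history[idx - 1][1] if idx else None


# Hypothetical observations of one user's credit score over time.
credit_score_history = [(100, 640.0), (200, 655.0), (300, 700.0)]

# Labels taken at different past times each see only what was known then.
label_times = [50, 150, 300]
print([value_as_of(credit_score_history, t) for t in label_times])
# [None, 640.0, 700.0]
```

The label at time 150 gets 640.0 rather than the later 655.0 or 700.0, so no "future" value leaks into the training row.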

Download files

Source Distribution

chalkpy-2.114.2.tar.gz (1.4 MB), Source

Built Distributions

chalkpy-2.114.2-cp313-cp313-win_amd64.whl (3.4 MB), CPython 3.13, Windows x86-64
chalkpy-2.114.2-cp313-cp313-manylinux_2_28_x86_64.whl (3.7 MB), CPython 3.13, manylinux: glibc 2.28+ x86-64
chalkpy-2.114.2-cp313-cp313-manylinux_2_28_aarch64.whl (3.6 MB), CPython 3.13, manylinux: glibc 2.28+ ARM64
chalkpy-2.114.2-cp313-cp313-macosx_11_0_arm64.whl (3.5 MB), CPython 3.13, macOS 11.0+ ARM64
chalkpy-2.114.2-cp313-cp313-macosx_10_13_x86_64.whl (3.5 MB), CPython 3.13, macOS 10.13+ x86-64

chalkpy-2.114.2-cp312-cp312-win_amd64.whl (3.4 MB), CPython 3.12, Windows x86-64
chalkpy-2.114.2-cp312-cp312-manylinux_2_28_x86_64.whl (3.7 MB), CPython 3.12, manylinux: glibc 2.28+ x86-64
chalkpy-2.114.2-cp312-cp312-manylinux_2_28_aarch64.whl (3.6 MB), CPython 3.12, manylinux: glibc 2.28+ ARM64
chalkpy-2.114.2-cp312-cp312-macosx_11_0_arm64.whl (3.5 MB), CPython 3.12, macOS 11.0+ ARM64
chalkpy-2.114.2-cp312-cp312-macosx_10_13_x86_64.whl (3.5 MB), CPython 3.12, macOS 10.13+ x86-64

chalkpy-2.114.2-cp311-cp311-win_amd64.whl (3.4 MB), CPython 3.11, Windows x86-64
chalkpy-2.114.2-cp311-cp311-manylinux_2_28_x86_64.whl (3.7 MB), CPython 3.11, manylinux: glibc 2.28+ x86-64
chalkpy-2.114.2-cp311-cp311-manylinux_2_28_aarch64.whl (3.6 MB), CPython 3.11, manylinux: glibc 2.28+ ARM64
chalkpy-2.114.2-cp311-cp311-macosx_11_0_arm64.whl (3.5 MB), CPython 3.11, macOS 11.0+ ARM64
chalkpy-2.114.2-cp311-cp311-macosx_10_13_x86_64.whl (3.5 MB), CPython 3.11, macOS 10.13+ x86-64

chalkpy-2.114.2-cp310-cp310-win_amd64.whl (3.4 MB), CPython 3.10, Windows x86-64
chalkpy-2.114.2-cp310-cp310-manylinux_2_28_x86_64.whl (3.7 MB), CPython 3.10, manylinux: glibc 2.28+ x86-64
chalkpy-2.114.2-cp310-cp310-manylinux_2_28_aarch64.whl (3.6 MB), CPython 3.10, manylinux: glibc 2.28+ ARM64
chalkpy-2.114.2-cp310-cp310-macosx_11_0_arm64.whl (3.5 MB), CPython 3.10, macOS 11.0+ ARM64
chalkpy-2.114.2-cp310-cp310-macosx_10_13_x86_64.whl (3.5 MB), CPython 3.10, macOS 10.13+ x86-64
