Python SDK for Chalk

Chalk

Chalk enables innovative machine learning teams to focus on building the unique products and models that make their business stand out. Behind the scenes, Chalk handles data infrastructure with a best-in-class developer experience. Here's how it works:


Develop

Chalk makes it simple to develop feature pipelines for machine learning. Define Python functions using the libraries and tools you're familiar with instead of specialized DSLs. Chalk then orchestrates your functions into pipelines that execute in parallel on a Rust-based engine and coordinates the infrastructure required to compute features.
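As a rough illustration of what "orchestrating functions into a pipeline" can mean, the sketch below orders plain Python functions by their declared inputs using the standard library's graphlib module. It is not Chalk's engine or API: the RESOLVERS table and the run_pipeline helper are hypothetical names invented for this example, and the functions run one after another rather than in parallel.

```python
from graphlib import TopologicalSorter

# Hypothetical registry: output name -> (input names, function).
# None of these names come from Chalk; they only illustrate the idea.
RESOLVERS = {
    "full_name": (("user_id",), lambda user_id: f"user-{user_id}"),
    "name_length": (("full_name",), lambda full_name: len(full_name)),
    "is_long_name": (("name_length",), lambda name_length: name_length > 6),
}


def run_pipeline(inputs: dict) -> dict:
    """Run every registered function once, in dependency order."""
    graph = {name: set(deps) for name, (deps, _) in RESOLVERS.items()}
    values = dict(inputs)
    for name in TopologicalSorter(graph).static_order():
        if name in values:
            continue  # supplied by the caller, nothing to compute
        deps, fn = RESOLVERS[name]
        values[name] = fn(*(values[d] for d in deps))
    return values


result = run_pipeline({"user_id": 42})
```

Each function only names the values it needs; the ordering is derived from those declarations rather than written by hand.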

Features

To get started, define your features with Pydantic-inspired Python classes. You can define schemas, specify relationships, and add metadata to help your team share and re-use work.

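from datetime import date
from typing import Optional

from chalk.features import DataFrame, features, has_many

@features
class User:
    id: int
    full_name: str
    nickname: Optional[str]
    email: Optional[str]
    birthday: date
    credit_score: float
    datawarehouse_feature: float

    # One-to-many join: all transactions whose user_id matches this user
    transactions: DataFrame[Transaction] = has_many(lambda: Transaction.user_id == User.id)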

Resolvers

Next, tell Chalk how to compute your features. Chalk ingests data from your existing data stores, and lets you use Python to compute features with feature resolvers. Feature resolvers are declared with the decorators @online and @offline, and can depend on the outputs of other feature resolvers.

Resolvers make it easy to rapidly integrate a wide variety of data sources, join them together, and use them in your model.

SQL

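from chalk import online
from chalk.features import Features
from chalk.sql import PostgreSQLSource

pg = PostgreSQLSource()

@online
def get_user(uid: User.id) -> Features[User.full_name, User.email]:
    # Column names in the result are matched to the requested features
    return pg.query_string(
        "select email, full_name from users where id=:id",
        args=dict(id=uid)
    ).one()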

REST

import requests

@online
def get_socure_score(uid: User.id) -> Features[User.socure_score]:
    return (
        requests.get("https://api.socure.com", json={
            "id": uid  # the key must be the string "id", not the builtin id
        }).json()['socure_score']
    )

Execute

Once you've defined your features and resolvers, Chalk orchestrates them into flexible pipelines that make training and executing models easy.

Chalk has built-in support for feature engineering workflows -- no need to manage Airflow or orchestrate complicated streaming flows. You can execute resolver pipelines with declarative caching, ingest batch data on a schedule, and easily make slow sources available online for low-latency serving.

Caching

Many data sources (like vendor APIs) are too slow for online use cases and/or charge a high dollar cost-per-call. Chalk lets you optimize latency and cost by defining declarative caching policies which are well-integrated throughout the system. You no longer have to manage Redis, Memcached, DynamoDB, or spend time tuning cache-warming pipelines.

Add a caching policy with one line of code in your feature definition:

@features
class ExternalBankAccount:
-   balance: int
+   balance: int = feature(max_staleness="1d")

Optionally warm feature caches by executing resolvers on a schedule:

@online(cron="1d")
def fn(id: User.id) -> User.credit_score:
  return redshift.query(...).all()

Or override staleness tolerances at query time when you need fresher data for your models:

ChalkClient().query(
    ...,
    output=[User.fraud_score],
    staleness={User.fraud_score: "1m"}
)
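To make the staleness semantics concrete, here is a minimal standard-library sketch of a cache that reuses a value only while it is younger than a tolerance, with an optional per-call override. It is a toy model, not Chalk's implementation: StalenessCache and parse_duration are hypothetical names, and the reading of suffixes such as "1m" and "1d" as minutes and days is an assumption made for this example.

```python
from datetime import datetime, timedelta
from typing import Optional

# Assumed meaning of duration suffixes; not taken from Chalk's documentation.
UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_duration(text: str) -> timedelta:
    """Turn a string such as "10m" or "1d" into a timedelta."""
    return timedelta(**{UNITS[text[-1]]: int(text[:-1])})


class StalenessCache:
    """Toy cache: reuse a value only while it is younger than the tolerance."""

    def __init__(self, compute, max_staleness: str):
        self.compute = compute  # slow function, e.g. a vendor API call
        self.default = parse_duration(max_staleness)
        self.store = {}  # key -> (value, computed_at)
        self.calls = 0  # how many times `compute` actually ran

    def get(self, key, now: datetime, max_staleness: Optional[str] = None):
        limit = parse_duration(max_staleness) if max_staleness else self.default
        cached = self.store.get(key)
        if cached is not None and now - cached[1] <= limit:
            return cached[0]  # fresh enough, skip the slow call
        self.calls += 1
        value = self.compute(key)
        self.store[key] = (value, now)
        return value


cache = StalenessCache(lambda uid: uid * 10, max_staleness="1d")
t0 = datetime(2024, 1, 1, 12, 0)
cache.get(7, now=t0)  # miss: computes
cache.get(7, now=t0 + timedelta(hours=2))  # hit under the 1d default
cache.get(7, now=t0 + timedelta(hours=2), max_staleness="1m")  # override recomputes
```

The third call asks for data at most one minute old, so the two-hour-old entry is rejected and the slow function runs again.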

Batch ETL ingestion

Chalk also makes it simple to generate training sets from data warehouse sources -- join data from services like S3, Redshift, BQ, Snowflake (or other custom sources) with historical features computed online. Specify a cron schedule on an @offline resolver and Chalk automatically ingests data with support for incremental reads:

@offline(cron="1h")
def fn() -> Features[User.id, User.datawarehouse_feature]:
  return redshift.query(...).incremental()
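One common way to implement incremental reads is a watermark: each run remembers the newest timestamp it has seen and the next run fetches only rows after it. The sketch below shows that idea with plain Python lists. Whether Chalk tracks progress this way is an assumption; read_incremental and the row layout are hypothetical.

```python
from datetime import datetime


def read_incremental(rows, watermark):
    """Return rows newer than `watermark` plus the advanced watermark."""
    fresh = [r for r in rows if watermark is None or r["updated_at"] > watermark]
    new_mark = max((r["updated_at"] for r in fresh), default=watermark)
    return fresh, new_mark


# Hypothetical warehouse table held in memory for the example.
warehouse = [
    {"id": 1, "value": 0.2, "updated_at": datetime(2024, 1, 1, 9)},
    {"id": 2, "value": 0.7, "updated_at": datetime(2024, 1, 1, 10)},
]

first, mark = read_incremental(warehouse, None)  # first run reads everything
warehouse.append({"id": 3, "value": 0.9, "updated_at": datetime(2024, 1, 1, 11)})
second, mark = read_incremental(warehouse, mark)  # next run reads only row 3
```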

Chalk makes this data available for point-in-time-correct dataset generation for data science use-cases. Every pipeline has built-in monitoring and alerting to ensure data quality and timeliness.

Reverse ETL

When your model needs to use features that are canonically stored in a high-latency data source (like a data warehouse), Chalk's Reverse ETL support makes it simple to bring those features online and serve them quickly.

Add a single line of code to an offline resolver, and Chalk constructs a managed reverse ETL pipeline for that data source:

@offline(offline_to_online_etl="5m")

Now data from slow offline data sources is automatically available for low-latency online serving.
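Conceptually, a reverse ETL job copies the most recent value per key from a slow store into a fast key-value store. The sketch below models both stores with plain Python objects; sync_latest and the field names are hypothetical, and this is not how Chalk's managed pipeline is built.

```python
def sync_latest(offline_rows, online_store):
    """Copy the most recent score per user from offline rows into a dict."""
    latest = {}
    for row in offline_rows:
        key = row["user_id"]
        if key not in latest or row["ts"] > latest[key]["ts"]:
            latest[key] = row
    for key, row in latest.items():
        online_store[key] = row["score"]
    return len(latest)  # number of keys written


# Hypothetical warehouse rows; `ts` is a plain integer timestamp.
offline_rows = [
    {"user_id": 1, "score": 0.40, "ts": 100},
    {"user_id": 1, "score": 0.55, "ts": 200},
    {"user_id": 2, "score": 0.90, "ts": 150},
]
online_store = {}
written = sync_latest(offline_rows, online_store)
```

After the sync, a lookup such as online_store[1] is a dictionary read instead of a warehouse query.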


Deploy + query

Once you've defined your pipelines, you can rapidly deploy them to production with Chalk's CLI:

chalk apply

This creates a deployment of your project, which is served at a unique preview URL. You can promote this deployment to production, or perform QA workflows on your preview environment to make sure that your Chalk deployment performs as expected.

Once you promote your deployment to production, Chalk makes features available for low-latency online inference and offline training. Significantly, Chalk uses the exact same source code to serve temporally-consistent training sets to data scientists and live feature values to models. This re-use ensures that feature values from online and offline contexts match and dramatically cuts development time.

Online inference

Chalk's online store and feature computation engine make it easy to query features with ultra-low latency, so you can use your feature pipelines to serve online inference use-cases.

Integrating Chalk with your production application takes minutes via Chalk's simple REST API:

result = ChalkClient().query(
    input={
        User.name: "Katherine Johnson"
    },
    output=[User.fico_score],
    staleness={User.fico_score: "10m"},
)
result.get_feature_value(User.fico_score)

Features computed to serve online requests are also replicated to Chalk's offline store for historical performance tracking and training set generation.

Offline training

Data scientists can use Chalk's Jupyter integration to create datasets and train models. Datasets are stored and tracked so that they can be re-used by other modelers, and so that model provenance is tracked for audit and reproducibility.

X = ChalkClient().offline_query(
    input=labels[[User.uid, timestamp]],
    output=[
        User.returned_transactions_last_60,
        User.user_account_name_match_score,
        User.socure_score,
        User.identity.has_verified_phone,
        User.identity.is_voip_phone,
        User.identity.account_age_days,
        User.identity.email_age,
    ],
)

Chalk datasets are always "temporally consistent." This means that you can provide labels with different past timestamps and get historical features that represent what your application would have retrieved online at those past times. Temporal consistency ensures that your model training doesn't mix "future" and "past" data.
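The idea of temporal consistency can be sketched as an "as of" lookup: for each label timestamp, take the latest feature value observed at or before that time and never a later one. The example below uses the standard library's bisect module on a hypothetical feature history; value_as_of, HISTORY, and the integer timestamps are illustrative and not part of Chalk.

```python
from bisect import bisect_right

# Hypothetical history of one feature for one user: (observed_at, value),
# sorted by time. Timestamps are plain integers to keep the sketch short.
HISTORY = [(10, 640), (20, 655), (30, 700)]


def value_as_of(history, ts):
    """Return the latest value observed at or before `ts`, else None."""
    times = [t for t, _ in history]
    i = bisect_right(times, ts)
    return history[i - 1][1] if i else None


labels = [(5, 0), (20, 1), (27, 0), (99, 1)]  # (label timestamp, label)
training_rows = [(value_as_of(HISTORY, ts), y) for ts, y in labels]
```

The label at time 27 is paired with the value observed at time 20, not the one from time 30, so no "future" data leaks into the training row. The label at time 5 predates every observation and gets no value.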

Download files

Download the file for your platform.

Source Distribution

  • chalkpy-2.115.0.tar.gz (1.4 MB)

Built Distributions

  • chalkpy-2.115.0-cp313-cp313-win_amd64.whl (3.4 MB): CPython 3.13, Windows x86-64
  • chalkpy-2.115.0-cp313-cp313-manylinux_2_28_x86_64.whl (3.7 MB): CPython 3.13, manylinux glibc 2.28+ x86-64
  • chalkpy-2.115.0-cp313-cp313-manylinux_2_28_aarch64.whl (3.7 MB): CPython 3.13, manylinux glibc 2.28+ ARM64
  • chalkpy-2.115.0-cp313-cp313-macosx_11_0_arm64.whl (3.5 MB): CPython 3.13, macOS 11.0+ ARM64
  • chalkpy-2.115.0-cp313-cp313-macosx_10_13_x86_64.whl (3.6 MB): CPython 3.13, macOS 10.13+ x86-64
  • chalkpy-2.115.0-cp312-cp312-win_amd64.whl (3.4 MB): CPython 3.12, Windows x86-64
  • chalkpy-2.115.0-cp312-cp312-manylinux_2_28_x86_64.whl (3.7 MB): CPython 3.12, manylinux glibc 2.28+ x86-64
  • chalkpy-2.115.0-cp312-cp312-manylinux_2_28_aarch64.whl (3.7 MB): CPython 3.12, manylinux glibc 2.28+ ARM64
  • chalkpy-2.115.0-cp312-cp312-macosx_11_0_arm64.whl (3.5 MB): CPython 3.12, macOS 11.0+ ARM64
  • chalkpy-2.115.0-cp312-cp312-macosx_10_13_x86_64.whl (3.6 MB): CPython 3.12, macOS 10.13+ x86-64
  • chalkpy-2.115.0-cp311-cp311-win_amd64.whl (3.4 MB): CPython 3.11, Windows x86-64
  • chalkpy-2.115.0-cp311-cp311-manylinux_2_28_x86_64.whl (3.7 MB): CPython 3.11, manylinux glibc 2.28+ x86-64
  • chalkpy-2.115.0-cp311-cp311-manylinux_2_28_aarch64.whl (3.7 MB): CPython 3.11, manylinux glibc 2.28+ ARM64
  • chalkpy-2.115.0-cp311-cp311-macosx_11_0_arm64.whl (3.5 MB): CPython 3.11, macOS 11.0+ ARM64
  • chalkpy-2.115.0-cp311-cp311-macosx_10_13_x86_64.whl (3.6 MB): CPython 3.11, macOS 10.13+ x86-64
  • chalkpy-2.115.0-cp310-cp310-win_amd64.whl (3.4 MB): CPython 3.10, Windows x86-64
  • chalkpy-2.115.0-cp310-cp310-manylinux_2_28_x86_64.whl (3.7 MB): CPython 3.10, manylinux glibc 2.28+ x86-64
  • chalkpy-2.115.0-cp310-cp310-manylinux_2_28_aarch64.whl (3.7 MB): CPython 3.10, manylinux glibc 2.28+ ARM64
  • chalkpy-2.115.0-cp310-cp310-macosx_11_0_arm64.whl (3.5 MB): CPython 3.10, macOS 11.0+ ARM64
  • chalkpy-2.115.0-cp310-cp310-macosx_10_13_x86_64.whl (3.6 MB): CPython 3.10, macOS 10.13+ x86-64
