Python SDK for Chalk

Chalk

Chalk enables innovative machine learning teams to focus on building the unique products and models that make their business stand out. Behind the scenes, Chalk seamlessly handles data infrastructure with a best-in-class developer experience. Here's how it works:


Develop

Chalk makes it simple to develop feature pipelines for machine learning. Define Python functions using the libraries and tools you're familiar with instead of specialized DSLs. Chalk then orchestrates your functions into pipelines that execute in parallel on a Rust-based engine and coordinates the infrastructure required to compute features.

Features

To get started, define your features with Pydantic-inspired Python classes. You can define schemas, specify relationships, and add metadata to help your team share and re-use work.

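from datetime import date
from typing import Optional

from chalk.features import DataFrame, features, has_many

# Transaction is assumed to be another @features class defined elsewhere.
@features
class User:
    id: int
    full_name: str
    nickname: Optional[str]
    email: Optional[str]
    birthday: date
    credit_score: float
    datawarehouse_feature: float

    # One user has many transactions, joined on the user id.
    transactions: DataFrame[Transaction] = has_many(lambda: Transaction.user_id == User.id)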

Resolvers

Next, tell Chalk how to compute your features. Chalk ingests data from your existing data stores, and lets you use Python to compute features with feature resolvers. Feature resolvers are declared with the decorators @online and @offline, and can depend on the outputs of other feature resolvers.

Resolvers make it easy to rapidly integrate a wide variety of data sources, join them together, and use them in your model.

SQL

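from chalk import online
from chalk.features import Features
from chalk.sql import PostgreSQLSource

pg = PostgreSQLSource()

@online
def get_user(uid: User.id) -> Features[User.full_name, User.email]:
    # Fetch the single row for this user id.
    return pg.query_string(
        "select email, full_name from users where id=:id",
        args=dict(id=uid),
    ).one()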

REST

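import requests

from chalk import online
from chalk.features import Features

@online
def get_socure_score(uid: User.id) -> Features[User.socure_score]:
    # Send the user id in the JSON body and read the score from the response.
    return (
        requests.get("https://api.socure.com", json={
            "id": uid
        }).json()["socure_score"]
    )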

Execute

Once you've defined your features and resolvers, Chalk orchestrates them into flexible pipelines that make training and executing models easy.
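The idea that resolvers can depend on the outputs of other resolvers can be pictured with a small, self-contained sketch. This is not Chalk's engine or API: the `RESOLVERS` registry, the `resolver` decorator, the `compute` function, and the feature names are all hypothetical, and the sketch runs sequentially rather than in parallel.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical registry: output feature name -> (input feature names, function).
RESOLVERS: Dict[str, Tuple[List[str], Callable[..., Any]]] = {}


def resolver(output: str, inputs: List[str]):
    """Register a function as the way to compute `output` from `inputs`."""
    def register(fn):
        RESOLVERS[output] = (inputs, fn)
        return fn
    return register


@resolver("user.full_name", inputs=["user.id"])
def get_full_name(uid: int) -> str:
    return {1: "Katherine Johnson"}[uid]  # stand-in for a database lookup


@resolver("user.name_length", inputs=["user.full_name"])
def get_name_length(full_name: str) -> int:
    return len(full_name)


def compute(feature: str, known: Dict[str, Any]) -> Any:
    """Resolve `feature`, recursively computing any missing inputs first."""
    if feature not in known:
        inputs, fn = RESOLVERS[feature]
        known[feature] = fn(*(compute(name, known) for name in inputs))
    return known[feature]


result = compute("user.name_length", {"user.id": 1})
```

Here the caller supplies only `user.id`; the sketch works out that `user.full_name` must be computed first because `user.name_length` declares it as an input.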

Chalk has built-in support for feature engineering workflows -- no need to manage Airflow or orchestrate complicated streaming flows. You can execute resolver pipelines with declarative caching, ingest batch data on a schedule, and easily make slow sources available online for low-latency serving.

Caching

Many data sources (like vendor APIs) are too slow for online use cases or charge a high cost per call. Chalk lets you optimize latency and cost by defining declarative caching policies that are integrated throughout the system. You no longer have to manage Redis, Memcached, or DynamoDB, or spend time tuning cache-warming pipelines.

Add a caching policy with one line of code in your feature definition:

@features
class ExternalBankAccount:
-   balance: int
+   balance: int = feature(max_staleness="1d")

Optionally warm feature caches by executing resolvers on a schedule:

@online(cron="1d")
def fn(id: User.id) -> User.credit_score:
  return redshift.query(...).all()

Or override staleness tolerances at query time when you need fresher data for your models:

ChalkClient().query(
    ...,
    output=[User.fraud_score],
    staleness={User.fraud_score: "1m"},
)
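To make the staleness semantics concrete, here is a minimal sketch of a cache with a default maximum staleness and a per-query override. It is an illustration under stated assumptions, not how Chalk implements caching: `StalenessCache`, `parse_duration`, and `slow_balance` are hypothetical names, and time is passed in explicitly so the behavior is deterministic.

```python
from typing import Any, Callable, Dict, Optional, Tuple

UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_duration(text: str) -> int:
    """Convert a string such as "10m" or "1d" into seconds."""
    return int(text[:-1]) * UNITS[text[-1]]


class StalenessCache:
    """Serve a cached value unless it is older than the allowed staleness."""

    def __init__(self, compute: Callable[[Any], Any], max_staleness: str):
        self.compute = compute
        self.default = parse_duration(max_staleness)
        self.store: Dict[Any, Tuple[float, Any]] = {}  # key -> (computed_at, value)

    def get(self, key: Any, now: float, max_staleness: Optional[str] = None) -> Any:
        limit = self.default if max_staleness is None else parse_duration(max_staleness)
        cached = self.store.get(key)
        if cached is not None and now - cached[0] <= limit:
            return cached[1]  # fresh enough: skip the expensive call
        value = self.compute(key)
        self.store[key] = (now, value)
        return value


calls = []


def slow_balance(account_id: str) -> int:
    """Stand-in for a slow or costly vendor API call."""
    calls.append(account_id)
    return 100 + len(calls)


cache = StalenessCache(slow_balance, max_staleness="1d")
first = cache.get("acct-1", now=0.0)                         # miss: calls the source
second = cache.get("acct-1", now=3600.0)                     # hit: one hour old is within one day
third = cache.get("acct-1", now=3600.0, max_staleness="1m")  # override forces a refresh
```

The third call shows why a query-time override is useful: the cached value is acceptable under the one-day default but too old for a caller that tolerates only one minute.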

Batch ETL ingestion

Chalk also makes it simple to generate training sets from data warehouse sources -- join data from services like S3, Redshift, BigQuery, Snowflake (or other custom sources) with historical features computed online. Specify a cron schedule on an @offline resolver and Chalk automatically ingests data with support for incremental reads:

@offline(cron="1h")
def fn() -> DataFrame[User.id, User.datawarehouse_feature]:
  # Takes no inputs and returns many rows, hence a DataFrame of features.
  return redshift.query(...).incremental()

Chalk makes this data available for point-in-time-correct dataset generation for data science use-cases. Every pipeline has built-in monitoring and alerting to ensure data quality and timeliness.
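Incremental reads are commonly built around a high-water mark: each run remembers the newest timestamp it has seen and only fetches rows beyond it. The sketch below illustrates that general technique; the `IncrementalReader` class and the row layout are hypothetical and say nothing about how Chalk tracks ingestion state.

```python
from typing import List, Tuple

Row = Tuple[int, str, float]  # (updated_at, user_id, value): hypothetical schema


class IncrementalReader:
    """Return only rows newer than the last timestamp already ingested."""

    def __init__(self) -> None:
        self.high_water_mark = -1

    def read(self, table: List[Row]) -> List[Row]:
        new_rows = [row for row in table if row[0] > self.high_water_mark]
        if new_rows:
            # Advance the mark so the next run skips everything seen so far.
            self.high_water_mark = max(row[0] for row in new_rows)
        return new_rows


table: List[Row] = [(1, "u1", 0.2), (2, "u2", 0.9)]
reader = IncrementalReader()
first_batch = reader.read(table)   # both existing rows
table.append((3, "u1", 0.4))
second_batch = reader.read(table)  # only the row added since the last run
```

Each scheduled run therefore costs time proportional to the new data rather than to the full table.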

Reverse ETL

When your model needs to use features that are canonically stored in a high-latency data source (like a data warehouse), Chalk's Reverse ETL support makes it simple to bring those features online and serve them quickly.

Add a single line of code to an offline resolver, and Chalk constructs a managed reverse ETL pipeline for that data source:

@offline(offline_to_online_etl="5m")

Now data from slow offline data sources is automatically available for low-latency online serving.
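Conceptually, a reverse ETL step copies the latest value per entity from the slow store into a fast key-value store. The following sketch shows that idea only; `warehouse_rows`, `sync_to_online`, and the in-memory `online_store` dictionary are hypothetical stand-ins, not part of Chalk.

```python
from typing import Dict, List, Tuple

# Hypothetical warehouse rows: (user_id, computed_at, value).
warehouse_rows: List[Tuple[str, int, float]] = [
    ("u1", 1, 0.2),
    ("u2", 1, 0.9),
    ("u1", 2, 0.4),
]


def sync_to_online(
    rows: List[Tuple[str, int, float]],
    online_store: Dict[str, Tuple[int, float]],
) -> None:
    """Copy the most recent value per user into a key-value store."""
    for user_id, computed_at, value in rows:
        current = online_store.get(user_id)
        if current is None or computed_at > current[0]:
            online_store[user_id] = (computed_at, value)


online_store: Dict[str, Tuple[int, float]] = {}
sync_to_online(warehouse_rows, online_store)
latest_u1 = online_store["u1"][1]  # a single dictionary lookup at serving time
```

Running the sync on a schedule keeps the online copy within a bounded delay of the warehouse, which is the role the interval in the decorator above plays.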


Deploy + query

Once you've defined your pipelines, you can rapidly deploy them to production with Chalk's CLI:

chalk apply

This creates a deployment of your project, which is served at a unique preview URL. You can promote this deployment to production, or perform QA workflows on your preview environment to make sure that your Chalk deployment performs as expected.

Once you promote your deployment to production, Chalk makes features available for low-latency online inference and offline training. Significantly, Chalk uses the exact same source code to serve temporally-consistent training sets to data scientists and live feature values to models. This re-use ensures that feature values from online and offline contexts match and dramatically cuts development time.

Online inference

Chalk's online store and feature computation engine make it easy to query features with ultra-low latency, so you can use your feature pipelines to serve online inference use cases.

Integrating Chalk with your production application takes minutes via Chalk's simple REST API:

from chalk.client import ChalkClient

result = ChalkClient().query(
    input={
        User.full_name: "Katherine Johnson"
    },
    output=[User.fico_score],
    staleness={User.fico_score: "10m"},
)
result.get_feature_value(User.fico_score)

Features computed to serve online requests are also replicated to Chalk's offline store for historical performance tracking and training set generation.

Offline training

Data scientists can use Chalk's Jupyter integration to create datasets and train models. Datasets are stored and tracked so that they can be re-used by other modelers, and so that model provenance is tracked for audit and reproducibility.

X = ChalkClient().offline_query(
    input=labels[[User.uid, timestamp]],
    output=[
        User.returned_transactions_last_60,
        User.user_account_name_match_score,
        User.socure_score,
        User.identity.has_verified_phone,
        User.identity.is_voip_phone,
        User.identity.account_age_days,
        User.identity.email_age,
    ],
)

Chalk datasets are always "temporally consistent." This means that you can provide labels with different past timestamps and get historical features that represent what your application would have retrieved online at those past times. Temporal consistency ensures that your model training doesn't mix "future" and "past" data.
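Temporal consistency amounts to an "as of" lookup: for each label timestamp, take the most recent feature value observed at or before that time, and never a later one. A minimal sketch of that lookup follows, assuming a hypothetical in-memory `history` list for a single feature; it is not Chalk's offline store.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

# Hypothetical history of one feature: (observed_at, value), sorted by time.
history: List[Tuple[int, float]] = [(10, 640.0), (20, 700.0), (30, 580.0)]


def value_as_of(history: List[Tuple[int, float]], ts: int) -> Optional[float]:
    """Return the latest value observed at or before `ts`, never a later one."""
    times = [observed_at for observed_at, _ in history]
    index = bisect_right(times, ts)  # number of observations with time <= ts
    return history[index - 1][1] if index else None


label_timestamps = [5, 20, 25, 99]
training_values = [value_as_of(history, ts) for ts in label_timestamps]
```

A label at time 25 receives the value observed at time 20, not the one from time 30, so the training row contains only what would have been retrievable online at that moment. A label at time 5 receives `None` because nothing had been observed yet.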

