Skip to main content

Gokart solves reproducibility, task dependencies, constraints of good code, and ease of use for Machine Learning Pipeline. [Documentation](https://gokart.readthedocs.io/en/latest/)

Project description

gokart

Test Python Versions

Gokart solves reproducibility, task dependencies, constraints of good code, and ease of use for Machine Learning Pipeline.

Documentation for the latest release is hosted on readthedocs.

About gokart

Here are some good things about gokart.

  • The following meta data for each Task is stored separately in a pkl file with hash value
    • task output data
    • imported all module versions
    • task processing time
    • random seed in task
    • displayed log
    • all parameters set as class variables in the task
  • Automatically rerun the pipeline if parameters of Tasks are changed.
  • Support GCS and S3 as a data store for intermediate results of Tasks in the pipeline.
  • The above output is exchanged between tasks as an intermediate file, which is memory-friendly
  • pandas.DataFrame type and column checking during I/O
  • Directory structure of saved files is automatically determined from structure of script
  • Seeds for numpy and random are automatically fixed
  • Can code while adhering to SOLID principles as much as possible
  • Tasks are locked via redis even if they run in parallel

All the functions above are created for constructing Machine Learning batches. Provides an excellent environment for reproducibility and team development.

Here are some non-goal / downside of the gokart.

  • Batch execution in parallel is supported, but parallel and concurrent execution of task in memory.
  • Gokart is focused on reproducibility. So, I/O and capacity of data storage can become a bottleneck.
  • No support for task visualize.
  • Gokart is not an experiment management tool. The management of the execution result is cut out as Thunderbolt.
  • Gokart does not recommend writing pipelines in toml, yaml, json, and more. Gokart is preferring to write them in Python.

Getting Started

Within the activated Python environment, use the following command to install gokart.

pip install gokart

Quickstart

A minimal gokart tasks looks something like this:

import gokart

class Example(gokart.TaskOnKart):
    def run(self):
        self.dump('Hello, world!')

task = Example()
output = gokart.build(task)
print(output)

gokart.build return the result of dump by gokart.TaskOnKart. The example will output the following.

Hello, world!

This is an introduction to some of the gokart. There are still more useful features.

Please See Documentation .

Have a good gokart life.

Achievements

Gokart is a proven product.

Thanks

gokart is a wrapper for luigi. Thanks to luigi and dependent projects!

Project details


Release history Release notifications | RSS feed

This version

1.2.0

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

gokart-1.2.0.tar.gz (27.5 kB view details)

Uploaded Source

Built Distribution

gokart-1.2.0-py3-none-any.whl (34.1 kB view details)

Uploaded Python 3

File details

Details for the file gokart-1.2.0.tar.gz.

File metadata

  • Download URL: gokart-1.2.0.tar.gz
  • Upload date:
  • Size: 27.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.14 CPython/3.10.5 Linux/5.15.0-1014-azure

File hashes

Hashes for gokart-1.2.0.tar.gz
Algorithm Hash digest
SHA256 4629ed66b4827f572c8e8db0bc61267072c9f5eb5e27844a1a07b64a18cdcc41
MD5 a28bce33ac176a17d6132e5fa6ab438f
BLAKE2b-256 e0eba58d19bc1d3c56b826411c85612a6fd11fd92ae253f35b2add1a65ac93e2

See more details on using hashes here.

File details

Details for the file gokart-1.2.0-py3-none-any.whl.

File metadata

  • Download URL: gokart-1.2.0-py3-none-any.whl
  • Upload date:
  • Size: 34.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.14 CPython/3.10.5 Linux/5.15.0-1014-azure

File hashes

Hashes for gokart-1.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 beb912d3106aa50abe07cf2a744cdae58c3b3d0a95f9ae5cc39224a14e12c102
MD5 6159bd3ba6e22eb19fbc6bc4642fe48c
BLAKE2b-256 2fba9bd54c1cd9d1ff2cb1e142cc332a6876190a4d9e85d52be5261fa89531df

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page