Collective Knowledge - a lightweight knowledge manager to organize, cross-link, share and reuse artifacts and workflows based on FAIR principles
Note that this directory is in archive mode, since the Collective Knowledge framework (v1 and v2) is now officially discontinued in favour of the new, lightweight, non-intrusive and technology-agnostic Collective Mind workflow automation framework. You can learn more about the motivation behind CK in this ACM TechTalk and the journal article.
Collective Knowledge framework (CK)
News
- 2022 May: Grigori Fursin started prototyping the new Collective Mind framework from scratch within the MLCommons Task Force on Automation and Reproducibility, based on feedback from users and MLCommons members.
- 2022 April 3: We presented the CK concept to bridge the growing gap between ML Systems research and production at the HPCA'22 workshop on benchmarking deep learning systems.
- 2022 March: We presented the CK concept to enable collaborative and reproducible ML Systems R&D at the SIAM'22 workshop on "Research Challenges and Opportunities within Software Productivity, Sustainability, and Reproducibility".
- 2022 March: We released the first prototype of the Collective Mind toolkit (CK2) based on your feedback and our practical experience reproducing 150+ ML and Systems papers and validating them in the real world.
Motivation
While Machine Learning is becoming more and more important in everyday life, designing efficient ML Systems and deploying them in the real world is becoming increasingly challenging, time-consuming and costly. Researchers and engineers must keep pace with rapidly evolving software stacks and a Cambrian explosion of hardware platforms from the cloud to the edge. Such platforms have their own specific libraries, frameworks, APIs and specifications and often require repetitive, tedious and ad-hoc optimization of the whole model/software/hardware stack to trade off accuracy, latency, throughput, power consumption, size and costs depending on user requirements and constraints.
The CK framework
The Collective Knowledge framework (CK) is our attempt to develop a common plug&play infrastructure that the community can use, similar to Wikipedia, to learn how to solve the above challenges and to make it easier to co-design, benchmark, optimize and deploy Machine Learning Systems in the real world across continuously evolving software, hardware and data sets (see our ACM TechTalk for more details):
- CK aims at providing a simple playground with minimal software dependencies to help researchers and practitioners share their knowledge in the form of reusable automation recipes with a unified Python API, CLI and meta description:
- CK helps to organize software projects and Git repositories as a database of the above automation recipes and related artifacts based on FAIR principles as described in our journal article (shorter pre-print). See examples of CK-compatible GitHub repositories:
Community developments
We collaborated with the community to reproduce 150+ ML and Systems papers and implement the following reusable automation recipes in the CK format:
- Portable meta package manager to automatically detect, install or rebuild various ML artifacts (ML models, data sets, frameworks, libraries, etc.) across different platforms and operating systems including Linux, Windows, MacOS and Android:
- Portable manager for Python virtual environments: CK repo.
- Portable workflows to support collaborative, reproducible and cross-platform benchmarking:
- Portable workflows to automate MLPerf™ benchmark:
- End-to-end submission suite used by multiple organizations to automate the submission of MLPerf inference benchmark
- MLPerf inference v1.1 results: MLCommons press-release, Datacenter results, Edge results
- Reproducibility studies for MLPerf inference benchmark v1.1 automated by CK
- Design space exploration of ML/SW/HW stacks and customizable visualization
Please contact Grigori Fursin if you are interested in joining this community effort!
Tutorials
- CK automations for unified benchmarking
- CK-based MLPerf inference benchmark automation example
- CK basics
Releases
Stable versions
The latest version of the CK automation suite supported by MLCommons™:
Current projects
- Automating MLPerf™ inference benchmark and packing ML models, data sets and frameworks as CK components with a unified API and meta description
- Developing customizable dashboards for MLPerf™ to help end-users select ML/SW/HW stacks on a Pareto frontier: aggregated MLPerf™ results
- Providing a common format to share artifacts at ML, systems and other conferences: video, Artifact Evaluation
- Redesigning CK together with the community based on user feedback: incubator
- Other real-world use cases from MLPerf™, Qualcomm, Arm, General Motors, IBM, the Raspberry Pi foundation, ACM and other great partners
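The idea of selecting stacks on a Pareto frontier, mentioned in the dashboard item above, can be illustrated with a minimal sketch. It assumes two metrics per stack (higher accuracy is better, lower latency is better). The stack names and numbers are invented for illustration and are not MLPerf™ results, and the helpers `dominates` and `pareto_frontier` are hypothetical, not part of CK.

```python
def dominates(a, b):
    """Return True if stack `a` is no worse than `b` on both metrics
    and strictly better on at least one of them."""
    no_worse = a["accuracy"] >= b["accuracy"] and a["latency_ms"] <= b["latency_ms"]
    better = a["accuracy"] > b["accuracy"] or a["latency_ms"] < b["latency_ms"]
    return no_worse and better


def pareto_frontier(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical measurements, for illustration only.
stacks = [
    {"name": "stack-a", "accuracy": 0.76, "latency_ms": 12.0},
    {"name": "stack-b", "accuracy": 0.71, "latency_ms": 5.0},
    {"name": "stack-c", "accuracy": 0.70, "latency_ms": 9.0},  # dominated by stack-b
    {"name": "stack-d", "accuracy": 0.78, "latency_ms": 30.0},
]

frontier = pareto_frontier(stacks)
print([p["name"] for p in frontier])
```

A stack stays on the frontier when improving one metric would require giving up the other, which is the set a user would choose from under their own constraints.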
Documentation
Installation
Follow this guide to install the CK framework on your platform.
CK supports the following platforms:
Platform | As a host platform | As a target platform |
---|---|---|
Generic Linux | ✓ | ✓ |
Linux (Arm) | ✓ | ✓ |
Raspberry Pi | ✓ | ✓ |
MacOS | ✓ | ± |
Windows | ✓ | ✓ |
Android | ± | ✓ |
iOS | TBD | TBD |
Bare-metal (edge devices) | - | ± |
Examples
Portable CK workflow (native environment without Docker)
Here we show how to pull a GitHub repo in the CK format and use a unified CK interface to compile and run any program (image corner detection in our case) with any compatible data set on any compatible platform:
python3 -m pip install ck
ck pull repo:mlcommons@ck-mlops
ck ls program:*susan*
ck search dataset --tags=jpeg
ck pull repo:ctuning-datasets-min
ck search dataset --tags=jpeg
ck detect soft:compiler.gcc
ck detect soft:compiler.llvm
ck show env --tags=compiler
ck compile program:image-corner-detection --speed
ck run program:image-corner-detection --repeat=1 --env.MY_ENV=123 --env.TEST=xyz
You can check the output of this program in the following directory:
cd `ck find program:image-corner-detection`/tmp
ls
processed-image.pgm
You can now view this image with detected corners.
Check CK docs for further details.
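The command sequence above can also be driven from a script. The following is a minimal sketch, assuming the `ck` executable is on the `PATH` after `pip install ck`. The names `WORKFLOW_STEPS` and `run_workflow` are hypothetical helpers and not part of CK. The function defaults to a dry run, so nothing is executed unless `dry_run=False` is passed; some steps may prompt for input when run for real.

```python
import shlex
import subprocess

# A subset of the CLI steps listed above, in the same order.
WORKFLOW_STEPS = [
    "ck pull repo:mlcommons@ck-mlops",
    "ck pull repo:ctuning-datasets-min",
    "ck detect soft:compiler.gcc",
    "ck compile program:image-corner-detection --speed",
    "ck run program:image-corner-detection --repeat=1 --env.MY_ENV=123 --env.TEST=xyz",
]


def run_workflow(steps, dry_run=True):
    """Run each step in order and stop at the first failing command.

    Returns the argument vectors that were (or would be) executed.
    """
    planned = []
    for step in steps:
        argv = shlex.split(step)
        planned.append(argv)
        if dry_run:
            print("would run:", step)
        else:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(argv, check=True)
    return planned


planned = run_workflow(WORKFLOW_STEPS)  # dry run: only prints the steps
```

Calling `run_workflow(WORKFLOW_STEPS, dry_run=False)` would execute the steps for real on a machine where CK is installed.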
MLPerf™ benchmark workflows
Portable CK workflows inside containers
We have prepared adaptive CK containers to demonstrate MLOps capabilities:
You can run them as follows:
ck pull repo:mlcommons@ck-mlops
ck build docker:ck-template-mlperf --tag=ubuntu-20.04
ck run docker:ck-template-mlperf --tag=ubuntu-20.04
Portable workflow example with virtual CK environments
You can create multiple virtual CK environments with templates to automatically install different CK packages and workflows, for example for MLPerf™ inference:
ck pull repo:mlcommons@ck-venv
ck create venv:test --template=mlperf-inference-main
ck ls venv
ck activate venv:test
ck pull repo:mlcommons@ck-mlops
ck install package --ask --tags=dataset,coco,val,2017,full
ck show env
Integration with web services and CI platforms
All CK modules, automation actions and workflows are accessible as a micro-service with a unified JSON I/O API to make it easier to integrate them with web services and CI platforms as described here.
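As a rough illustration of the dictionary-in, dictionary-out style behind this unified JSON I/O API, here is a minimal sketch. It assumes that requests carry the keys `action`, `module_uoa` and `data_uoa`, that results carry an integer `return` code plus an `error` string on failure, and that `ck.kernel.access` is the Python entry point; please check these details against the CK docs before relying on them. The helpers `build_request`, `unwrap` and `call_ck` are hypothetical and not part of CK.

```python
import json


def build_request(action, module, data=None, **params):
    """Assemble a CK-style input dictionary (key names are assumptions)."""
    request = {"action": action, "module_uoa": module}
    if data is not None:
        request["data_uoa"] = data
    request.update(params)
    return request


def unwrap(result):
    """Raise if the 'return' code is non-zero, otherwise return the result."""
    if result.get("return", 0) > 0:
        raise RuntimeError(result.get("error", "unknown CK error"))
    return result


def call_ck(request):
    """Send the request to a locally installed CK (needs `pip install ck`)."""
    import ck.kernel as ck  # imported lazily so the sketch loads without CK
    return unwrap(ck.access(request))


# Roughly what `ck search dataset --tags=jpeg` looks like as a dictionary.
request = build_request("search", "dataset", tags="jpeg")
print(json.dumps(request, sort_keys=True))
```

Because both input and output are plain JSON-compatible dictionaries, the same request could be serialized and sent to a web service or CI job without changes.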
Other use cases
CK portal
We use the cKnowledge.io portal to help the community organize and find all the CK workflows and components similar to PyPI:
- Search CK components
- Browse CK components
- Find reproduced results from papers
- Test CK workflows to benchmark and optimize ML Systems
Containers to test CK automation recipes and workflows
The community provides Docker containers to test CK and components using different ML/SW/HW stacks (DSE).
- A set of Docker containers to test the basic CK functionality using some MLPerf inference benchmark workflows: https://github.com/mlcommons/ck-mlops/tree/main/docker/test-ck
Acknowledgments
We would like to thank all collaborators and contributors for their support, fruitful discussions, and useful feedback! See more acknowledgments in this journal article and ACM TechTalk'21.
Source Distribution
File details
Details for the file ck-2.6.4.tar.gz.
File metadata
- Download URL: ck-2.6.4.tar.gz
- Upload date:
- Size: 1.0 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.0 CPython/3.10.11
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | 278385aaf337e15d4f7f54087853c74621b5a565253c68a96db3072ec33d0d2d |
MD5 | c4f41b279c42188409c568d6317577e1 |
BLAKE2b-256 | eb3b25b229e40173deb0dc0738f1b7a46e5180e9974d548231eb1d17c82194c6 |
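A downloaded archive can be checked against the SHA256 digest listed above. This is a minimal sketch using only the Python standard library; the helper names `sha256_of_file` and `verify_download` are hypothetical, and the file path is assumed to point at a local copy of `ck-2.6.4.tar.gz`.

```python
import hashlib

# SHA256 digest of ck-2.6.4.tar.gz as listed in the table above.
EXPECTED_SHA256 = "278385aaf337e15d4f7f54087853c74621b5a565253c68a96db3072ec33d0d2d"


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest matches the expected one."""
    return sha256_of_file(path) == expected.lower()
```

For example, `verify_download("ck-2.6.4.tar.gz")` should return `True` for an intact copy of the archive.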