cmind
Collective Mind toolkit (CM or CK2)
The CM toolkit transforms Git repositories, Docker containers, Jupyter notebooks and zip/tar files into a database of reusable artifacts and automations with a unified CLI and extensible meta descriptions.
Our goal is to provide a very simple and common structure for shared projects and make it possible to exchange any artifacts, knowledge, experience and best practices between researchers, engineers, teams and organizations in a more automated, reusable and reproducible way.
CM is motivated by our tedious experience reproducing 150+ ML and Systems papers. We and our colleagues spent many months analyzing the structure of ad-hoc projects, reproducing results and validating them in the real world with different and continuously changing software, hardware, environments, data sets and settings.
The CM toolkit is based on the Collective Knowledge concept that was successfully validated in the past few years to enable collaborative ML and Systems R&D, modularize the MLPerf inference benchmark, and automate the development and deployment of Pareto-efficient ML Systems.
Try it yourself
Install CM
The CM toolkit is implemented as a small Python library with a unified CLI and a simple API.
It has minimal dependencies (Python 3+, pip, pyyaml and a Git client) and should work on most operating systems, including Linux (CentOS, Debian, RedHat) and Windows.
$ pip3 install cmind
You can find more details about the installation process here.
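Before installing, you may want to confirm that the prerequisites listed above are present. The following is a minimal sketch that uses only the Python standard library; the function name `check_prerequisites` is hypothetical and not part of CM.

```python
import importlib.util
import shutil
import sys


def check_prerequisites():
    """Return a dict mapping each prerequisite to True (found) or False."""
    return {
        # CM needs Python 3 or newer.
        "python3": sys.version_info.major >= 3,
        # pip and pyyaml are looked up as importable modules
        # (pyyaml is imported under the name "yaml").
        "pip": importlib.util.find_spec("pip") is not None,
        "pyyaml": importlib.util.find_spec("yaml") is not None,
        # A Git client is looked up on the PATH.
        "git": shutil.which("git") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```

If pyyaml is reported as missing, this is usually harmless, since pip normally installs a package's declared dependencies along with it.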
Share some artifact
Without CM
Imagine you want to share the following with your colleagues via a GitHub repository: an image of a cat, a machine learning model, and a JSON file with experimental results such as inference time and image classification output.
First, you will likely create a GitHub repository and clone it on your local machine:
$ git clone {GitHub repo URL} my-cool-project
You may then create some directories to store your image, model and experiment:
$ cd my-cool-project
$ mkdir images
$ cp cool-cat.jpeg images
$ mkdir models
$ cp my-cool-model.onnx models
$ mkdir experiments
$ cp my-cool-result-20220404.json experiments
You will then likely create a README.md describing the structure and the content of your repository, and how you ran your experiment.
Anyone else will need to read this README file to understand the structure of your repository before they can reproduce your results or reuse some of the artifacts in their own project.
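To make the cost of this manual approach concrete, here is a sketch of the kind of helper a colleague might end up writing just to locate the artifacts in such a repository. It assumes the three directories created above. The function `list_artifacts` and its output layout are hypothetical and unrelated to CM's own format.

```python
from pathlib import Path

# Directory names taken from the manual layout above. Another repository
# would use its own ad-hoc names, which is the problem being illustrated.
ARTIFACT_DIRS = ("images", "models", "experiments")


def list_artifacts(repo_root):
    """Map each known directory to the sorted file names found inside it."""
    root = Path(repo_root)
    found = {}
    for kind in ARTIFACT_DIRS:
        directory = root / kind
        if directory.is_dir():
            found[kind] = sorted(
                p.name for p in directory.iterdir() if p.is_file()
            )
        else:
            # A missing directory is reported as an empty list.
            found[kind] = []
    return found


if __name__ == "__main__":
    print(list_artifacts("my-cool-project"))
```

The hard-coded directory names only work for this one repository, so every project with a different layout needs its own variant of the helper.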
Using CM
The idea behind CM is to let you perform similar steps, each prefixed by cm, so that CM can index your artifacts and make them findable and reusable:
$ cm repo pull my-cool-project --url={GitHub repo URL}
CM will pull and register this repository. You can find where it is located on your system using the following CM command:
$ cm repo find my-cool-project
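When scripting these steps, the same two commands can be invoked from Python by shelling out to the cm CLI. This is a minimal sketch, assuming cm is on the PATH after `pip3 install cmind`. The helper names `cm_command` and `run_cm` are hypothetical and not part of CM, and only the commands shown above are used.

```python
import shutil
import subprocess


def cm_command(automation, action, name, **flags):
    """Build an argv list such as ['cm', 'repo', 'pull', NAME, '--url=...']."""
    argv = ["cm", automation, action, name]
    argv += [f"--{key}={value}" for key, value in flags.items()]
    return argv


def run_cm(argv):
    """Run the command; return its stdout, or None if the program is absent."""
    if shutil.which(argv[0]) is None:
        return None
    result = subprocess.run(argv, capture_output=True, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    # Equivalent to: cm repo find my-cool-project
    print(run_cm(cm_command("repo", "find", "my-cool-project")))
```

Building the argument list separately from running it keeps the command easy to inspect or log before anything is executed.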
in the CM format
Go to your local Git repository or any project:
$ cd <my cool project>
News
- 2022 April 20: Join us at the public MLCommons community meeting. Register here.
- 2022 April 3: We presented our approach to bridge the growing gap between ML Systems research and production at the HPCA'22 workshop on benchmarking deep learning systems.
- 2022 March: We presented our concept to enable collaborative and reproducible ML Systems R&D at the SIAM'22 workshop on "Research Challenges and Opportunities within Software Productivity, Sustainability, and Reproducibility".
- 2022 March: We've released the first prototype of our toolkit based on your feedback and our practical experience reproducing 150+ ML and Systems papers and validating them in the real world.
Research and development
CM core enhancements
We use GitHub tickets to improve and enhance the CM core based on the feedback from our users!
CM-based automation recipes
- We work with the community to convert projects from ML and Systems papers into reusable CM artifacts and automation recipes. Feel free to suggest your own automation recipes to be reused by the community.
CM-based projects
- Towards modular MLPerf benchmark.
- MLPerf design space exploration.
- Automated deployment of Pareto-efficient ML Systems.
Resources
Acknowledgments and feedback
We thank the CK users, OctoML, MLCommons and all our colleagues for their valuable feedback and support!
Please don't hesitate to share your ideas and report encountered issues here.
Contacts
- Grigori Fursin - author and coordinator
- Arjun Suresh - coordinator and maintainer