cmind
Collective Mind toolkit
The Collective Mind toolkit (CM or CK2) transforms Git repositories, Docker containers, Jupyter notebooks and zip/tar files into a collective database of reusable artifacts and automation scripts with a unified interface and extensible meta descriptions.
It is motivated by our tedious experience reproducing 150+ ML and Systems papers, during which our colleagues spent many months analyzing the structure of ad-hoc projects, reproducing results, and validating them in the real world across different and continuously changing software, hardware, environments, data sets and settings.
That is why we have decided to develop a simple toolkit to help you share your artifacts, knowledge, experience and best practices with the world in a more reusable, automated, portable and reproducible way.
The CM toolkit is based on the Collective Knowledge concept, which has been validated over the past few years to enable collaborative ML and Systems R&D, modularize the MLPerf inference benchmark, and automate the development and deployment of Pareto-efficient ML Systems.
See a few related slides and a related article about "MLOps Is a Mess But That's to be Expected" (March 2022).
License
Apache 2.0
How it works
- Check this getting started tutorial to understand the Collective Mind concepts and try this toolkit.
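
To make the idea of a database of reusable artifacts with meta descriptions and a unified interface more concrete, here is a minimal conceptual sketch in plain Python. It is not the CM API and does not use the cmind package: the names `Artifact`, `Registry` and `access`, the request keys, and the `"return"` status code are all assumptions made up for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Artifact:
    """A reusable item described by extensible metadata (hypothetical)."""
    alias: str
    tags: set
    meta: dict = field(default_factory=dict)


class Registry:
    """Toy in-memory database of artifacts (hypothetical, not CM itself)."""

    def __init__(self):
        self._artifacts = {}

    def access(self, request):
        """Single entry point: every action takes a dict and returns a dict."""
        action = request.get("action")
        if action == "add":
            artifact = Artifact(
                alias=request["alias"],
                tags=set(request.get("tags", [])),
                meta=dict(request.get("meta", {})),
            )
            self._artifacts[artifact.alias] = artifact
            return {"return": 0}
        if action == "find":
            # An artifact matches when it carries all requested tags.
            wanted = set(request.get("tags", []))
            found = sorted(
                a.alias for a in self._artifacts.values() if wanted <= a.tags
            )
            return {"return": 0, "list": found}
        # Unknown actions are reported in the result instead of raising.
        return {"return": 1, "error": "unknown action: {!r}".format(action)}


registry = Registry()
registry.access({"action": "add", "alias": "image-classification",
                 "tags": ["ml", "benchmark"], "meta": {"source": "git"}})
registry.access({"action": "add", "alias": "paper-notebook",
                 "tags": ["notebook"]})
result = registry.access({"action": "find", "tags": ["ml"]})
```

Routing every operation through one dictionary-in, dictionary-out function is one way to keep such an interface uniform across very different kinds of artifacts; the getting started tutorial describes how the toolkit itself works.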
Community meetings
- TBA: Regular conf-calls
- TBD: Public notes
News
- 2022 April 20: Join us at the public MLCommons community meeting. Register here.
- 2022 April 3: We presented our approach to bridge the growing gap between ML Systems research and production at the HPCA'22 workshop on benchmarking deep learning systems.
- 2022 March: We presented our concept to enable collaborative and reproducible ML Systems R&D at the SIAM'22 workshop on "Research Challenges and Opportunities within Software Productivity, Sustainability, and Reproducibility".
- 2022 March: We released the first prototype of our toolkit based on your feedback and our practical experience reproducing 150+ ML and Systems papers and validating them in the real world.
Research and development
CM core enhancements
We use GitHub tickets to improve and enhance the CM core based on feedback from our users! Please don't hesitate to share your ideas and report any issues you encounter!
CM-based automation scripts
- We work with the community to transform R&D projects from ML and Systems papers into reusable CM artifacts and automation scripts. Feel free to suggest your own automation recipes to be reused by the community.
CM-based projects
- Universal benchmarking of computational systems.
- Towards modular MLPerf benchmark.
- MLPerf design space exploration.
- Automated deployment of Pareto-efficient ML Systems.
Resources
Acknowledgments
We thank the users and partners of the original CK framework, OctoML, MLCommons and all our colleagues for their valuable feedback and support!
Contacts
- Grigori Fursin - author and coordinator
- Arjun Suresh - coordinator and maintainer