
AI Optimization AppStore

Nebullvm is an ecosystem of open-source Apps that boost the performance of your AI systems. The optimization Apps are stack-agnostic and work with any library.

Data. Models. Hardware. These are not independent factors, and making optimal choices on all fronts is hard. Our open-source Apps help you combine these three factors seamlessly, bringing incredibly fast and efficient AI systems to your fingertips. Four App categories to push the boundaries of AI efficiency. Dozens of Apps.

If you like the idea, give us a star to show your support for the project ⭐

Accelerate Apps

Achieve sub-10 ms response times for any AI application, including generative and language models. Improve customer experience by serving near real-time inference.

  • Speedster: Automatically apply SOTA optimization techniques to achieve the maximum inference speed-up on your hardware (see the sketch after this list).
  • OptiMate: Interactive tool that guides savvy users toward the best inference performance for a given model/hardware setup.
  • LargeSpeedster: Automatically apply SOTA optimization techniques on large AI models to achieve the maximum acceleration on your hardware.
  • CloudSurfer: Discover the optimal inference hardware and cloud platform to run an optimized version of your AI model.
  • MatrixMaster: Boost your DL model's performance with MatrixMaster's custom-generated matrix multiplication algorithms (an open-source take on AlphaTensor).
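
To make the Accelerate workflow concrete, below is a minimal sketch of one optimization run with nebullvm's `optimize_model` entry point. The function path and argument names are assumptions based on this release's documentation, so check the docs linked below for the exact signature in your version.

```python
# A minimal sketch of an automatic optimization run with nebullvm.
# The `optimize_model` entry point and its arguments are assumptions
# based on this release's docs; verify the signature for your version.
import torch
import torchvision.models as models
from nebullvm.api.functions import optimize_model

model = models.resnet50()
# ~100 sample batches in the ((inputs,), label) format used by the API.
input_data = [((torch.randn(1, 3, 224, 224),), torch.tensor([0]))
              for _ in range(100)]

# nebullvm benchmarks the available backends (e.g. ONNX Runtime,
# TensorRT, OpenVINO) and returns the fastest model for this hardware.
optimized_model = optimize_model(
    model,
    input_data=input_data,
    optimization_time="constrained",  # skip the slowest search techniques
)

prediction = optimized_model(torch.randn(1, 3, 224, 224))
```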

Maximize Apps

Make your Kubernetes GPU infrastructure efficient. Simplify cluster management, maximize hardware utilization and minimize costs.

  • GPU Partitioner: Effortlessly maximize the utilization of GPU resources in a Kubernetes cluster through real-time dynamic partitioning.
  • GPUs Elasticity: Maximize Kubernetes GPU resource utilization with flexible and efficient elastic quotas.

Extract Apps

Don't settle for generic AI models. Extract domain-specific knowledge from large foundation models to create portable, highly efficient AI models tailored to your use case.

  • Promptify: Effortlessly fine-tune large language and multi-modal models with minimal data and hardware requirements using p-tuning.
  • LargeOracle Distillation: Leverage advanced knowledge distillation to extract a small, efficient model from a larger one (illustrated in the sketch below).
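
As a rough illustration of the technique behind LargeOracle Distillation, here is a generic sketch of soft-target knowledge distillation in PyTorch. It shows the idea only; nothing below is the App's actual API, and the tiny linear models are placeholders.

```python
# A generic PyTorch sketch of soft-target knowledge distillation
# (Hinton et al., 2015), not the App's actual API: the student learns
# to match the teacher's temperature-softened output distribution.
import torch
import torch.nn as nn
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.7):
    # Soft targets: KL divergence between softened distributions,
    # rescaled by T^2 so gradient magnitudes stay comparable.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    # Hard targets: ordinary cross-entropy on the true labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

teacher = nn.Linear(16, 10).eval()  # stand-in for the large model
student = nn.Linear(16, 10)         # the small model being extracted
optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)

x, y = torch.randn(32, 16), torch.randint(0, 10, (32,))
with torch.no_grad():
    teacher_logits = teacher(x)
optimizer.zero_grad()
loss = distillation_loss(student(x), teacher_logits, y)
loss.backward()
optimizer.step()
```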

Simulate Apps

The time for trial and error is over. Simulate the performance of large models on different computing architectures to reduce time-to-market, maximize accuracy, and minimize costs.

  • Simulinf: Simulate the inference performance of your AI model on different hardware and cloud platforms (see the sketch after this list).
  • TrainingSim: Easily simulate and optimize the training of large AI models on distributed infrastructure.
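
To show the kind of reasoning such a simulator automates, here is a back-of-the-envelope latency estimate based on the classic roofline model. The `Hardware` type, device names, and all workload and hardware numbers are hypothetical placeholders, not measured specs.

```python
# A back-of-the-envelope sketch of the kind of estimate an inference
# simulator produces, using the classic roofline model. The hardware
# and workload figures below are illustrative, not measured.
from dataclasses import dataclass

@dataclass
class Hardware:
    name: str
    peak_tflops: float        # peak compute, TFLOP/s
    mem_bandwidth_gbs: float  # memory bandwidth, GB/s

def roofline_latency_ms(flops: float, bytes_moved: float, hw: Hardware) -> float:
    # Latency is bounded by whichever resource saturates first:
    # compute (FLOPs / peak) or memory traffic (bytes / bandwidth).
    compute_s = flops / (hw.peak_tflops * 1e12)
    memory_s = bytes_moved / (hw.mem_bandwidth_gbs * 1e9)
    return max(compute_s, memory_s) * 1e3

# A ResNet-50-scale forward pass: ~8 GFLOPs, ~100 MB of traffic (rough).
for hw in [Hardware("gpu-a", 125, 900), Hardware("gpu-b", 65, 600)]:
    print(f"{hw.name}: {roofline_latency_ms(8e9, 1e8, hw):.2f} ms")
```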

Couldn't find the optimization App you were looking for? Please open an issue or contact us at info@nebuly.ai, and we'll be happy to develop it together.


Join the community | Contribute to the library

Installation | Get started | Notebooks | Benchmarks

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

nebullvm-0.6.0.tar.gz (90.0 kB, Source)

Built Distribution

nebullvm-0.6.0-py3-none-any.whl (157.0 kB, Python 3)
