package for machine learning meta-data management and analysis

Project description

ArangoML Pipeline

ArangoML Pipeline is a common and extensible Metadata Layer for Machine Learning Pipelines which allows Data Scientists and DataOps to manage all information related to their ML pipeline in one place.

News: ArangoML Pipeline Cloud offers a no-setup, free-to-try managed service for ArangoML Pipeline. An ArangoML Pipeline Cloud tutorial is also available that requires no installation or signup.

Quick Start

To skip to Part 2 of this series and get started without any installation (using ArangoML Pipeline Cloud), click:

The examples folder contains notebooks that illustrate the features of Arangopipe.

Overview

This overview is a part of a series of posts introducing ArangoML and showcasing its benefits to your machine learning pipelines. In this first post, we look at what exactly ArangoML is, with later posts in this series showcasing the different tools and use cases. You can follow along with this series by watching this repository or by signing up for the ArangoDB Newsletter.

If you have a use case you would like to see highlighted as a part of this series, please let us know on the ArangoML Slack channel.

ArangoML Tools/Stack

The ArangoDB tool stack for analytics allows you to explore relationships of interest in your ArangoDB graph. These include:

AQL: The ArangoDB Query Language is a declarative language that aims to be human-readable and supports the complex query patterns that are only possible within a multi-model database. These features make machine learning tasks more powerful and accessible to both the data scientists performing exploratory data analysis and the dev-ops personnel running ad hoc queries for maintenance and issue investigation. See the AQL Fundamentals for an overview of AQL and some examples of using AQL to query graphs in ArangoDB.
Pregel: Data engineers use Pregel, a system for large-scale graph processing, to implement computationally intensive data transformations, while data scientists use it for data analysis tasks on large graphs. See our Pregel tutorial for examples and an overview of using Pregel with ArangoDB.
ArangoDB-NetworkX Adapter: NetworkX is a widely used tool for graph analytics that provides robust implementations of many graph algorithms. If your application requires such algorithms, you can apply these implementations to your ArangoDB graphs: we provide an adapter that converts ArangoDB graphs to NetworkX graphs. This lets you combine ArangoDB’s robust, reliable, and high-performing storage with the wide range of graph analytics algorithms in NetworkX. We will cover our NetworkX implementation in greater detail in our upcoming network analytics post; be sure to sign up for our newsletter to be notified of the upcoming posts in this series. If you would like to explore some of the existing notebooks, you can find more examples in the repository.
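
As a rough illustration of the declarative style of AQL mentioned above, the following Python sketch builds a simple AQL query together with its bind variables. The collection name, the `tags` and `name` attributes, and the helper names are hypothetical and not part of Arangopipe; `run_query` assumes that `db` is a database handle from the python-arango driver, whose `db.aql.execute` accepts a `bind_vars` argument.

```python
from typing import Any, Dict, List, Tuple


def datasets_by_tag_query(collection: str, tag: str) -> Tuple[str, Dict[str, Any]]:
    """Build an AQL query and its bind variables (hypothetical schema).

    `@@collection` is a collection bind parameter and `@tag` is a value bind
    parameter, so no user input is concatenated into the query text.
    """
    query = (
        "FOR d IN @@collection "
        "FILTER @tag IN d.tags "
        "SORT d.name "
        "RETURN {name: d.name, tags: d.tags}"
    )
    # python-arango expects collection bind parameters keyed with a leading "@".
    bind_vars = {"@collection": collection, "tag": tag}
    return query, bind_vars


def run_query(db: Any, query: str, bind_vars: Dict[str, Any]) -> List[Any]:
    """Execute the query; `db` is assumed to be a python-arango database handle."""
    return list(db.aql.execute(query, bind_vars=bind_vars))


query, bind_vars = datasets_by_tag_query("datasets", "housing")
print(query)
print(bind_vars)
```

Keeping query construction separate from execution makes the query easy to inspect or test without a running ArangoDB instance.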

Use Cases

ArangoML Pipeline can benefit many scenarios, such as:

Capture of lineage information (e.g., Which dataset influences which model?)
Capture of audit information (e.g., a given model was trained two months ago with the following training/validation performance)
Reproducible model training
Model serving policy (e.g., Which model should be deployed in production based on training statistics)
Extension of existing ML pipelines through a simple Python/HTTP API
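
To make the audit and model serving use cases above more concrete, here is a minimal sketch that uses plain Python data structures rather than the Arangopipe API. The `TrainingRun` record, its fields, the sample values, and the `select_for_serving` policy are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class TrainingRun:
    """Hypothetical audit record for one training run."""
    model_name: str
    dataset_name: str
    trained_on: date
    validation_rmse: float


def select_for_serving(runs: List[TrainingRun], max_rmse: float) -> Optional[TrainingRun]:
    """Return the run with the lowest validation RMSE that meets the threshold."""
    eligible = [run for run in runs if run.validation_rmse <= max_rmse]
    return min(eligible, key=lambda run: run.validation_rmse, default=None)


# Illustrative values only; they are not results from any real experiment.
runs = [
    TrainingRun("lasso_v1", "housing_v1", date(2019, 10, 8), 0.62),
    TrainingRun("lasso_v2", "housing_v1", date(2019, 10, 14), 0.48),
    TrainingRun("ridge_v1", "housing_v1", date(2019, 10, 21), 0.51),
]

best = select_for_serving(runs, max_rmse=0.5)
print(best.model_name if best else "no eligible model")
```

Because each run records the dataset it was trained on and when, the same records can answer audit questions and drive a deployment decision.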

Learning Through Metadata

When machine learning pipelines are created, for example using TensorFlow Extended or Kubeflow, the capture of and access to metadata across the pipeline is vital. Typically, each component of such an ML pipeline produces or requires metadata, for example:

Data storage: size, location, creation date, checksum, ...
Feature Store (processed dataset): transformation, version, base datasets ...
Model Training: training/validation performance, training duration, ...
Model Serving: model lineage, serving performance, ...
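
As one possible way to capture the data storage metadata listed above, the sketch below collects the size, location, timestamp, and checksum of a file using only the Python standard library. The `DatasetMetadata` class and `describe_dataset` function are hypothetical names; the modification time is used as a stand-in for the creation date because a portable creation timestamp is not available on every platform.

```python
import hashlib
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from pathlib import Path


@dataclass
class DatasetMetadata:
    """Hypothetical metadata record for a stored dataset."""
    location: str
    size_bytes: int
    modified_at: str  # ISO 8601, UTC; stand-in for the creation date
    sha256: str


def describe_dataset(path: Path) -> DatasetMetadata:
    """Collect basic storage metadata for the file at `path`."""
    stat = path.stat()
    modified = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
    return DatasetMetadata(
        location=str(path.resolve()),
        size_bytes=stat.st_size,
        modified_at=modified.isoformat(),
        # Reads the whole file into memory; fine for a sketch, not for huge files.
        sha256=hashlib.sha256(path.read_bytes()).hexdigest(),
    )


def as_document(metadata: DatasetMetadata) -> dict:
    """Convert the record to a plain dict, e.g. for storage as a JSON document."""
    return asdict(metadata)
```

A dictionary in this shape could then be stored as a document in whichever metadata store the pipeline uses.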

In contrast to exploring relationships or entities in a graph, machine learning applications learn autonomously and continuously from new data. Machine learning pipelines are what make this possible, and these pipelines are managed through metadata.

Metadata describes the components and actions involved in building the machine learning pipeline. The steps involved in constructing the pipeline are expressed as a graph by most tools, making ArangoDB a natural fit to store and manage machine learning application metadata. Arangopipe is ArangoDB’s tool for managing machine learning pipelines. Machine learning libraries such as DGL accept NetworkX graphs as input. Therefore, the ArangoDB-NetworkX adapter can be used to develop machine learning applications with ArangoDB graphs using libraries such as DGL. An example of using this is available here.
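
The idea that pipeline steps form a graph can be sketched in a few lines of plain Python. This is not how Arangopipe stores its data; the node names and the `upstream` helper are hypothetical and only show how a lineage question such as "which datasets influenced this deployment?" becomes a graph traversal.

```python
from collections import deque
from typing import Dict, List, Set

# Edges point from an input to the artifacts derived from it (hypothetical names).
EDGES: Dict[str, List[str]] = {
    "dataset/raw_housing": ["featureset/housing_v1"],
    "featureset/housing_v1": ["model/lasso_v1", "model/ridge_v1"],
    "model/lasso_v1": ["deployment/prod"],
}


def upstream(node: str, edges: Dict[str, List[str]]) -> Set[str]:
    """Return every node that `node` was directly or indirectly derived from."""
    # Invert the edges so we can walk from an artifact back to its inputs.
    parents: Dict[str, List[str]] = {}
    for source, targets in edges.items():
        for target in targets:
            parents.setdefault(target, []).append(source)

    seen: Set[str] = set()
    queue = deque([node])
    while queue:  # breadth-first walk over the inverted edges
        current = queue.popleft()
        for parent in parents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


print(sorted(upstream("deployment/prod", EDGES)))
```

In a graph database the same question is expressed as a traversal query instead of hand-written code, which is the motivation for storing this metadata in a graph.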

Knowledge Graphs

Thanks to the features that come with using a multi-model database, it is possible to work with Knowledge Graphs (KGs) in ArangoDB; this combines machine readability with the human readability of a property graph. If you would like to learn more about using KGs with ArangoDB, take a look at the upcoming Knowledge Graphs in ArangoDB series.

Documentation

Please refer to the Arangopipe documentation for further information.

Stay Tuned

Be sure to sign up for our newsletter to be notified of the follow-up posts in this series!

This article is the first in a series covering the many different features of ArangoML. In the upcoming series, we will showcase topics such as:

Using ArangoML for common network analysis tasks such as discovering structures.
Using ArangoML to develop graph neural networks on graphs stored in ArangoDB.
Using ArangoML for common tasks associated with model management.

Release history

Version	Release date
0.0.70.0.2 (this version)	Sep 14, 2022
0.0.70.0.1	Dec 10, 2021
0.0.70.0.0	Dec 2, 2021
0.0.47.4.2	Oct 23, 2019
0.0.47.1	Oct 21, 2019
0.0.47	Oct 14, 2019
0.0.46	Oct 11, 2019
0.0.45	Oct 9, 2019
0.0.44	Oct 8, 2019
0.0.43	Oct 8, 2019
0.0.42	Sep 24, 2019
0.0.41	Sep 24, 2019
0.0.40	Sep 24, 2019
0.0.6.90	Feb 28, 2020
0.0.6.9.5	Oct 30, 2020
0.0.6.9.4	Apr 17, 2020
0.0.6.9.3	Mar 3, 2020
0.0.6.9.1	Mar 2, 2020
0.0.6.9.0	Feb 28, 2020
0.0.6.8.9	Feb 20, 2020
0.0.6.8.6	Feb 13, 2020
0.0.6.8.5	Feb 13, 2020
0.0.6.1	Feb 3, 2020
0.0.5.8	Dec 23, 2019

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

arangopipe-0.0.70.0.2.tar.gz (18.1 MB)

Uploaded Sep 14, 2022 Source

Built Distribution

arangopipe-0.0.70.0.2-py3-none-any.whl (40.8 kB)

Uploaded Sep 14, 2022 Python 3

File details

Details for the file arangopipe-0.0.70.0.2.tar.gz.

File metadata

Download URL: arangopipe-0.0.70.0.2.tar.gz
Upload date: Sep 14, 2022
Size: 18.1 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.1 CPython/3.8.13

File hashes

Hashes for arangopipe-0.0.70.0.2.tar.gz
Algorithm	Hash digest
SHA256	`9b0672ab08ec65354519ebc198168393ce0a12a922efe800ef700a8bcc78d67b`
MD5	`f0daad92c535ac4a5c43c9867f8f090d`
BLAKE2b-256	`e20d7119e841821ab0781739295053fa3e192e009060b2f903c75582ab9ffeec`

See more details on using hashes here.
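
One way to use the published hashes is to compare the SHA256 digest of a downloaded file against the value in the table above. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and it assumes the file has already been downloaded to a local path.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: Path, expected_hex: str) -> bool:
    """Return True if the file's SHA256 digest equals the published value."""
    expected = expected_hex.strip().lower()
    return hmac.compare_digest(sha256_of(path), expected)
```

For the source distribution, `expected_hex` would be the SHA256 value listed for arangopipe-0.0.70.0.2.tar.gz; a mismatch suggests a corrupted or altered download.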

File details

Details for the file arangopipe-0.0.70.0.2-py3-none-any.whl.

File metadata

Download URL: arangopipe-0.0.70.0.2-py3-none-any.whl
Upload date: Sep 14, 2022
Size: 40.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.1 CPython/3.8.13

File hashes

Hashes for arangopipe-0.0.70.0.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`4a71a1decd04b8e8db1f89b7f0c588d36993a7cfdd289f6d9294d6291870e755`
MD5	`bc8abbf4d68c23f7f69cf57faf703775`
BLAKE2b-256	`36bcc17f5e8cab8508f742d0abb990866b84646774513d5b890c18c06973f010`

See more details on using hashes here.
