
From Knowledge Graphs to Machine Learning!


Welcome to the SparkKG-ML Documentation

Welcome to the documentation for sparkKG-ML, a Python library designed to facilitate machine learning with Spark on semantic web and knowledge graph data.

SparkKG-ML is specifically built to bridge the gap between the semantic web data model and the powerful distributed computing capabilities of Apache Spark. By leveraging the flexibility of the semantic web and the scalability of Spark, sparkKG-ML empowers you to extract meaningful insights and build robust machine learning models on semantic web and knowledge graph datasets.

You can find the detailed documentation of sparkKG-ML here. This documentation serves as a comprehensive guide to understanding and effectively using sparkKG-ML. Here you will find detailed explanations of the library's core concepts, step-by-step tutorials to get you started, and a rich collection of code examples illustrating various use cases.

Key Features of sparkKG-ML:

  1. Seamless Integration: SparkKG-ML seamlessly integrates with Apache Spark, providing a unified and efficient platform for distributed machine learning on semantic web and knowledge graph data.

  2. Data Processing: With SparkKG-ML, you can easily preprocess semantic web data, handle missing values, perform feature engineering, and transform your data into a format suitable for machine learning.

  3. Scalable Machine Learning: SparkKG-ML leverages the distributed computing capabilities of Spark to enable scalable and parallel machine learning on large semantic web and knowledge graph datasets (see the sketch after this list).

  4. Advanced Algorithms: SparkKG-ML provides a wide range of machine learning algorithms specifically designed for semantic web and knowledge graph data, allowing you to tackle complex tasks within the context of knowledge graphs and the semantic web.

  5. Extensibility: SparkKG-ML is designed to be easily extended, allowing you to incorporate your own custom algorithms and techniques seamlessly into the library.
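
To make this concrete, here is a minimal sketch of how a DataFrame retrieved with sparkKG-ML's getDataFrame function (shown in the Getting Started section below) could feed a standard Spark ML pipeline. The pipeline stages and the "calories" label column are illustrative assumptions, not part of sparkKG-ML's API; only getDataFrame comes from the library, and the rest is stock pyspark.ml.

```python
# A minimal sketch, assuming `spark_df` was retrieved with
# DataAcquisition.getDataFrame (see Getting Started below) and contains a
# string column "recipe" plus a hypothetical numeric label column "calories".
# Everything below is standard pyspark.ml, not sparkKG-ML-specific API.
from pyspark.ml import Pipeline
from pyspark.ml.feature import StringIndexer, VectorAssembler
from pyspark.ml.regression import LinearRegression

indexer = StringIndexer(inputCol="recipe", outputCol="recipe_idx")
assembler = VectorAssembler(inputCols=["recipe_idx"], outputCol="features")
lr = LinearRegression(featuresCol="features", labelCol="calories")

pipeline = Pipeline(stages=[indexer, assembler, lr])
model = pipeline.fit(spark_df)          # training is distributed across the cluster
predictions = model.transform(spark_df)
predictions.select("recipe", "prediction").show()
```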

We hope this documentation proves to be a valuable resource as you explore the capabilities of sparkKG-ML and embark on your journey of machine learning with Spark on semantic web and knowledge graph data. Happy learning!

Installation Guide

This guide provides step-by-step instructions for installing the sparkKG-ML library. sparkKG-ML can be installed with pip or from source.

Installation via pip:

To install sparkKG-ML using pip, follow these steps:

  1. Open a terminal or command prompt.

  2. Run the following command to install the latest stable version of sparkKG-ML:

       
       pip install sparkkgml
    

This will download and install sparkKG-ML and its dependencies.

  3. Once the installation is complete, you can import sparkKG-ML into your Python projects and start using it for machine learning on semantic web and knowledge graph data.
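
As a quick sanity check, the snippet below imports the package and reports the installed version; it relies only on the distribution name sparkkgml used in the pip command above.

```python
# Minimal post-install check: import the package and report its version.
# Assumes only the distribution name "sparkkgml" from the pip command above.
from importlib.metadata import version

import sparkkgml  # raises ImportError if the installation did not succeed

print("sparkkgml version:", version("sparkkgml"))
```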

Installation from source:

To install sparkKG-ML from the source code, follow these steps:

  1. Clone the sparkKG-ML repository from GitHub using the following command:

       git clone https://github.com/IDIASLab/SparkKG-ML
    

This will create a local copy of the sparkKG-ML source code on your machine.

  2. Change into the sparkKG-ML directory:

       cd SparkKG-ML
    
  3. Run the following command to install sparkKG-ML and its dependencies:

       pip install .
    

This will install sparkKG-ML using the source code in the current directory.

  4. Once the installation is complete, you can import sparkKG-ML into your Python projects and start using it for machine learning on semantic web and knowledge graph data.

Congratulations! You have successfully installed the sparkKG-ML library. You are now ready to explore the capabilities of sparkKG-ML and leverage its machine learning functionalities.

For more details on how to use sparkKG-ML, please refer to the documentation.

Getting Started

Let's start with a basic example: we will retrieve data from a SPARQL endpoint and convert it into a Spark DataFrame using the getDataFrame function.

```python
# Import the required libraries
from sparkkgml.data_acquisition import DataAcquisition

# Create an instance of DataAcquisition
DataAcquisitionObject = DataAcquisition()

# Specify the SPARQL endpoint and query
endpoint = "https://recipekg.arcc.albany.edu/RecipeKG"
query = """
    PREFIX schema: <https://schema.org/>
    PREFIX recipeKG: <http://purl.org/recipekg/>
    SELECT ?recipe
    WHERE { ?recipe a schema:Recipe. }
    LIMIT 3
    """

# Retrieve the data as a Spark DataFrame
spark_df = DataAcquisitionObject.getDataFrame(endpoint=endpoint, query=query)
spark_df.show()
```

+------------------------------------------+
| recipe                                   |
+------------------------------------------+
| recipeKG:recipe/peanut-butter-tandy-bars |
| recipeKG:recipe/the-best-oatmeal-cookies |
| recipeKG:recipe/peach-cobbler-ii         |
+------------------------------------------+

The getDataFrame function queries the specified SPARQL endpoint and returns a Spark DataFrame that you can use for further analysis or machine learning tasks, as sketched below.
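
Because spark_df is an ordinary Spark DataFrame, all standard PySpark operations apply to it directly. The snippet below is a small illustration under that assumption; the derived "recipe_name" column is purely hypothetical and not part of sparkKG-ML.

```python
# A minimal sketch of downstream use, assuming `spark_df` comes from the
# example above. Everything here is standard PySpark, not sparkKG-ML API.
from pyspark.sql import functions as F

spark_df.printSchema()                 # inspect the inferred schema
print("row count:", spark_df.count())

# Hypothetical transformation: derive a short name from each recipe URI
named = spark_df.withColumn(
    "recipe_name", F.regexp_extract("recipe", r"([^/]+)$", 1)
)
named.show(truncate=False)
```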

License

SparkKG-ML was created by IDIAS Lab. It is licensed under the terms of the Apache License 2.0.

