A series of Widgets for Orange3 to work with Spark ML

Project description

A set of widgets for the Orange data mining suite to work with the Apache Spark ML API.

Requirements

  • Python >= 3.4

  • Pandas

  • Orange 3

Please follow the instructions to install Orange 3 first.

The main Orange project is hosted at https://github.com/biolab/orange3 and can be downloaded from http://orange.biolab.si
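Before installing, it can help to confirm that the requirements above are met. The following is a minimal sketch, not part of the add-on: the helper name missing_requirements is hypothetical, and the import names "pandas" and "Orange" are assumed to correspond to the Pandas and Orange 3 requirements.

```python
import importlib.util
import sys

# Minimum interpreter version stated in the requirements list.
MIN_PYTHON = (3, 4)

# Assumed import names: "pandas" for Pandas, "Orange" for Orange 3.
REQUIRED_MODULES = ("pandas", "Orange")


def missing_requirements(modules=REQUIRED_MODULES, min_python=MIN_PYTHON):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    # Compare only (major, minor) against the required minimum.
    if sys.version_info[:2] < min_python:
        problems.append("Python >= %d.%d is required" % min_python)
    for name in modules:
        # find_spec returns None when a top-level module cannot be located.
        if importlib.util.find_spec(name) is None:
            problems.append("module not found: " + name)
    return problems
```

Calling missing_requirements() with no arguments checks the defaults listed above; any strings it returns name what still needs to be installed.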

Features

  • A Spark Context.

  • A Hive Table.

  • A Dataframe from an SQL Query.

  • A Dataset Builder, which is essentially a call to VectorAssembler; this is useful before sending data to Estimators.

  • Transformers from the feature module.

  • Estimators from classification module.

  • Estimators from regression module.

  • Estimators from clustering module.

  • Evaluation from evaluator module.

  • A PySpark script executor + PySpark console.

  • DataFrame transformers for Pandas and Orange Tables.

… more coming soon!
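To illustrate what the Dataset Builder does conceptually, here is a rough sketch of the underlying PySpark call. It is not the widget's actual code: the function name assemble_features, the column names, and the use of SparkSession (available in Spark 2.0 and later) are assumptions made for the example.

```python
def assemble_features(df, input_cols, output_col="features"):
    """Combine the given numeric columns of a Spark DataFrame into one vector column."""
    # Imported lazily so this file can be loaded on a machine without Spark.
    from pyspark.ml.feature import VectorAssembler

    assembler = VectorAssembler(inputCols=list(input_cols), outputCol=output_col)
    # transform() returns a new DataFrame with the extra vector column appended.
    return assembler.transform(df)


def demo():
    """Build a tiny local DataFrame and show the assembled feature column."""
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .master("local[1]")
             .appName("assembler-demo")
             .getOrCreate())
    try:
        # Hypothetical columns: two numeric inputs and a label.
        df = spark.createDataFrame(
            [(1.0, 2.0, 0.0), (3.0, 4.0, 1.0)],
            ["x1", "x2", "label"],
        )
        assemble_features(df, ["x1", "x2"]).show()
    finally:
        spark.stop()
```

Estimators in Spark ML generally expect a single vector column of features, which is why an assembling step like this comes before the classification, regression, and clustering widgets.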

Installing

First, you need to have Apache Spark installed. Follow the instructions here: http://spark.apache.org/docs/latest/

Then you can do:

pip install Orange3-spark

or install the add-on from Orange’s Options | Add-ons menu. Note that installing from the Add-ons menu may fail if not all requirements can be satisfied.

If you require ODBC connectivity, you need to install pyodbc. Building it with pip requires the sql.h header, which is provided by the unixodbc-dev package on Linux.

If the installation succeeds, you should see a new section in Orange containing a series of widgets for the Spark ML API.
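Outside the Orange GUI, one way to check whether pip registered the package is to query the installed distribution metadata. This is only a sketch: it assumes Python 3.8 or later (where importlib.metadata is in the standard library), the helper name installed_version is hypothetical, and it does not confirm that the widgets actually load in Orange.

```python
from importlib import metadata  # standard library since Python 3.8


def installed_version(distribution):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# "Orange3-spark" is the name used in the pip command;
# "pyodbc" is only needed for ODBC connectivity.
for name in ("Orange3-spark", "pyodbc"):
    version = installed_version(name)
    print(name, version if version is not None else "not installed")
```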
