
High-Dimensional LASSO_spark

Project description

Hi-LASSO_spark

Hi-LASSO_Spark (High-Dimensional LASSO Spark) improves LASSO solutions for extremely high-dimensional data using PySpark. PySpark is the Python API for Apache Spark, a distributed framework for big data analysis. Spark is a computational engine that handles very large data sets by processing them in parallel, in batches.

Installation

Hi-LASSO_Spark supports Python 3.6+. Additionally, you will need numpy, scipy, and glmnet.

Hi-LASSO_Spark is available through PyPI and can be installed with pip:

pip install hi_lasso_spark
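
Before running the Quick Start, it can help to confirm that the listed dependencies are importable. The following is a minimal sketch, not part of Hi-LASSO_Spark: the helper name `missing_dependencies` is hypothetical, and it assumes each package's import name matches the name listed above (numpy, scipy, glmnet).

```python
import importlib.util


def missing_dependencies(module_names):
    """Return the names from module_names that cannot be located for import."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in module_names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumption: import names match the package names listed in this README.
    required = ["numpy", "scipy", "glmnet"]
    missing = missing_dependencies(required)
    print("Missing:", ", ".join(missing) if missing else "none")
```

If anything is reported as missing, installing it with pip before importing hi_lasso_spark should avoid an import error later on.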

Documentation

Read the documentation on readthedocs

Quick Start

# Data load
import pandas as pd
X = pd.read_csv('simulation_data_x.csv')
y = pd.read_csv('simulation_data_y.csv')

# General Usage
from hi_lasso_spark.Hi_LASSO_spark import HiLASSO_Spark

# Create a HiLasso model
model = HiLASSO_Spark(X, y, alpha=0.05, q1='auto', q2='auto', L=30, cv=5, node='auto', logistic=False)

# Fit the model
model.fit()

# Show the coefficients
model.coef_

# Show the p-values
model.p_values_

# Show the selected variables
model.selected_var_
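
The Quick Start reads `simulation_data_x.csv` and `simulation_data_y.csv`, which are not included on this page. The sketch below shows one way to generate comparable synthetic files for experimentation, using only the standard library. It assumes a sparse linear model with more features than samples; the function names, default sizes, and column names (`x0`, `x1`, ..., `y`) are illustrative choices and do not come from the Hi-LASSO_Spark project.

```python
import csv
import random


def make_simulation_data(n_samples=50, n_features=200, n_informative=5,
                         noise=0.1, seed=0):
    """Return (X, y, beta) drawn from a sparse linear model y = X @ beta + noise."""
    rng = random.Random(seed)
    # Only the first n_informative coefficients are non-zero.
    beta = [1.0 if j < n_informative else 0.0 for j in range(n_features)]
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)]
         for _ in range(n_samples)]
    y = [sum(x_j * b_j for x_j, b_j in zip(row, beta)) + rng.gauss(0.0, noise)
         for row in X]
    return X, y, beta


def write_csv(path, header, rows):
    """Write a header row followed by data rows to a CSV file."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)


if __name__ == "__main__":
    X, y, _ = make_simulation_data()
    # A header row is written so that pd.read_csv picks up column names.
    write_csv("simulation_data_x.csv", [f"x{j}" for j in range(len(X[0]))], X)
    write_csv("simulation_data_y.csv", ["y"], [[value] for value in y])
```

Because the true coefficients are known here, the variables reported after fitting can be compared against the first `n_informative` columns as a rough sanity check.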


Source Distribution

hi_lasso_spark-1.0.0.tar.gz (6.0 kB view details)

Uploaded Source

Built Distribution

hi_lasso_spark-1.0.0-py3-none-any.whl (6.8 kB view details)

Uploaded Python 3

File details

Details for the file hi_lasso_spark-1.0.0.tar.gz.

File metadata

  • Download URL: hi_lasso_spark-1.0.0.tar.gz
  • Upload date:
  • Size: 6.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.7.1 pkginfo/1.5.0.1 requests/2.22.0 requests-toolbelt/0.9.1 tqdm/4.42.1 CPython/3.7.6

File hashes

Hashes for hi_lasso_spark-1.0.0.tar.gz
Algorithm Hash digest
SHA256 549b3c8f659142f5e0cf358af7903ab083ab35b78996f1359a89183362ef8290
MD5 ebf55ff15005449a9203238008caa350
BLAKE2b-256 2c70392e24af9b5772f92a9b472977caf17ac3e4e84cdc24ff86c77da8db4282


File details

Details for the file hi_lasso_spark-1.0.0-py3-none-any.whl.

File metadata

  • Download URL: hi_lasso_spark-1.0.0-py3-none-any.whl
  • Upload date:
  • Size: 6.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.7.1 pkginfo/1.5.0.1 requests/2.22.0 requests-toolbelt/0.9.1 tqdm/4.42.1 CPython/3.7.6

File hashes

Hashes for hi_lasso_spark-1.0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 6f2f1072ddf8d1abf323d726563ca774e326750d81dfa3b8d7800c6c578005ca
MD5 6ca41650abf5f3ad722940b9959a35b6
BLAKE2b-256 24636c7ea8c028d00e99c2842af6b6027a769b13ec5c5f7525b0ff6c93e488ec

