PySparkAI CLI

SparkAI on CLI

Installation

Prerequisites

Java (JDK 8) is required because Spark/PySpark itself depends on it. Make sure the JAVA_HOME environment variable is set as well.

If your environment is already configured to run pyspark applications, you are good to go.
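
A quick way to confirm the Java prerequisite before installing is to inspect JAVA_HOME from Python. The snippet below is a minimal sketch, not part of pyspark-ai: the function name `check_java_home` is hypothetical, and it only checks that the variable is set and points to an existing directory.

```python
import os
from typing import List, Mapping


def check_java_home(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return a list of problems found; an empty list means the check passed.

    Hypothetical helper for illustration only, not provided by pyspark-ai.
    """
    java_home = env.get("JAVA_HOME", "")
    if not java_home:
        return ["JAVA_HOME is not set"]
    if not os.path.isdir(java_home):
        return [f"JAVA_HOME does not point to a directory: {java_home}"]
    return []


# Report any problems found in the current environment.
for problem in check_java_home():
    print(problem)
```

An empty result does not prove that the JDK version is suitable; it only rules out the most common misconfiguration.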

Set up your environment

Set your OpenAI API key as an environment variable:

export OPENAI_API_KEY='sk-...'

To use Google's search mechanism to find data on the web, you must also set a Google API key as an environment variable:

export GOOGLE_API_KEY='...'
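
To verify that the keys are visible to Python before running the CLI, a small check like the following may help. It is a sketch under assumptions: `missing_api_keys` is a hypothetical helper, not part of pyspark-ai, and it only tests that the variables are non-empty without validating them against OpenAI or Google.

```python
import os
from typing import List, Mapping


def missing_api_keys(env: Mapping[str, str] = os.environ,
                     use_google_search: bool = False) -> List[str]:
    """Return the names of required variables that are unset or blank.

    Hypothetical helper for illustration only. GOOGLE_API_KEY is checked
    only when Google search is going to be used.
    """
    required = ["OPENAI_API_KEY"]
    if use_google_search:
        required.append("GOOGLE_API_KEY")
    return [name for name in required if not env.get(name, "").strip()]


# Print only the variable names, never the key values themselves.
missing = missing_api_keys(use_google_search=True)
print("Missing variables:", ", ".join(missing) if missing else "none")
```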

Install pyspark-ai-cli

pip install git+https://github.com/lucas-lm/spark-ai-cli

Usage

Call the CLI from your shell:

python -m pyspark-ai "https://github.com/topics/google" --limit=20

Apply transformations to the source data:

pyspark-ai https://github.com/topics/google --transform "top 3 python repos with more stars"

By default, the LLM used behind the scenes is gpt-3.5-turbo, but you can change it with the --gpt-model-name flag:

pyspark-ai "https://github.com/topics/google" --transform "show me programming languages by stars from the most starred to the least starred" --gpt-model-name "gpt-4" --limit 20

Only OpenAI's LLMs are supported in the current version.

Warning

GPT-4 may not be generally available, so you may run into issues when using it.
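
When calling the CLI from a Python script, passing the arguments as a list avoids shell quoting mistakes such as placing a flag inside the quoted URL. The sketch below uses only the flags shown above (--transform, --gpt-model-name, --limit); the `build_command` helper is hypothetical and assumes the `pyspark-ai` executable is on your PATH.

```python
import shlex
from typing import List, Optional


def build_command(url: str,
                  transform: Optional[str] = None,
                  gpt_model_name: Optional[str] = None,
                  limit: Optional[int] = None) -> List[str]:
    """Assemble an argument list using only the flags shown in this README.

    Hypothetical helper for illustration only, not part of pyspark-ai.
    """
    args = ["pyspark-ai", url]
    if transform is not None:
        args += ["--transform", transform]
    if gpt_model_name is not None:
        args += ["--gpt-model-name", gpt_model_name]
    if limit is not None:
        args += ["--limit", str(limit)]
    return args


cmd = build_command("https://github.com/topics/google",
                    transform="top 3 python repos with more stars",
                    limit=20)
# Show the equivalent, safely quoted shell command.
print(shlex.join(cmd))

# To actually run it (needs the CLI installed and the API keys set):
# import subprocess
# subprocess.run(cmd, check=True)
```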
