BigFlowPrototypes

A package for preprocessing data for AI models.

Modules

Image.py

Converts images to feature vectors with VGG16 and stores the vectors in a database.

linearEmbed.py

Prepares a "normal" (tabular) dataset and converts the selected non-numeric columns to embeddings using OpenAI embeddings.

Linear.py

Removes non-numeric columns and prepares the remaining numeric data as input for a model.

Installation

pip install BigFlowPrototypes


Usage

Embed text columns with linearEmbed:

import os

from data_preprocessor import linearEmbed

if __name__ == "__main__":
    input_data_path = 'titanic.csv'  # Path to the input data file
    label_column = 'Survived'  # Name of the label column
    columns_to_embed = ['Name']  # Non-numeric columns to embed
    # Read the OpenAI API key from the environment instead of hard-coding it
    openai_api_key = os.environ['OPENAI_API_KEY']

    # main() returns the train/test split of features and labels
    X_train, X_test, y_train, y_test = linearEmbed.main(
        input_data_path, label_column, columns_to_embed, openai_api_key
    )
    print("Data processing complete.")
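The internals of linearEmbed are not shown on this page, so the following is only a sketch of the general idea it describes: replacing a text column with numeric embedding columns. The names `embed_columns` and `toy_embed` are hypothetical and are not part of the package. `toy_embed` is a deterministic stand-in so the example runs offline; in real use it would presumably be swapped for a call to an embeddings API.

```python
import hashlib

import pandas as pd


def toy_embed(text, dim=4):
    """Hypothetical stand-in for an embeddings API: maps text to `dim` floats in [0, 1]."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [digest[i] / 255.0 for i in range(dim)]


def embed_columns(df, columns_to_embed, embed_fn=toy_embed):
    """Replace each listed column with one numeric column per embedding dimension."""
    out = df.copy()
    for col in columns_to_embed:
        # Embed every value of the column (cast to str so mixed types are accepted)
        vectors = [embed_fn(str(value)) for value in out[col]]
        names = [f"{col}_emb_{i}" for i in range(len(vectors[0]))]
        emb = pd.DataFrame(vectors, columns=names, index=out.index)
        # Drop the original text column and append its embedding columns
        out = pd.concat([out.drop(columns=[col]), emb], axis=1)
    return out


df = pd.DataFrame({"Name": ["Alice", "Bob"], "Age": [30, 41], "Survived": [1, 0]})
embedded = embed_columns(df, ["Name"])
print(list(embedded.columns))
```

Passing the embedding function as a parameter keeps the column handling testable without network access or an API key.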


Drop non-numeric columns with Linear:

from data_preprocessor import Linear

if __name__ == "__main__":
    input_file = 'movies.csv'  # Path to the input data file
    label = 'Popularity'  # Name of the label column

    # main() returns the train/test split of features and labels
    X_train, X_test, y_train, y_test = Linear.main(input_file, label)
    print("Data processing complete.")
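As a rough illustration of what "removing non-numeric data" and returning a train/test split can look like, here is a minimal pandas sketch. It is not the package's implementation: `numeric_split`, its parameters, and the sample data are all hypothetical, and the split is done with `DataFrame.sample` rather than whatever the package uses.

```python
import pandas as pd


def numeric_split(df, label, test_frac=0.25, seed=0):
    """Keep numeric feature columns and return X_train, X_test, y_train, y_test."""
    y = df[label]
    # Drop the label, then keep only numeric feature columns
    X = df.drop(columns=[label]).select_dtypes(include="number")
    # Randomly pick the test rows; everything else is training data
    test_idx = X.sample(frac=test_frac, random_state=seed).index
    train_idx = X.index.difference(test_idx)
    return X.loc[train_idx], X.loc[test_idx], y.loc[train_idx], y.loc[test_idx]


# Made-up data for illustration only
movies = pd.DataFrame({
    "Title": ["A", "B", "C", "D"],
    "Budget": [10.0, 20.0, 30.0, 40.0],
    "Runtime": [90, 100, 110, 120],
    "Popularity": [1.5, 2.5, 3.5, 4.5],
})
X_train, X_test, y_train, y_test = numeric_split(movies, "Popularity")
print(list(X_train.columns), len(X_train), len(X_test))
```

The sketch assumes the DataFrame index is unique, which holds for the default index produced when reading a CSV.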

This version

0.1

Download files

Source Distribution

bigflowprototypes-0.1.tar.gz (5.5 kB)

Uploaded Source

Built Distribution

BigFlowPrototypes-0.1-py3-none-any.whl (7.0 kB)

Uploaded Python 3

File details

Details for the file bigflowprototypes-0.1.tar.gz.

File metadata

  • Download URL: bigflowprototypes-0.1.tar.gz
  • Size: 5.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.10.10

File hashes

Hashes for bigflowprototypes-0.1.tar.gz
Algorithm Hash digest
SHA256 147b819546ef043cd3e5aff7fd2ae8f9bb46cdb6f70566798ab31e5fc97ff6a9
MD5 44db1167c9b4ffb4556115eb5a52ec87
BLAKE2b-256 069d54d0d4a56c369a1d6ff3a79535d2852a8de140798579ce1eb52aff2b834b
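
A downloaded file can be checked against the SHA256 digest listed above. The following is a minimal sketch using only the standard library; the helper names `sha256_of` and `matches_expected` are illustrative, and the check only runs if the file is present in the current directory.

```python
import hashlib
import os

# SHA256 digest listed above for bigflowprototypes-0.1.tar.gz
EXPECTED = "147b819546ef043cd3e5aff7fd2ae8f9bb46cdb6f70566798ab31e5fc97ff6a9"


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_expected(path, expected=EXPECTED):
    """Compare the file's digest with the expected hex string (case-insensitive)."""
    return sha256_of(path) == expected.lower()


path = "bigflowprototypes-0.1.tar.gz"
if os.path.exists(path):
    print("hash OK" if matches_expected(path) else "hash MISMATCH")
```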

File details

Details for the file BigFlowPrototypes-0.1-py3-none-any.whl.

File hashes

Hashes for BigFlowPrototypes-0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 acdb41bed95c6cbae040842f170c98345dfb141f0c775ac0a86387ba97b72a34
MD5 b04a99dcc9311a1c5e33763f17c4ebd4
BLAKE2b-256 799cc665fe61c22dc33f4b2b62f9a8d5f9ecd44feaf758868d8f25339a9ae7db
