Project description

Python library for connecting to Elasticsearch, extracting data, and uploading it to Google Cloud Storage (GCS) in CSV format.

This library extracts data from an Elasticsearch index and uploads it directly to Google Cloud Storage as a CSV file. It is designed to make data migration from Elasticsearch to Google Cloud Storage straightforward by managing the connections and the data handling for you.

Features

  1. Connect to an Elasticsearch instance and fetch data.
  2. Convert the data to a pandas DataFrame and then to CSV format.
  3. Upload the CSV file directly to a specified Google Cloud Storage bucket.
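The three steps above can be sketched as three small functions. This is a minimal illustration, not the library's actual implementation: the function names are hypothetical, the response shape is the standard Elasticsearch search response (documents under hits.hits[*]._source), and the bucket argument is assumed to be a google.cloud.storage Bucket object.

```python
def hits_to_rows(response):
    """Step 1: pull the document bodies out of an Elasticsearch search response."""
    return [hit["_source"] for hit in response["hits"]["hits"]]


def rows_to_csv(rows):
    """Step 2: build a pandas DataFrame and serialise it as CSV text."""
    # Imported lazily so the other helpers can be used without pandas installed.
    import pandas as pd

    return pd.DataFrame(rows).to_csv(index=False)


def upload_csv(bucket, blob_name, csv_text):
    """Step 3: write the CSV text to an object in a GCS bucket.

    `bucket` is assumed to be a google.cloud.storage Bucket (or any object
    exposing the same blob(...).upload_from_string(...) interface).
    """
    blob = bucket.blob(blob_name)
    blob.upload_from_string(csv_text, content_type="text/csv")
```

Passing the bucket in as an argument, rather than creating the client inside the function, keeps the upload step easy to exercise with a stand-in object in tests.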

Installation

Install the package via pip:

pip install Elasticsearch_to_GCS_Connector

Dependencies

  1. elasticsearch: To connect and interact with Elasticsearch.
  2. google-cloud-storage: To handle operations related to Google Cloud Storage.
  3. pandas: To manage data in DataFrame format.

Make sure these are installed, for example with:

pip install elasticsearch google-cloud-storage pandas

Example Usage:

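# 'your_library' is a placeholder; replace it with the module name of the installed package.
from your_library import Elasticsearch_to_GCS_Connector

Elasticsearch_to_GCS_Connector(
    es_index_name='your_index',
    es_host='localhost',
    es_port=9200,                        # Elasticsearch's default HTTP port; adjust as needed
    es_scheme='http',
    es_http_auth=('user', 'password'),   # basic-auth username and password
    es_size=10000,                       # records to fetch in one query (the default)
    gcs_file_name='data.csv',
    gcs_bucket_name='your_bucket_name',
    gcs_bucket_name_prefix='your_prefix'
)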

Parameters:

  1. es_index_name (str): The name of the Elasticsearch index to query.
  2. es_host (str): The hostname of the Elasticsearch server.
  3. es_port (int): The port number on which the Elasticsearch server is listening.
  4. es_scheme (str): The protocol scheme (e.g., 'http' or 'https').
  5. es_http_auth (tuple): A tuple containing the username and password for basic authentication.
  6. es_size (int, optional): The number of records to fetch in one query (default is 10000).
  7. gcs_file_name (str): The name of the file to be saved on GCS.
  8. gcs_bucket_name (str): The name of the GCS bucket where the file will be uploaded.
  9. gcs_bucket_name_prefix (str, optional): Prefix for the file name in the bucket, useful for organizing files in folders.
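To illustrate how these parameters might fit together, the sketch below combines the three connection parameters into an Elasticsearch URL and joins the optional prefix with the file name to form a GCS object name. Both helpers and their names are hypothetical, and the way the library actually joins the prefix and file name is an assumption here; check the resulting object path in your bucket after a first run.

```python
import posixpath


def build_es_url(es_scheme, es_host, es_port):
    """Combine scheme, host and port into a URL such as http://localhost:9200."""
    return f"{es_scheme}://{es_host}:{es_port}"


def build_object_name(gcs_file_name, gcs_bucket_name_prefix=None):
    """Join an optional folder-like prefix and the file name with '/'.

    Assumption: the prefix behaves like a folder path inside the bucket.
    """
    if not gcs_bucket_name_prefix:
        return gcs_file_name
    # Strip stray slashes so 'exports', 'exports/' and '/exports' all behave the same.
    return posixpath.join(gcs_bucket_name_prefix.strip("/"), gcs_file_name)
```

Under these assumptions, build_object_name('data.csv', 'your_prefix') gives 'your_prefix/data.csv', and omitting the prefix leaves the file at the top level of the bucket.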

Additional Notes:

Ensure you have configured credentials for both Elasticsearch and Google Cloud:

  1. For Elasticsearch, provide the host, port, scheme, and authentication details.
  2. For Google Cloud Storage, ensure your environment is set up with the appropriate credentials (using Google Cloud SDK or setting the GOOGLE_APPLICATION_CREDENTIALS environment variable to your service account key file).
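A quick pre-flight check for the Google Cloud side can catch a missing or mistyped key path before any data is fetched. The helper below is a hypothetical sketch that only inspects the GOOGLE_APPLICATION_CREDENTIALS variable; if the variable is unset it returns None, which may simply mean you are relying on credentials configured through the Google Cloud SDK.

```python
import os


def find_service_account_key(environ=os.environ):
    """Return the key file path from GOOGLE_APPLICATION_CREDENTIALS, if set.

    Returns None when the variable is unset or empty, and raises
    FileNotFoundError when it points at a path that is not a file.
    """
    path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return None
    if not os.path.isfile(path):
        raise FileNotFoundError(
            f"GOOGLE_APPLICATION_CREDENTIALS points to a missing file: {path}"
        )
    return path
```

The environ parameter defaults to the real process environment; passing a plain dict makes the check easy to try out without changing your shell settings.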


This version

0.0


Source Distribution

Elasticsearch_to_GCS_Connector-0.0.tar.gz (3.1 kB view details)

Uploaded Source

Built Distribution

File details

Details for the file Elasticsearch_to_GCS_Connector-0.0.tar.gz.

File hashes

Hashes for Elasticsearch_to_GCS_Connector-0.0.tar.gz
Algorithm Hash digest
SHA256 706d23dcdfa935ea76ada51449f60e7790c5a07a0f80c7f1c200f5086cc2ec03
MD5 6f918aab8e030f10fb84b5f11f71d531
BLAKE2b-256 c7beb7368fab4ea7a17fc300d99f2773ff40ad9547ac701dcb05ccdb5a5ffad5


File details

Details for the file Elasticsearch_to_GCS_Connector-0.0-py3-none-any.whl.

File hashes

Hashes for Elasticsearch_to_GCS_Connector-0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 eee800853a7ba6b93278ec7c197cf883b376b2f097adaea1ca6891b488d99d62
MD5 0f7b087a9b58ce6a7f3916768dc84d98
BLAKE2b-256 9ee4277b07730bc9bdee66c0977e71e59de59bdb2b1da9d994cda186fd47f333
