
Project description

Python library for connecting to Elasticsearch and loading data into BigQuery.

This Python library provides utilities to extract data from Elasticsearch and load it directly into Google BigQuery. It simplifies the process of data migration between Elasticsearch and BigQuery by handling connection setup, data extraction, and data loading with optional timestamping.

Features

  1. Connect to an Elasticsearch instance and fetch data.
  2. Load data directly into a specified BigQuery table.
  3. Optional timestamping for record insertion.

Installation

Install the package via pip:

pip install Elasticsearch_to_BigQuery_Connector

Dependencies

  1. elasticsearch: To connect and interact with Elasticsearch.
  2. google-cloud-bigquery: To handle operations related to BigQuery.

Make sure both are installed by running:

pip install elasticsearch google-cloud-bigquery
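
For context, the following is a minimal sketch of how these two libraries are typically combined for a single extract-and-load step. It is illustrative only (it does not show this package's internals), assumes the elasticsearch-py 8.x client API, and uses placeholder host, index, table, and credential values:

from elasticsearch import Elasticsearch
from google.cloud import bigquery

# Fetch documents from Elasticsearch (placeholder host, index, and credentials).
es = Elasticsearch("http://localhost:9200", basic_auth=("user", "pass"))
response = es.search(index="your_index", size=10000, query={"match_all": {}})
rows = [hit["_source"] for hit in response["hits"]["hits"]]

# Stream the rows into BigQuery; the destination table must already exist
# with a schema that matches the documents.
client = bigquery.Client(project="your_project_id")
errors = client.insert_rows_json("your_project_id.your_dataset_id.your_table_name", rows)
if errors:
    raise RuntimeError(f"BigQuery streaming insert errors: {errors}")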

Example Usage:

from Elasticsearch_to_BigQuery_Connector import Elasticsearch_to_BigQuery_Connector

Elasticsearch_to_BigQuery_Connector(
    es_index_name='your_index',
    es_host='localhost',
    es_port=9200,                       # Elasticsearch port number (int)
    es_scheme='http',
    es_http_auth=('user', 'pass'),
    es_size=10000,                      # records fetched per query (default 10000)
    bq_project_id='your_project_id',
    bq_dataset_id='your_dataset_id',
    bq_table_name='your_table_name',
    bq_add_record_addition_time=True    # stamp each record with a load timestamp
)

Parameters:

  1. es_index_name (str): The name of the Elasticsearch index to query.
  2. es_host (str): The hostname of the Elasticsearch server.
  3. es_port (int): The port number on which the Elasticsearch server is listening.
  4. es_scheme (str): The protocol scheme (e.g., 'http' or 'https').
  5. es_http_auth (tuple): A tuple containing the username and password for basic authentication.
  6. es_size (int, optional): The number of records to fetch in one query (default is 10000).
  7. bq_project_id (str): The Google Cloud project ID.
  8. bq_dataset_id (str): The dataset ID within the Google Cloud project.
  9. bq_table_name (str): The table name where the data will be loaded.
  10. bq_add_record_addition_time (bool): If True, adds the current datetime to each record as a landloaddate field (see the sketch after this list).
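
That timestamping step can be pictured roughly as follows; this is a hedged sketch, and the exact value format the package writes for landloaddate is an assumption:

from datetime import datetime, timezone

# Hypothetical illustration of bq_add_record_addition_time: stamp each record
# with the load time just before it is written to BigQuery.
def add_load_timestamp(record: dict) -> dict:
    record["landloaddate"] = datetime.now(timezone.utc).isoformat()
    return record

add_load_timestamp({"user": "alice", "event": "login"})
# -> {'user': 'alice', 'event': 'login', 'landloaddate': '2025-01-01T12:00:00+00:00'}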

Additional Notes:

Ensure you have configured credentials for both Elasticsearch and Google Cloud (BigQuery):

  1. For Elasticsearch, provide the host, port, scheme, and authentication details.
  2. For BigQuery, ensure your environment is set up with the appropriate credentials, either through the Google Cloud SDK or by setting the GOOGLE_APPLICATION_CREDENTIALS environment variable to your service account key file (see the example below).
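
For example, from Python you can point the Google Cloud client libraries at a service account key file before the connector runs (the path is a placeholder):

import os

# Must be set before the BigQuery client is created.
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "/path/to/service-account-key.json"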


File details

Details for the file Elasticsearch_to_BigQuery_Connector-0.1.tar.gz.


File hashes

Hashes for Elasticsearch_to_BigQuery_Connector-0.1.tar.gz
Algorithm    Hash digest
SHA256       c1dd47cbe801f254da11498d0caab7ea33d2a5a55aadb2ad95ae180f490cf635
MD5          dad640d4a91dfc3fac220f074422895c
BLAKE2b-256  006218050ac1a634d608c65849a10ab59e3709b835a395516af2df763b38e3a7


File details

Details for the file Elasticsearch_to_BigQuery_Connector-0.1-py3-none-any.whl.


File hashes

Hashes for Elasticsearch_to_BigQuery_Connector-0.1-py3-none-any.whl
Algorithm    Hash digest
SHA256       0be3ef118a2b5a4c97a683e47570d7fc4fa74c09cd98fdc4b79c4a0d4a1ed7d4
MD5          a45067c89506d1dfd81dd426a22f7a0b
BLAKE2b-256  031d877fb72d9ab10b13fcf1c4374bb514c0377442d1110b2701d7dc786872b5

