Project description
Python Library for connecting to Elasticsearch and loading data into BigQuery.
This Python library provides utilities to extract data from Elasticsearch and load it directly into Google BigQuery. It simplifies the process of data migration between Elasticsearch and BigQuery by handling connection setup, data extraction, and data loading with optional timestamping.
Features
- Connect to an Elasticsearch instance and fetch data.
- Load data directly into a specified BigQuery table.
- Optional timestamping for record insertion.
Installation
Install the package via pip:
pip install Elasticsearch_to_BigQuery_Connector
Dependencies
- elasticsearch: To connect and interact with Elasticsearch.
- google-cloud-bigquery: To handle operations related to BigQuery.
If they are not already present in your environment, install them with:
pip install elasticsearch google-cloud-bigquery
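As a quick sanity check after installing, the sketch below reports which of the two dependencies cannot be found in the current environment. The helper name missing_dependencies is hypothetical and not part of this package; it only uses the standard library and the import names elasticsearch and google.cloud.bigquery.

```python
import importlib.util

# Import names of the two dependencies (these differ from the pip package names).
REQUIRED_MODULES = ["elasticsearch", "google.cloud.bigquery"]


def missing_dependencies(modules=REQUIRED_MODULES):
    """Return the module names from `modules` that cannot be located."""
    missing = []
    for name in modules:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package (for example `google`) is absent
            # or when a module has no usable spec.
            found = False
        if not found:
            missing.append(name)
    return missing


# Hypothetical usage: an empty list means both dependencies were found.
print(missing_dependencies())
```

An empty list means both modules are importable; any name that is printed still needs to be installed.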
Example Usage:
from Elasticsearch_to_BigQuery_Connector import Elasticsearch_to_BigQuery_Connector

# Extract documents from the index and load them into the BigQuery table.
Elasticsearch_to_BigQuery_Connector(
    es_index_name='your_index',
    es_host='localhost',
    es_port=9200,  # example value: Elasticsearch's default HTTP port
    es_scheme='http',
    es_http_auth=('user', 'pass'),
    es_size=10000,  # records per query; 10000 is the documented default
    bq_project_id='your_project_id',
    bq_dataset_id='your_dataset_id',
    bq_table_name='your_table_name',
    bq_add_record_addition_time=True
)
Parameters:
- es_index_name (str): The name of the Elasticsearch index to query.
- es_host (str): The hostname of the Elasticsearch server.
- es_port (int): The port number on which the Elasticsearch server is listening.
- es_scheme (str): The protocol scheme ('http' or 'https').
- es_http_auth (tuple): A (username, password) tuple for basic authentication.
- es_size (int, optional): The number of records to fetch in one query (default is 10000).
- bq_project_id (str): The Google Cloud project ID.
- bq_dataset_id (str): The dataset ID within the Google Cloud project.
- bq_table_name (str): The table name where the data will be loaded.
- bq_add_record_addition_time (bool): If True, adds the current datetime as landloaddate to each record.
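To make the effect of bq_add_record_addition_time more concrete, here is a minimal sketch of how records could be pulled out of a search response and stamped with landloaddate before loading. It is not the library's actual implementation: the helper names hits_to_records and add_load_timestamp and the sample data are hypothetical, the response shape is the standard Elasticsearch hits.hits[]._source layout, and the use of a UTC ISO 8601 string for the timestamp is an assumption.

```python
from datetime import datetime, timezone


def hits_to_records(response):
    """Pull the `_source` document out of every hit in a search response."""
    return [hit["_source"] for hit in response["hits"]["hits"]]


def add_load_timestamp(records, now=None):
    """Return copies of `records` with a `landloaddate` field added.

    Assumption: the timestamp is stored as a UTC ISO 8601 string.
    """
    stamp = (now or datetime.now(timezone.utc)).isoformat()
    return [{**record, "landloaddate": stamp} for record in records]


# Hand-written stand-in for what a search call might return.
sample_response = {
    "hits": {"hits": [
        {"_id": "1", "_source": {"user": "ana", "clicks": 3}},
        {"_id": "2", "_source": {"user": "bo", "clicks": 7}},
    ]}
}

rows = add_load_timestamp(
    hits_to_records(sample_response),
    now=datetime(2024, 1, 1, tzinfo=timezone.utc),  # fixed time for a repeatable example
)
print(rows[0])
```

Every record in a batch receives the same timestamp here, so rows loaded together can be identified later by their landloaddate value.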
Additional Notes:
Ensure you have configured credentials for both Elasticsearch and Google Cloud (BigQuery):
- For Elasticsearch, provide the host, port, scheme, and authentication details.
- For BigQuery, ensure your environment is set up with the appropriate credentials (using Google Cloud SDK or setting the GOOGLE_APPLICATION_CREDENTIALS environment variable to your service account key file).
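Before running a load, it can help to confirm that the BigQuery credentials are discoverable. The sketch below checks only the GOOGLE_APPLICATION_CREDENTIALS route described above; the function name credentials_status and its return values are hypothetical, and an "unset" result is not necessarily an error if you authenticate through the Google Cloud SDK instead.

```python
import os


def credentials_status(environ=os.environ):
    """Classify the GOOGLE_APPLICATION_CREDENTIALS setting.

    Returns "unset" if the variable is missing or empty, "missing_file" if it
    points to a path that is not a file, and "ok" otherwise.
    """
    path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return "unset"
    if not os.path.isfile(path):
        return "missing_file"
    return "ok"


# Hypothetical usage: report the status for the current environment.
print(credentials_status())
```

The check does not validate the contents of the key file; it only catches the common case of a variable that is unset or points to the wrong path.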