A module for ingesting data into Elasticsearch

📦 ES Ingester

ES Ingester is a Python CLI tool that ingests JSON or JSONL data directly into an Elasticsearch index. It supports multithreading, dynamic JSON extraction, configuration persistence, and optional metadata tagging.


🚀 Features

  • Flexible Input: Accepts JSON and JSONL data from stdin.
  • Dynamic JSON Key Extraction: Supports nested keys (e.g., -json 'data->0->result').
  • Multithreaded Ingestion: Speeds up ingestion with a configurable thread count.
  • Configuration Persistence: Saves Elasticsearch credentials to a config file for easy reuse.
  • Parent Field Addition: The optional -parent flag adds a key:value metadata field to each document.
  • Verbose Mode: Tracks ingestion progress in real time.
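
As a rough illustration of the multithreaded ingestion feature, the sketch below fans documents out to a thread pool. It is not ES Ingester's actual implementation: `ingest_all`, `index_one`, and `fake_index` are hypothetical names, and the real tool would send each document to Elasticsearch instead of calling a stub.

```python
from concurrent.futures import ThreadPoolExecutor


def ingest_all(docs, index_one, threads=4):
    """Send every document through index_one using a pool of worker threads.

    index_one is a hypothetical callable (doc -> result), for example a
    wrapper around an Elasticsearch index request.
    """
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # map() keeps results in input order and re-raises worker exceptions
        return list(pool.map(index_one, docs))


def fake_index(doc):
    """Stand-in for a real indexing call so the sketch runs without a cluster."""
    return {"indexed": doc["id"]}


results = ingest_all([{"id": 1}, {"id": 2}, {"id": 3}], fake_index, threads=2)
```

The `threads` argument plays the role of the -threads option: more workers help when each indexing call spends most of its time waiting on the network.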

🔧 Installation

pip install es-ingester

⚙️ Configuration

The first time you provide credentials, ES Ingester automatically generates a configuration file, .es_ingester_config.yaml, in your home directory. If that file already exists and contains valid credentials, ES Ingester uses it automatically.

Example .es_ingester_config.yaml

# ~/.es_ingester_config.yaml

es_host: "http://localhost:9200"
username: "your_username"
password: "your_password"
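
The persistence behaviour described above could be modelled roughly as follows. This is a sketch under assumptions, not the tool's code: `resolve_credentials` is a hypothetical helper, and the rule that command-line values take precedence over saved ones is an assumption the description does not confirm.

```python
CREDENTIAL_KEYS = ("es_host", "username", "password")


def resolve_credentials(cli, saved):
    """Merge CLI values with saved ones and report whether to rewrite the file.

    cli   -- values passed on the command line (missing or None if omitted)
    saved -- values loaded from ~/.es_ingester_config.yaml ({} if absent)
    Assumption: a value given on the command line wins over the saved one.
    """
    merged = {k: cli.get(k) or saved.get(k) for k in CREDENTIAL_KEYS}
    # Persist only when the CLI supplied something new or different
    should_save = any(
        cli.get(k) and cli.get(k) != saved.get(k) for k in CREDENTIAL_KEYS
    )
    return merged, should_save


saved = {"es_host": "http://localhost:9200", "username": "u", "password": "p"}
merged, should_save = resolve_credentials({}, saved)
```

With an empty `cli` dictionary the saved values are returned unchanged and nothing needs to be written, which matches the "use it automatically" case.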

🛠️ Usage

Ingest JSONL data using the saved configuration:

cat data.jsonl | es-ingester -indexname 'my_index' -jsonl

Command-line options, as printed by -h:

usage: es_ingester [-h] [-es_host ES_HOST] [-username USERNAME] [-password PASSWORD] -indexname INDEXNAME [-threads THREADS] [-json JSON] [-jsonl] [-verbose] [-parent PARENT] [-print PRINT]

Ingest data into Elasticsearch

options:
  -h, --help            show this help message and exit
  -es_host ES_HOST      Elasticsearch host URL
  -username USERNAME    Elasticsearch username
  -password PASSWORD    Elasticsearch password
  -indexname INDEXNAME  Index name to use
  -threads THREADS      Number of threads for ingestion
  -json JSON            Key for JSON extraction (e.g., "result")
  -jsonl                Indicates that stdin contains newline-separated JSON documents
  -verbose              Show progress of document ingestion
  -parent PARENT        Add a key-value pair to each document in the format key:value
  -print PRINT          Specify a key name to print from each document during ingestion
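
For readers who want to see how such an interface can be declared, here is a minimal argparse sketch that mirrors the help text above. It is a reconstruction from the listed options, not the project's source: `build_parser` is a hypothetical name, and the integer type for -threads is an assumption.

```python
import argparse


def build_parser():
    """Declare the options listed in the help output above."""
    parser = argparse.ArgumentParser(
        prog="es_ingester", description="Ingest data into Elasticsearch"
    )
    parser.add_argument("-es_host", help="Elasticsearch host URL")
    parser.add_argument("-username", help="Elasticsearch username")
    parser.add_argument("-password", help="Elasticsearch password")
    # Shown without brackets in the usage line, so it is required
    parser.add_argument("-indexname", required=True, help="Index name to use")
    # Assumption: the thread count is parsed as an integer
    parser.add_argument("-threads", type=int, help="Number of threads for ingestion")
    parser.add_argument("-json", help='Key for JSON extraction (e.g., "result")')
    parser.add_argument("-jsonl", action="store_true",
                        help="Indicates that stdin contains newline-separated JSON documents")
    parser.add_argument("-verbose", action="store_true",
                        help="Show progress of document ingestion")
    parser.add_argument("-parent",
                        help="Add a key-value pair to each document in the format key:value")
    parser.add_argument("-print",
                        help="Specify a key name to print from each document during ingestion")
    return parser


args = build_parser().parse_args(["-indexname", "my_index", "-jsonl", "-threads", "4"])
```

Flags declared with `action="store_true"` (-jsonl, -verbose) take no value and default to False when omitted.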

Specify JSON Key for Nested Arrays

Extract nested JSON data by specifying a key path:

cat data.json | es-ingester -indexname 'my_index' -json 'data->0->result'
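
One plausible way to interpret a key path such as 'data->0->result' is to split it on '->' and walk the parsed JSON one segment at a time. The sketch below assumes that segments are treated as list indices whenever the current value is a list; `extract_by_path` is a hypothetical helper, not a function exported by the package.

```python
def extract_by_path(doc, path, sep="->"):
    """Follow a 'a->0->b' style path through nested dicts and lists."""
    current = doc
    for part in path.split(sep):
        if isinstance(current, list):
            # Assumption: a segment applied to a list is a numeric index
            current = current[int(part)]
        else:
            current = current[part]
    return current


payload = {"data": [{"result": [{"id": 1}, {"id": 2}]}]}
docs = extract_by_path(payload, "data->0->result")
```

Here `docs` is the inner list of two documents, which is what would then be ingested into the index.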

Add Metadata with Parent Field

Add a domain field with the value example.com to each document:

cat data.jsonl | es-ingester -indexname 'my_index' -jsonl -parent 'domain:example.com'
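
The effect of -parent on a single document can be pictured with the small sketch below. `add_parent` is a hypothetical helper, and splitting on the first colon only (so that values may themselves contain colons) is an assumption about how the key:value string is parsed.

```python
def add_parent(doc, parent):
    """Return a copy of doc with the key:value pair from parent added."""
    # Assumption: only the first colon separates key from value
    key, sep, value = parent.partition(":")
    if not sep or not key:
        raise ValueError("expected a string in key:value format")
    return {**doc, key: value}


tagged = add_parent({"ip": "203.0.113.7"}, "domain:example.com")
```

Returning a new dictionary leaves the original document untouched, which keeps the helper safe to call from several threads.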

Full Command with Verbose Output

Ingest JSONL data with a specific host, user, and password, and show progress:

cat data.jsonl | es-ingester -es_host 'http://localhost:9200' -username 'user' -password 'pass' -indexname 'my_index' -jsonl -verbose

Notes

  • JSONL vs JSON: Use -jsonl for newline-separated JSON objects or -json to specify a nested key for JSON arrays.
  • Configuration Persistence: If ~/.es_ingester_config.yaml exists, it will be used by default.
  • Parent Field: Adding metadata with -parent is optional. Use key:value format (e.g., -parent 'source:api').
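
To make the JSONL vs JSON distinction concrete, the sketch below reads both kinds of input from a text stream. It only illustrates the two formats: `iter_jsonl` and `load_json` are hypothetical helpers, `io.StringIO` stands in for stdin, and skipping blank lines is an assumption.

```python
import io
import json


def iter_jsonl(stream):
    """Yield one document per non-empty line (newline-separated JSON)."""
    for line in stream:
        line = line.strip()
        if line:  # Assumption: blank lines are skipped
            yield json.loads(line)


def load_json(stream):
    """Parse the whole stream as a single JSON value."""
    return json.load(stream)


jsonl_docs = list(iter_jsonl(io.StringIO('{"id": 1}\n\n{"id": 2}\n')))
json_doc = load_json(io.StringIO('{"result": [{"id": 1}, {"id": 2}]}'))
```

In the JSONL case every line is already a document; in the JSON case the documents still have to be located inside the parsed value, which is where the -json key path comes in.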

Download files

Source Distribution

es_ingester-0.1.1.tar.gz (4.4 kB)

Uploaded Source

Built Distribution

es_ingester-0.1.1-py3-none-any.whl (4.9 kB)

Uploaded Python 3
