A module for ingesting data into Elasticsearch

📦 ES Ingester

ES Ingester is a Python CLI tool that ingests JSON or JSONL data directly into an Elasticsearch index. It supports multithreading, dynamic JSON extraction, configuration persistence, and optional metadata tagging.


🚀 Features

  • Flexible Input: Accepts JSON and JSONL data from stdin.
  • Dynamic JSON Key Extraction: Supports nested keys (e.g., -json 'data->0->result').
  • Multithreaded Ingestion: Speeds up ingestion with a configurable thread count.
  • Configuration Persistence: Saves Elasticsearch credentials to a config file for easy reuse.
  • Parent Field Addition: Optional -parent flag allows adding key-value metadata to each document.
  • Verbose Mode: Tracks progress in real time.
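
The multithreaded ingestion listed above can be pictured with a small sketch. This is not ES Ingester's actual code: `ingest_concurrently` and `fake_index` are hypothetical names, and the stub stands in for a real Elasticsearch index call so the example runs without a cluster.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def ingest_concurrently(
    docs: Iterable[dict],
    index_one: Callable[[dict], bool],
    threads: int = 4,
) -> int:
    """Send each document to index_one on a thread pool; return the success count.

    Hypothetical helper: the real tool's threading model is not documented here.
    """
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # pool.map preserves input order and re-raises worker exceptions.
        results = list(pool.map(index_one, docs))
    return sum(1 for ok in results if ok)


indexed = []


def fake_index(doc: dict) -> bool:
    """Stub for an Elasticsearch index request (assumption, no network I/O)."""
    indexed.append(doc)
    return True


count = ingest_concurrently([{"id": i} for i in range(10)], fake_index, threads=3)
```

Because indexing is network-bound, threads spend most of their time waiting on the server, which is why a thread pool can help even in Python.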

🔧 Installation

pip install es-ingester

⚙️ Configuration

The first time you provide credentials, ES Ingester saves them to .es_ingester_config.yaml in your home directory. On later runs, if that file exists and contains valid credentials, ES Ingester uses it automatically.

Example .es_ingester_config.yaml

# ~/.es_ingester_config.yaml

es_host: "http://localhost:9200"
username: "your_username"
password: "your_password"

🛠️ Usage

Ingest JSONL data with saved configuration:

cat data.jsonl | es-ingester -indexname 'my_index' -jsonl
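
To make the JSONL input format concrete, here is a minimal sketch of reading one JSON object per line. `iter_jsonl` is a hypothetical helper, not part of ES Ingester, and skipping blank lines is an assumption about reasonable behavior rather than documented behavior of the tool.

```python
import json
from typing import Any, Iterable, Iterator


def iter_jsonl(lines: Iterable[str]) -> Iterator[Any]:
    """Yield one parsed JSON value per non-empty line (hypothetical helper)."""
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # assumption: blank lines between records are ignored
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {line_number}: invalid JSON ({exc.msg})") from exc


# In a real pipeline the lines would come from sys.stdin.
sample = ['{"id": 1}\n', "\n", '{"id": 2}\n']
docs = list(iter_jsonl(sample))
```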

Specify JSON Key for Nested Arrays

Extract nested JSON data by specifying a key path:

cat data.json | es-ingester -indexname 'my_index' -json 'data->0->result'
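
One plausible reading of a key path such as 'data->0->result' is: split on "->", then treat each segment as a list index when the current value is a list and as a dictionary key otherwise. The sketch below follows that reading. `extract_by_path` is a hypothetical function, and the tool's real resolution rules may differ.

```python
import json
from typing import Any


def extract_by_path(document: Any, path: str, sep: str = "->") -> Any:
    """Walk a nested JSON value along a path like 'data->0->result'.

    Assumption: numeric segments index into lists, all other segments
    are dictionary keys.
    """
    current = document
    for part in path.split(sep):
        if isinstance(current, list):
            current = current[int(part)]
        else:
            current = current[part]
    return current


payload = json.loads('{"data": [{"result": [{"id": 1}, {"id": 2}]}]}')
records = extract_by_path(payload, "data->0->result")
```

Here `records` is the inner array, so each of its elements would become one document.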

Add Metadata with Parent Field

Add a domain field with the value example.com to each document:

cat data.jsonl | es-ingester -indexname 'my_index' -jsonl -parent 'domain:example.com'
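
The effect of -parent can be sketched as parsing the key:value string and merging the pair into every document. `parse_parent` and `add_parent` are hypothetical names. Splitting on the first colon only is an assumption, chosen so that values containing colons (such as URLs) stay intact.

```python
def parse_parent(spec: str) -> tuple[str, str]:
    """Split 'key:value' on the first colon (assumed format, hypothetical helper)."""
    key, sep, value = spec.partition(":")
    if not sep or not key:
        raise ValueError(f"expected key:value, got {spec!r}")
    return key, value


def add_parent(doc: dict, spec: str) -> dict:
    """Return a copy of doc with the parent key/value added."""
    key, value = parse_parent(spec)
    return {**doc, key: value}


tagged = add_parent({"ip": "10.0.0.1"}, "domain:example.com")
```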

Full Command with Verbose Output

Ingest JSONL data with a specific host, user, and password, and show progress:

cat data.jsonl | es-ingester -es_host 'http://localhost:9200' -username 'user' -password 'pass' -indexname 'my_index' -jsonl -verbose

  • JSONL vs JSON: Use -jsonl for newline-separated JSON objects or -json to specify a nested key for JSON arrays.
  • Configuration Persistence: If ~/.es_ingester_config.yaml exists, it will be used by default.
  • Parent Field: Adding metadata with -parent is optional. Use key:value format (e.g., -parent 'source:api').

Download files

Source Distribution

es_ingester-0.1.0.tar.gz (4.2 kB view details)

Uploaded Source

Built Distribution

es_ingester-0.1.0-py3-none-any.whl (4.5 kB view details)

Uploaded Python 3

File details

Details for the file es_ingester-0.1.0.tar.gz.

File metadata

  • Download URL: es_ingester-0.1.0.tar.gz
  • Upload date:
  • Size: 4.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.7

File hashes

Hashes for es_ingester-0.1.0.tar.gz
Algorithm Hash digest
SHA256 3ce05ca7f709668c70fb9a0b1fd671428d390eb82831768ce8e33014cb73dfee
MD5 6ba766008e37346d7287ce748c380177
BLAKE2b-256 474df28d43d259a41b9fb727b206c2d7cccc65f09bc0b24eeeeaf13aba4a0833

File details

Details for the file es_ingester-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: es_ingester-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 4.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.7

File hashes

Hashes for es_ingester-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 b129cadb99a70408ed8c982ac9725024da6b31e308868ce3c7f67109fa0d6f02
MD5 c371247bc6bf0c5d7de6759a173cd25d
BLAKE2b-256 b61afd561fcc9343f86e793db8a75d5211e22afdaee80b0121dc463af0832cac
