This is a pre-production deployment of Warehouse, however changes made here WILL affect the production instance of PyPI.
Latest Version Dependencies status unknown Test status unknown Test coverage unknown
Project Description

es2csv

A CLI tool for exporting data from Elasticsearch into a CSV file

Command line utility, written in Python, for querying Elasticsearch in Lucene query syntax or Query DSL syntax and exporting result as documents into a CSV file. This tool can query bulk docs in multiple indices and get only selected fields, this reduces query execution time.

Quick Look Demo

Installation

>From source:

$ pip install git+https://github.com/taraslayshchuk/es2csv.git

>From pip:

$ pip install es2csv

Usage

$ es2csv [-h] -q QUERY [-u URL] [-a AUTH] [-i INDEX [INDEX ...]]
         [-D DOC_TYPE [DOC_TYPE ...]] [-t TAGS [TAGS ...]] -o FILE
         [-f FIELDS [FIELDS ...]] [-d DELIMITER] [-m INTEGER] [-k]
         [-r] [-e] [-v] [--debug]

Arguments:
 -q, --query QUERY                        Query string in Lucene syntax.               [required]
 -o, --output_file FILE                   CSV file location.                           [required]
 -u, --url URL                            Elasticsearch host URL. Default is http://localhost:9200.
 -a, --auth                               Elasticsearch basic authentication in the form of username:password.
 -i, --index-prefixes INDEX [INDEX ...]   Index name prefix(es). Default is ['logstash-*'].
 -D, --doc_types DOC_TYPE [DOC_TYPE ...]  Document type(s).
 -t, --tags TAGS [TAGS ...]               Query tags.
 -f, --fields FIELDS [FIELDS ...]         List of selected fields in output. Default is ['_all'].
 -d, --delimiter DELIMITER                Delimiter to use in CSV file. Default is ",".
 -m, --max INTEGER                        Maximum number of results to return. Default is 0.
 -k, --kibana_nested                      Format nested fields in Kibana style.
 -r, --raw_query                          Switch query format in the Query DSL.
 -e, --meta_fields                        Add meta-fields in output.
 -v, --version                            Show version and exit.
 --debug                                  Debug mode on.
 -h, --help                               show this help message and exit

Examples

Searching on localhost and save to database.csv

$ es2csv -q 'host: localhost' -o database.csv

Same in Query DSL syntax

$ es2csv -r -q '{"query": {"match": {"host": "localhost"}}}' -o database.csv

Very long queries can be read from file

$ es2csv -r -q @'~/query string file.json' -o database.csv

With tag

$ es2csv -t dev -q 'host: localhost' -o database.csv

More tags

$ es2csv -t dev prod -q 'host: localhost' -o database.csv

On custom Elasticsearch host

$ es2csv -u my.cool.host.com:9200 -q 'host: localhost' -o database.csv

You are using secure Elasticsearch with nginx? No problem!

$ es2csv -u http://my.cool.host.com/es/ -q 'host: localhost' -o database.csv

Not default port?

$ es2csv -u my.cool.host.com:6666/es/ -q 'host: localhost' -o database.csv

With Authorization

$ es2csv -u http://login:password@my.cool.host.com:6666/es/ -q 'host: localhost' -o database.csv

With explicit Authorization

$ es2csv -a login:password -u http://my.cool.host.com:6666/es/ -q 'host: localhost' -o database.csv

Specifying index

$ es2csv -i logstash-2015-07-07 -q 'host: localhost' -o database.csv

More indexes

$ es2csv -i logstash-2015-07-07 logstash-2015-08-08 -q 'host: localhost' -o database.csv

Or index mask

$ es2csv -i logstash-2015-* -q 'host: localhost' -o database.csv

And now together

$ es2csv -i logstash-2015-01-0* logstash-2015-01-10 -q 'host: localhost' -o database.csv

Collecting all data on all indices

$ es2csv -i _all -q '*' -o database.csv

Specifying document type

$ es2csv -D log -i _all -q '*' -o database.csv

Selecting some fields, what you are interesting in, if you don’t need all of them (query run faster)

$ es2csv -f host status date -q 'host: localhost' -o database.csv

Selecting all fields, by default

$ es2csv -f _all -q 'host: localhost' -o database.csv

Selecting meta-fields: _id, _index, _score, _type

$ es2csv -e -f _all -q 'host: localhost' -o database.csv

Selecting nested fields

$ es2csv -f comments.comment comments.date comments.name -q '*' -i twitter -o database.csv

Max results count

$ es2csv -m 6283185 -q '*' -i twitter -o database.csv

Changing column delimiter in CSV file, by default ‘,’

$ es2csv -d ';' -q '*' -i twitter -o database.csv

Changing nested columns output format to Kibana style like

$ es2csv -k -q '*' -i twitter -o database.csv

An JSON document example

{
  "title": "Nest eggs",
  "body":  "Making your money work...",
  "tags":  [ "cash", "shares" ],
  "comments": [
    {
      "name":    "John Smith",
      "comment": "Great article",
      "age":     28,
      "stars":   4,
      "date":    "2014-09-01"
    },
    {
      "name":    "Alice White",
      "comment": "More like this please",
      "age":     31,
      "stars":   5,
      "date":    "2014-10-22"
    }
  ]
}

A CSV file in Kibana style format

body,comments.age,comments.comment,comments.date,comments.name,comments.stars,tags,title
Making your money work...,"28,31","Great article,More like this please","2014-09-01,2014-10-22","John Smith,Alice White","4,5","cash,shares",Nest eggs

A CSV file in default format

body,comments.0.age,comments.0.comment,comments.0.date,comments.0.name,comments.0.stars,comments.1.age,comments.1.comment,comments.1.date,comments.1.name,comments.1.stars,tags.0,tags.1,title
Making your money work...,28,Great article,2014-09-01,John Smith,4,31,More like this please,2014-10-22,Alice White,5,cash,shares,Nest eggs

Release Changelog

2.4.1 (2016-11-10)

  • Added –auth(-a) argument for Elasticsearch basic authentication. (Pull #17)
  • Added –doc_types(-D) argument for specifying document type. (Pull #13)

2.4.0 (2016-10-26)

  • Added JSON validation for raw query. (Issue #7)
  • Added checks to exclude hangs during connection issues. (Issue #9)
  • Updating version elasticsearch-py to 2.4.0 and freeze this dependence according to mask 2.4.*. (Issue #14)
  • Updating version progressbar2 to fix issue with visibility.

1.0.3 (2016-06-12)

  • Added option to read query string from file –query(-q) @’~/filename.json’. (Issue #5)
  • Added –meta_fields(-e) argument for selecting meta-fields: _id, _index, _score, _type. (Issue #6)
  • Updating version elasticsearch-py to 2.3.0.

1.0.2 (2016-04-12)

  • Added –raw_query(-r) argument for using the native Query DSL format.

1.0.1 (2016-01-22)

  • Fixed support elasticsearch-1.4.0.
  • Added –version argument.
  • Added history changelog.

1.0.0.dev1 (2016-01-04)

  • Fixed encoding in CSV to UTF-8. (Issue #3, Pull #1)
  • Added better progressbar unit names. (Pull #2)
  • Added pip installation instruction.

1.0.0.dev0 (2015-12-25)

  • Initial registration.
  • Added first dev-release on github.
  • Added first release on PyPI.
Release History

Release History

2.4.1

This version

History Node

TODO: Figure out how to actually get changelog content.

Changelog content for this version goes here.

Donec et mollis dolor. Praesent et diam eget libero egestas mattis sit amet vitae augue. Nam tincidunt congue enim, ut porta lorem lacinia consectetur. Donec ut libero sed arcu vehicula ultricies a non tortor. Lorem ipsum dolor sit amet, consectetur adipiscing elit.

Show More

2.4.0

History Node

TODO: Figure out how to actually get changelog content.

Changelog content for this version goes here.

Donec et mollis dolor. Praesent et diam eget libero egestas mattis sit amet vitae augue. Nam tincidunt congue enim, ut porta lorem lacinia consectetur. Donec ut libero sed arcu vehicula ultricies a non tortor. Lorem ipsum dolor sit amet, consectetur adipiscing elit.

Show More

1.0.3

History Node

TODO: Figure out how to actually get changelog content.

Changelog content for this version goes here.

Donec et mollis dolor. Praesent et diam eget libero egestas mattis sit amet vitae augue. Nam tincidunt congue enim, ut porta lorem lacinia consectetur. Donec ut libero sed arcu vehicula ultricies a non tortor. Lorem ipsum dolor sit amet, consectetur adipiscing elit.

Show More

1.0.2

History Node

TODO: Figure out how to actually get changelog content.

Changelog content for this version goes here.

Donec et mollis dolor. Praesent et diam eget libero egestas mattis sit amet vitae augue. Nam tincidunt congue enim, ut porta lorem lacinia consectetur. Donec ut libero sed arcu vehicula ultricies a non tortor. Lorem ipsum dolor sit amet, consectetur adipiscing elit.

Show More

1.0.1

History Node

TODO: Figure out how to actually get changelog content.

Changelog content for this version goes here.

Donec et mollis dolor. Praesent et diam eget libero egestas mattis sit amet vitae augue. Nam tincidunt congue enim, ut porta lorem lacinia consectetur. Donec ut libero sed arcu vehicula ultricies a non tortor. Lorem ipsum dolor sit amet, consectetur adipiscing elit.

Show More

1.0.0.dev1

History Node

TODO: Figure out how to actually get changelog content.

Changelog content for this version goes here.

Donec et mollis dolor. Praesent et diam eget libero egestas mattis sit amet vitae augue. Nam tincidunt congue enim, ut porta lorem lacinia consectetur. Donec ut libero sed arcu vehicula ultricies a non tortor. Lorem ipsum dolor sit amet, consectetur adipiscing elit.

Show More

1.0.0.dev0

History Node

TODO: Figure out how to actually get changelog content.

Changelog content for this version goes here.

Donec et mollis dolor. Praesent et diam eget libero egestas mattis sit amet vitae augue. Nam tincidunt congue enim, ut porta lorem lacinia consectetur. Donec ut libero sed arcu vehicula ultricies a non tortor. Lorem ipsum dolor sit amet, consectetur adipiscing elit.

Show More

Download Files

Download Files

TODO: Brief introduction on what you do with files - including link to relevant help section.

File Name & Checksum SHA256 Checksum Help Version File Type Upload Date
es2csv-2.4.1-py2-none-any.whl (11.4 kB) Copy SHA256 Checksum SHA256 py2 Wheel Nov 10, 2016
es2csv-2.4.1.tar.gz (8.2 kB) Copy SHA256 Checksum SHA256 Source Nov 10, 2016

Supported By

WebFaction WebFaction Technical Writing Elastic Elastic Search Pingdom Pingdom Monitoring Dyn Dyn DNS HPE HPE Development Sentry Sentry Error Logging CloudAMQP CloudAMQP RabbitMQ Heroku Heroku PaaS Kabu Creative Kabu Creative UX & Design Fastly Fastly CDN DigiCert DigiCert EV Certificate Rackspace Rackspace Cloud Servers DreamHost DreamHost Log Hosting