Tools for exporting Bitcoin blockchain data to JSON
Bitcoin ETL
Install Bitcoin ETL:
pip install bitcoin-etl
Export blocks and transactions (Schema, Reference):
> bitcoinetl export_blocks_and_transactions --start-block 0 --end-block 500000 \
--provider-uri http://user:pass@localhost:8332 --chain bitcoin \
--blocks-output blocks.json --transactions-output transactions.json
Supported chains:
- bitcoin
- bitcoin_cash
- dogecoin
- litecoin
- dash
- zcash
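The exported files can then be loaded for analysis. The following is a minimal sketch, assuming that `blocks.json` holds one JSON object per line (newline-delimited JSON); that layout is an assumption, not something stated above. The `iter_items` helper and the in-memory sample are hypothetical and stand in for the real file.

```python
import io
import json


def iter_items(lines):
    """Yield one dict per non-empty line of newline-delimited JSON."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


# Hypothetical sample standing in for the first lines of blocks.json.
# Assumption: the exporter writes one JSON object per line.
sample = io.StringIO(
    '{"number": 0, "transaction_count": 1}\n'
    '{"number": 1, "transaction_count": 1}\n'
)

blocks = list(iter_items(sample))
total_transactions = sum(block["transaction_count"] for block in blocks)
print(len(blocks), total_transactions)
```

To read a real export, pass an open file handle instead of the sample, e.g. `iter_items(open("blocks.json"))`.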
For the latest version, check out the repo and call
> pip install -e .
> python bitcoinetl.py
Schema
blocks.json
Field | Type |
---|---|
hash | hex_string |
size | bigint |
stripped_size | bigint |
weight | bigint |
number | bigint |
version | bigint |
merkle_root | hex_string |
timestamp | bigint |
nonce | hex_string |
bits | hex_string |
coinbase_param | hex_string |
transaction_count | bigint |
transactions.json
Field | Type |
---|---|
hash | hex_string |
size | bigint |
virtual_size | bigint |
version | bigint |
lock_time | bigint |
block_number | bigint |
block_hash | hex_string |
block_timestamp | bigint |
is_coinbase | boolean |
input_count | bigint |
output_count | bigint |
inputs | []transaction_input |
outputs | []transaction_output |
transaction_input
Field | Type |
---|---|
index | bigint |
spent_transaction_hash | hex_string |
spent_output_index | bigint |
script_asm | string |
script_hex | hex_string |
sequence | bigint |
addresses | []string |
value | bigint |
transaction_output
Field | Type |
---|---|
index | bigint |
script_asm | string |
script_hex | hex_string |
required_signatures | bigint |
type | string |
addresses | []string |
value | bigint |
You can find column descriptions in schemas
Notes:
- Output values returned by Dogecoin API had precision loss in the clients prior to version 1.14. It's caused by this issue https://github.com/dogecoin/dogecoin/issues/1558
The explorers that used older versions to export the data may show incorrect address balances and transaction amounts.
- For Zcash, `vjoinsplit` and `valueBalance` fields are converted to inputs and outputs with type 'shielded' https://zcash-rpc.github.io/getrawtransaction.html, https://zcash.readthedocs.io/en/latest/rtd_pages/zips/zip-0243.html
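As an illustration of how the `transactions.json` schema fits together, the sketch below computes a transaction fee as the sum of input values minus the sum of output values. It assumes that every input's `value` field is populated and that all values share one unit; both are assumptions, and the `transaction_fee` helper and the sample record are hypothetical.

```python
def transaction_fee(transaction):
    """Return inputs minus outputs, or 0 for a coinbase transaction.

    Assumption: every input carries a non-null `value` in the same unit
    as the output values.
    """
    if transaction["is_coinbase"]:
        # Coinbase transactions create new coins and pay no fee.
        return 0
    total_in = sum(item["value"] for item in transaction["inputs"])
    total_out = sum(item["value"] for item in transaction["outputs"])
    return total_in - total_out


# Hypothetical record using only fields from the schema above.
sample_transaction = {
    "is_coinbase": False,
    "inputs": [{"index": 0, "value": 50000}],
    "outputs": [{"index": 0, "value": 30000}, {"index": 1, "value": 19000}],
}

fee = transaction_fee(sample_transaction)
print(fee)
```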
Exporting the Blockchain
- Install Python 3.5.3+ https://www.python.org/downloads/
- Install Bitcoin node https://hackernoon.com/a-complete-beginners-guide-to-installing-a-bitcoin-full-node-on-linux-2018-edition-cb8e384479ea
- Start Bitcoin. Make sure it downloaded the blocks that you need by executing
$ bitcoin-cli getblockchaininfo
in the terminal. You can export blocks below the `blocks` value it reports; there is no need to wait for the full sync.
- Install Bitcoin ETL:
> pip install bitcoin-etl
- Export blocks & transactions:
> bitcoinetl export_all --start 0 --end 499999 \
--partition-batch-size 100 \
--provider-uri http://user:pass@localhost:8332 --chain bitcoin
The result will be in the `output` subdirectory, partitioned in Hive style:
output/blocks/start_block=00000000/end_block=00000099/blocks_00000000_00000099.csv
output/blocks/start_block=00000100/end_block=00000199/blocks_00000100_00000199.csv
...
output/transactions/start_block=00000000/end_block=00000099/transactions_00000000_00000099.csv
...
In case the `bitcoinetl` command is not available in PATH, use `python -m bitcoinetl` instead.
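The Hive-style layout can be reproduced programmatically, for example to check which partitions an export should have produced. The sketch below is not the tool's own code: `partition_paths` is a hypothetical helper that mirrors the first path in the listing above, and it assumes the end block is inclusive and block numbers are zero-padded to eight digits.

```python
def partition_paths(start, end, batch_size, entity="blocks", ext="csv"):
    """List the expected partition file paths for blocks start..end.

    Assumptions: `end` is inclusive and block numbers are zero-padded
    to 8 digits, as in the example listing.
    """
    paths = []
    for batch_start in range(start, end + 1, batch_size):
        # The last partition may be shorter than batch_size.
        batch_end = min(batch_start + batch_size - 1, end)
        first = "{:08d}".format(batch_start)
        last = "{:08d}".format(batch_end)
        paths.append(
            "output/{entity}/start_block={first}/end_block={last}/"
            "{entity}_{first}_{last}.{ext}".format(
                entity=entity, first=first, last=last, ext=ext
            )
        )
    return paths


for path in partition_paths(0, 199, 100):
    print(path)
```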
Running in Docker
- Install Docker https://docs.docker.com/install/
- Build a docker image
> docker build -t bitcoin-etl:latest .
> docker image ls
- Run a container out of the image
> docker run -v $HOME/output:/bitcoin-etl/output bitcoin-etl:latest export_blocks_and_transactions --start-block 0 --end-block 500000 \
--rpc-pass '' --rpc-host 'localhost' --rpc-user '' --blocks-output blocks.json --transactions-output transactions.json
Command Reference
All the commands accept the `--help` parameter for help, e.g.:
> bitcoinetl export_blocks_and_transactions --help
Usage: bitcoinetl.py export_blocks_and_transactions [OPTIONS]
Export blocks and transactions.
Options:
-s, --start-block INTEGER Start block
-e, --end-block INTEGER End block [required]
-b, --batch-size INTEGER The number of blocks to export at a time.
-p, --provider-uri TEXT The URI of the remote Bitcoin node
-w, --max-workers INTEGER The maximum number of workers.
--blocks-output TEXT The output file for blocks. If not provided
blocks will not be exported. Use "-" for stdout
--transactions-output TEXT The output file for transactions. If not
provided transactions will not be exported. Use
"-" for stdout
--help Show this message and exit.
For the `--blocks-output` and `--transactions-output` parameters the only supported format is json. The format is inferred from the output file name.
export_blocks_and_transactions
> bitcoinetl export_blocks_and_transactions --start-block 0 --end-block 500000 \
--provider-uri http://user:pass@localhost:8332 \
--blocks-output blocks.json --transactions-output transactions.json
Omit the `--blocks-output` or `--transactions-output` option if you want to export only transactions or only blocks, respectively.
You can tune `--batch-size` and `--max-workers` for performance.
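To give a feel for how a batch size and a worker count typically interact, here is a minimal sketch of the general pattern: split the block range into batches, then process the batches on a thread pool. It is not the tool's actual implementation. `split_into_batches`, `export_batch` and `export_range` are hypothetical names, `export_batch` is a placeholder for a real node request, and the end block is assumed to be inclusive.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_batches(start_block, end_block, batch_size):
    """Yield (first, last) block number pairs, with end_block inclusive."""
    for batch_start in range(start_block, end_block + 1, batch_size):
        yield batch_start, min(batch_start + batch_size - 1, end_block)


def export_batch(batch):
    """Placeholder for one request to the node covering a whole batch."""
    first, last = batch
    return list(range(first, last + 1))


def export_range(start_block, end_block, batch_size=100, max_workers=5):
    """Process all batches concurrently and return results in block order."""
    batches = list(split_into_batches(start_block, end_block, batch_size))
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map yields results in the order of the input batches.
        results = list(executor.map(export_batch, batches))
    return [number for batch_result in results for number in batch_result]


print(export_range(0, 6, batch_size=3, max_workers=2))
```

In this pattern a larger batch size means fewer, heavier requests, while more workers means more requests in flight at once; the best values depend on the node and the network.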
get_block_range_for_date
> bitcoinetl get_block_range_for_date --provider-uri http://user:pass@localhost:8332 --date=2017-03-01
This command is guaranteed to return the block range that covers all blocks with `block.time` on the specified date. However, the returned block range may also contain blocks outside the specified date, because block times are not monotonic (https://twitter.com/EvgeMedvedev/status/1073844856009576448). You can filter `blocks.json`/`transactions.json` with the below command:
> bitcoinetl filter_items -i blocks.json -o blocks_filtered.json \
-p "datetime.datetime.fromtimestamp(item['timestamp']).astimezone(datetime.timezone.utc).strftime('%Y-%m-%d') == '2017-03-01'"
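The effect of non-monotonic block times can be seen on a small example. The sketch below uses three hypothetical blocks in which block 2 has an earlier timestamp than block 1; the covering range for the date is blocks 1 to 3, yet block 2 falls on the previous day and has to be filtered out. The helper names are illustrative, and `utc_date` is an equivalent of the predicate above written with an explicit UTC time zone.

```python
from datetime import datetime, timezone


def utc_date(timestamp):
    """Format a Unix timestamp as a YYYY-MM-DD date in UTC."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).strftime("%Y-%m-%d")


def filter_by_date(items, date):
    """Keep only the items whose `timestamp` falls on `date` (UTC)."""
    return [item for item in items if utc_date(item["timestamp"]) == date]


midnight = int(datetime(2017, 3, 1, tzinfo=timezone.utc).timestamp())

# Hypothetical blocks: block 2 carries an earlier timestamp than block 1.
blocks = [
    {"number": 1, "timestamp": midnight},
    {"number": 2, "timestamp": midnight - 1},
    {"number": 3, "timestamp": midnight + 3600},
]

on_date = filter_by_date(blocks, "2017-03-01")
numbers_on_date = [block["number"] for block in on_date]
# The covering range spans block 2 even though it is not on the date.
covering_range = (min(numbers_on_date), max(numbers_on_date))
print(covering_range, numbers_on_date)
```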
export_all
> bitcoinetl export_all --provider-uri http://user:pass@localhost:8332 --start 2018-01-01 --end 2018-01-02
You can tune `--export-batch-size` and `--max-workers` for performance.
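The examples in this document pass both block numbers (`--start 0 --end 499999`) and dates (`--start 2018-01-01 --end 2018-01-02`) to `export_all`. How the tool tells them apart is not described here; the sketch below shows one plausible way to do it, with `parse_range_bound` being a hypothetical helper rather than part of the tool.

```python
from datetime import datetime


def parse_range_bound(value):
    """Classify a --start/--end value as a block number or a date.

    Assumption: anything that parses as an integer is a block number,
    and everything else must be a YYYY-MM-DD date.
    """
    try:
        return "block", int(value)
    except ValueError:
        # Raises ValueError again if the value is not a valid date.
        return "date", datetime.strptime(value, "%Y-%m-%d").date()


print(parse_range_bound("499999"))
print(parse_range_bound("2018-01-01"))
```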
Running Tests
> pip install -e .[dev]
> echo "The below variables are optional"
> export BITCOINETL_BITCOIN_PROVIDER_URI=http://user:pass@localhost:8332
> export BITCOINETL_LITECOIN_PROVIDER_URI=http://user:pass@localhost:8331
> export BITCOINETL_DOGECOIN_PROVIDER_URI=http://user:pass@localhost:8330
> export BITCOINETL_BITCOIN_CASH_PROVIDER_URI=http://user:pass@localhost:8329
> export BITCOINETL_DASH_PROVIDER_URI=http://user:pass@localhost:8328
> export BITCOINETL_ZCASH_PROVIDER_URI=http://user:pass@localhost:8327
> pytest -vv
Running Tox Tests
> pip install tox
> tox
Public Datasets in BigQuery
Coming Soon...