Collector framework to collect data from running Dwellir server nodes.
Dwellir Harvester
An extensible Python tool to collect metadata from local blockchain nodes and output a JSON file that conforms to a shared JSON Schema.
All timestamps use RFC 3339 / ISO 8601 with timezone.
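A minimal sketch of producing and parsing such a timestamp with Python's standard library. The helper name `utc_timestamp` is hypothetical and not part of the harvester; it only illustrates the format.

```python
from datetime import datetime, timezone


def utc_timestamp() -> str:
    """Return the current time as an RFC 3339 string with an explicit UTC offset."""
    # A timezone-aware datetime serializes with its offset ("+00:00" for UTC).
    return datetime.now(timezone.utc).isoformat(timespec="seconds")


stamp = utc_timestamp()
parsed = datetime.fromisoformat(stamp)  # round-trips, keeping the timezone
print(stamp, parsed.tzinfo is not None)
```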
System Requirements
- Python 3.9 or higher
- Systemd (for running as a service)
Installation
This repo depends on the shared SDK dwellir-harvester-lib. The default dependency points to a sibling checkout at ../dwellir-harvester-lib; adjust pyproject.toml if yours lives elsewhere or you install it from a package index.
From Source
- Clone the repository:
git clone https://github.com/your-org/dwellir-harvester.git
cd dwellir-harvester
- Create and activate a virtual environment:
python3 -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
- Install the package in development mode:
pip install -e .
System-wide Installation
For production use, you can install the package system-wide:
sudo pip install .
Quick Start
Run a Single Collection
# Run with the safe default "host" collector (basic system information)
dwellir-harvester collect host null --output out.json
Run the Daemon
The harvester can run as a daemon that periodically collects data and serves it via HTTP:
# Start the daemon (runs in foreground)
dwellir-harvester-daemon
By default, the daemon:
- Runs on 0.0.0.0:18080
- Collects data every 5 minutes
- Uses the "host" collector
- Validates output against the schema
Secure the Daemon with Tokens
The daemon can require a bearer token for all endpoints. Auth is disabled by default; set tokens to enable it.
- Single or multiple tokens via CLI:
dwellir-harvester-daemon --auth-token secret1 --auth-token secret2
- Environment variable:
DAEMON_AUTH_TOKENS=secret1,secret2 dwellir-harvester-daemon
- Token file (JSON/YAML list of { "token": "...", "label": "client-name", "enabled": true }):
dwellir-harvester-daemon --auth-token-file /path/tokens.json
# or
DAEMON_AUTH_TOKEN_FILE=/path/tokens.json dwellir-harvester-daemon
Requests must send Authorization: Bearer <token> (or X-Auth-Token). Invalid/missing tokens get 401 Unauthorized with WWW-Authenticate: Bearer.
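As an illustration, a client could attach the token like this. This is a sketch using only the standard library; the helper name `build_authorized_request` is hypothetical, and the URL and token are placeholders.

```python
import urllib.request


def build_authorized_request(url: str, token: str) -> urllib.request.Request:
    """Build a GET request carrying the bearer token in the Authorization header."""
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


# Placeholder values: replace the URL and token with your own.
request = build_authorized_request("http://127.0.0.1:18080/metadata", "secret1")
print(request.get_header("Authorization"))

# To actually send it (requires a running daemon):
# with urllib.request.urlopen(request, timeout=5) as response:
#     print(response.status)
```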
Tip: Put the token file somewhere readable by the daemon user, and keep raw secrets out of logs; only labels are logged on failures.
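One possible way to generate a token file in the JSON list format described above. The helper `write_token_file`, the use of `secrets.token_urlsafe`, and the owner-only file mode are assumptions for illustration, not something the daemon requires.

```python
import json
import os
import secrets
import tempfile
from pathlib import Path


def write_token_file(path: Path, labels: list) -> list:
    """Write one random token per label as a JSON list of token entries."""
    entries = [
        {"token": secrets.token_urlsafe(32), "label": label, "enabled": True}
        for label in labels
    ]
    path.write_text(json.dumps(entries, indent=2), encoding="utf-8")
    # Assumption: the daemon runs as the file owner, so owner-only access suffices.
    os.chmod(path, 0o600)
    return entries


# Written to a temporary directory here; pick a real location for production.
token_path = Path(tempfile.mkdtemp()) / "tokens.json"
entries = write_token_file(token_path, ["client-a", "client-b"])
print([entry["label"] for entry in entries])
```

The resulting path can then be passed via --auth-token-file or DAEMON_AUTH_TOKEN_FILE.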
Configuration
Command Line Arguments
usage: dwellir-harvester-daemon [-h] [--collectors COLLECTORS [COLLECTORS ...]] [--host HOST] [--port PORT] [--debug]
[--interval INTERVAL] [--schema SCHEMA] [--auth-token AUTH_TOKENS] [--auth-token-file AUTH_TOKEN_FILE]
[--collector-path COLLECTOR_PATH]
[--no-validate] [--log-level {DEBUG,INFO,WARNING,ERROR,CRITICAL}]
Dwellir Harvester Daemon
options:
-h, --help show this help message and exit
--collectors COLLECTORS [COLLECTORS ...]
List of collectors to run (default: ['host'])
--host HOST Host to bind the HTTP server to (default: 0.0.0.0)
--port PORT Port to run the HTTP server on (default: 18080)
--interval INTERVAL Collection interval in seconds (default: 300)
--schema SCHEMA Path to JSON schema file (defaults to bundled schema)
--auth-token AUTH_TOKENS
Bearer token to require for HTTP access (can be specified multiple times)
--auth-token-file AUTH_TOKEN_FILE
Path to JSON/YAML file containing token entries: [{"token": "...", "label": "...", "enabled": true}]
--collector-path COLLECTOR_PATH
Additional paths to search for collectors (can be repeated). Also honors HARVESTER_COLLECTOR_PATHS.
--no-validate Disable schema validation
--debug Enable debug logging
--log-level {DEBUG,INFO,WARNING,ERROR,CRITICAL}
Logging level (default: INFO)
Environment Variables
- DATA_DIR: Directory to store data files (default: /var/lib/dwellir-harvester)
- LOG_LEVEL: Logging level (default: INFO)
- PORT: HTTP server port (default: 18080)
- INTERVAL: Collection interval in seconds (default: 300)
- COLLECTORS: Space-separated list of collectors to run (default: host)
- VALIDATE: Enable/disable schema validation (default: true)
- DEBUG: Enable debug logging (default: false)
- DAEMON_AUTH_TOKENS: Comma-separated list of bearer tokens
- DAEMON_AUTH_TOKEN_FILE: Path to token file (JSON/YAML list of {token,label,enabled})
- HARVESTER_COLLECTOR_PATHS: Path list (os.pathsep-separated) to search for plugin collectors
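The variables above could be resolved along these lines. This is a sketch, not the daemon's actual parsing code: `load_settings`, the dictionary keys, and the accepted boolean spellings are assumptions; only the variable names, separators, and defaults come from the list above.

```python
import os
from typing import Mapping


def _as_bool(value: str) -> bool:
    # Assumption: common truthy spellings; the daemon may accept a different set.
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Resolve the documented variables, falling back to the documented defaults."""
    tokens = [t for t in env.get("DAEMON_AUTH_TOKENS", "").split(",") if t]
    return {
        "data_dir": env.get("DATA_DIR", "/var/lib/dwellir-harvester"),
        "log_level": env.get("LOG_LEVEL", "INFO"),
        "port": int(env.get("PORT", "18080")),
        "interval": int(env.get("INTERVAL", "300")),
        "collectors": env.get("COLLECTORS", "host").split(),  # space-separated
        "validate": _as_bool(env.get("VALIDATE", "true")),
        "debug": _as_bool(env.get("DEBUG", "false")),
        "auth_tokens": tokens,  # comma-separated; empty list means auth disabled
    }


settings = load_settings({"PORT": "9000", "COLLECTORS": "host null"})
print(settings["port"], settings["collectors"], settings["validate"])
```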
Plugin collectors
- Add your collector module to a directory and point the CLI/daemon to it:
dwellir-harvester collect sample_plugin --collector-path ./examples/plugins
- You can also set HARVESTER_COLLECTOR_PATHS=./examples/plugins to make the paths available without flags.
- Run a collector class directly (SDK runner):
python -m dwellir_harvester.lib.run examples.plugins.sample_collector:SamplePluginCollector
# or add a path for local plugins
python -m dwellir_harvester.lib.run runner_plugin:RunnerPlugin --collector-path ./examples/plugins
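Because HARVESTER_COLLECTOR_PATHS is separated by os.pathsep, the separator differs between platforms (":" on POSIX, ";" on Windows). A small sketch of building and splitting such a value portably; the helper names and the second directory are hypothetical.

```python
import os


def join_collector_paths(paths: list) -> str:
    """Join plugin directories into a single HARVESTER_COLLECTOR_PATHS value."""
    return os.pathsep.join(paths)


def split_collector_paths(value: str) -> list:
    """Split a HARVESTER_COLLECTOR_PATHS value back into directories."""
    return [part for part in value.split(os.pathsep) if part]


# "/opt/harvester/plugins" is an illustrative directory, not a documented default.
plugin_dirs = ["./examples/plugins", "/opt/harvester/plugins"]
value = join_collector_paths(plugin_dirs)
os.environ["HARVESTER_COLLECTOR_PATHS"] = value
print(split_collector_paths(value))
```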
Running as a Systemd Service
- Install the service using the provided script:
sudo scripts/install-service.sh
- The service will be installed with default settings. You can customize it by editing:
sudo nano /etc/dwellir-harvester/config
- Start and enable the service:
sudo systemctl start dwellir-harvester
sudo systemctl enable dwellir-harvester
- Check the status:
systemctl status dwellir-harvester
- View logs:
journalctl -u dwellir-harvester -f
Available Collectors
- host - Collects basic system information (default)
- null - A dummy collector
- dummychain - Collects data from a dummychain (snap install dummychain --edge)
API Endpoints
- GET /metadata - Get the latest collected data
- GET /healthz - Health check endpoint
Development
Setting Up for Development
- Clone the repository and install development dependencies:
git clone https://github.com/your-org/dwellir-harvester.git
cd dwellir-harvester
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
- Run tests:
python -m pytest
- Run linters:
black .
isort .
mypy .
Local/offline dev & test bench (with sibling lib)
If you keep the library as a sibling checkout and are working without internet access, use this flow:
- Create a virtualenv that can see system packages (for setuptools) and activate it:
python3 -m venv --system-site-packages .venv
source .venv/bin/activate
- Install the library from the sibling path without build isolation (avoids downloading build deps):
pip install --no-build-isolation -e ../dwellir-harvester-lib
- Install this app in editable mode without pulling extra deps (already satisfied by the lib/env):
pip install --no-build-isolation --no-deps -e .
- Run the test bench:
python -m pytest
If setuptools is missing in the venv, recreate it with --system-site-packages as above. If your lib lives elsewhere, adjust the path in the library install step (the second step above) or edit pyproject.toml accordingly.
Adding a New Collector
Collectors now live in the dwellir-harvester-lib repo.
Add your collector to ../dwellir-harvester-lib/src/dwellir_harvester/collectors/, implement it with BlockchainCollector or GenericCollector, and export it via __all__ in that package (or ship it as a plugin via entry points/--collector-path).
License
MIT
HTTP API
The daemon exposes a small HTTP API on 0.0.0.0:<service.port> (default :18080).
Endpoints
- GET /healthz → plain text "ok" if the daemon is serving.
- GET /metadata → the latest JSON document (200) or {"error":"metadata not found"} (404).
curl examples
# Health
curl -s http://127.0.0.1:18080/healthz
# Current metadata (pretty-print)
curl -s http://127.0.0.1:18080/metadata | jq .
If you changed the port via service.port, replace 18080 in the examples.
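The same calls can be made from Python. The sketch below uses only the standard library; the function names are hypothetical, it assumes auth is disabled (any status other than 200 or 404 raises), and the `{"metadata": {}}` body in the demonstration is a placeholder rather than a real document.

```python
import json
import urllib.error
import urllib.request
from typing import Optional


def interpret_metadata_response(status: int, body: bytes) -> Optional[dict]:
    """Map a /metadata response to a document, or None when none exists yet."""
    if status == 200:
        return json.loads(body)
    if status == 404:
        return None
    raise RuntimeError(f"unexpected status {status}")


def fetch_metadata(base_url: str = "http://127.0.0.1:18080") -> Optional[dict]:
    """Fetch /metadata; urllib raises HTTPError on 404, so both paths are handled."""
    try:
        with urllib.request.urlopen(f"{base_url}/metadata", timeout=5) as response:
            return interpret_metadata_response(response.status, response.read())
    except urllib.error.HTTPError as error:
        return interpret_metadata_response(error.code, error.read())


# Offline demonstration of the two documented outcomes (no daemon needed):
found = interpret_metadata_response(200, b'{"metadata": {}}')
missing = interpret_metadata_response(404, b'{"error":"metadata not found"}')
print(found, missing)
```

With a daemon running, `fetch_metadata()` returns the latest document, or None until the first collection has produced one.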
Testing
python -m pip install -e .
pip install pytest
pytest -q
Build & publish (Python package)
# build
python3 -m pip install build twine
python3 -m build # creates dist/*.tar.gz and dist/*.whl
# upload to TestPyPI first (recommended)
python3 -m twine upload -r testpypi dist/*
# Test the install from TestPyPI in a fresh virtual environment
python3 -m venv .venv
source .venv/bin/activate
# Pull in dependencies from the real PyPI; this is only needed when installing from TestPyPI
pip3 install "jsonschema>=4.25.1" "psutil>=7.1.3" "requests>=2.32.5"
# Install the package from TestPyPI
pip3 install --index-url https://test.pypi.org/simple/ --no-deps dwellir-harvester
# then to PyPI
python3 -m twine upload dist/*
Notes
- The framework fills metadata.last_collect_attempt_at, metadata.last_collect_status, and metadata.last_successful_collect_at automatically.
- Collectors should only return the blockchain and workload portions.
- For recoverable issues, raise CollectorPartialError with warnings; the run will be marked partial and still emit output.
- For unrecoverable issues, raise CollectorFailedError.
- The null collector is intended for smoke tests, CI, and “no local client” deployments.
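To make the error contract above concrete, here is a self-contained sketch of how a runner might translate the two exceptions into status fields. Everything in it is an assumption for illustration: the exception classes are local stand-ins (the real ones come from the SDK and their constructors may differ), `run_once` is a hypothetical helper, and the "success" and "failed" status labels and the "warnings" key are guesses. Only the "partial" status and the three metadata field names come from the notes above.

```python
from datetime import datetime, timezone


class CollectorPartialError(Exception):
    """Local stand-in for the SDK class; the real signature may differ."""

    def __init__(self, partial: dict, warnings: list):
        super().__init__("partial collection")
        self.partial = partial
        self.warnings = warnings


class CollectorFailedError(Exception):
    """Local stand-in for the SDK class."""


def run_once(collect) -> dict:
    """Run one collection and fill the metadata fields around its result."""
    now = datetime.now(timezone.utc).isoformat()
    metadata = {"last_collect_attempt_at": now}
    try:
        payload = collect()
        metadata["last_collect_status"] = "success"  # assumed label
        metadata["last_successful_collect_at"] = now
    except CollectorPartialError as error:
        payload = error.partial  # output is still emitted
        metadata["last_collect_status"] = "partial"
        metadata["warnings"] = error.warnings  # hypothetical field
    except CollectorFailedError:
        payload = {}
        metadata["last_collect_status"] = "failed"  # assumed label
    return {"metadata": metadata, **payload}


def flaky_collector() -> dict:
    # A recoverable problem: return what is known and attach a warning.
    raise CollectorPartialError({"workload": {}}, ["client version unavailable"])


result = run_once(flaky_collector)
print(result["metadata"]["last_collect_status"])
```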