# FDSNWS-Availability Deployment

## Overview
- WFCatalog DB (gray) - database used by the WFCatalog collector and API.
- FDSNWS-Availability API (blue) - Flask-based FDSNWS-Availability implementation.
- FDSNWS-Availability Cache (green) - Redis-based cache to store restriction information.
- FDSNWS-Availability Cacher (orange) - Python-based container to harvest and store restriction information.
- FDSNWS-Availability Update (purple) - JS script to fill the `availability` materialized view using the WFCatalog `daily_streams` and `c_segments` collections.
The following implementation requires MongoDB v4.2 or higher.
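As a quick sanity check, the version string reported by `db.version()` in `mongosh` (or `server_info()["version"]` in pymongo) can be compared against this requirement. The helper below is an illustrative sketch, not part of the repository:

```python
def meets_min_version(version: str, minimum: str = "4.2") -> bool:
    """Return True if a MongoDB version string meets the minimum requirement."""
    def parse(v: str) -> tuple:
        return tuple(int(part) for part in v.split("."))

    a, b = parse(version), parse(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so "4.2" compares correctly against "4.2.0"
    return a + (0,) * (width - len(a)) >= b + (0,) * (width - len(b))

print(meets_min_version("4.2.17"))  # True
print(meets_min_version("4.0.28"))  # False
```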
## Deployment
- Clone the <https://github.com/EIDA/ws-availability> repository and go to its root.
- Copy `config.py.sample` to `config.py` and adjust it as needed (please notice there are two sections, `RUNMODE == "production"` and `RUNMODE == "test"`; for Docker deployment use the `production` section):

  ```python
  # WFCatalog MongoDB
  MONGODB_HOST = "localhost"  # MongoDB host
  MONGODB_PORT = 27017  # MongoDB port
  MONGODB_USR = ""  # MongoDB user
  MONGODB_PWD = ""  # MongoDB password
  MONGODB_NAME = "wfrepo"  # MongoDB database name

  # FDSNWS-Station endpoint to harvest restriction information from
  FDSNWS_STATION_URL = "https://orfeus-eu.org/fdsnws/station/1/query"

  CACHE_HOST = "localhost"  # Cache host
  CACHE_PORT = 6379  # Cache port
  CACHE_INVENTORY_KEY = "inventory"  # Cache key for restriction information
  CACHE_INVENTORY_PERIOD = 0  # Cache invalidation period for `inventory` key; 0 = never invalidate
  CACHE_RESP_PERIOD = 1200  # Cache invalidation period for API responses
  ```
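To illustrate the cache invalidation settings: a period of `0` means "never invalidate" (no expiry), while a positive period maps naturally onto redis-py's `ex` (expire after N seconds) argument. This is a hypothetical sketch of how a cacher might apply these config values, not the repository's actual code:

```python
def ttl_kwargs(period_seconds: int) -> dict:
    """Translate a cache invalidation period into redis-py SET kwargs.

    0 means "never invalidate", so no expiry is set; any positive value
    becomes an `ex` (expire after N seconds) keyword argument.
    """
    if period_seconds < 0:
        raise ValueError("period must be >= 0")
    return {} if period_seconds == 0 else {"ex": period_seconds}

# Hypothetical usage:
# r.set(CACHE_INVENTORY_KEY, payload, **ttl_kwargs(CACHE_INVENTORY_PERIOD))
print(ttl_kwargs(0))     # {}
print(ttl_kwargs(1200))  # {'ex': 1200}
```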
- Build the containers:

  ```
  docker-compose -p 'fdsnws-availability' up -d --no-deps --build
  ```
When the Docker stack is deployed, you will see 3 containers running:
```
$ docker ps
CONTAINER ID   IMAGE                        COMMAND                  CREATED          STATUS         PORTS                      NAMES
4e3dace01fb0   fdsnws-availability_api      "/bin/bash -c 'gunic…"   10 seconds ago   Up 5 seconds   0.0.0.0:9001->9001/tcp     fdsnws-availability-api
3c91e0d1c5e6   fdsnws-availability_cacher   "/bin/bash -c 'pytho…"   10 seconds ago   Up 5 seconds   0.0.0.0:11211->11211/tcp   fdsnws-availability-cacher
d983e64d64a8   redis:7.0-alpine             "docker-entrypoint.s…"   10 seconds ago   Up 5 seconds   0.0.0.0:6379->6379/tcp     fdsnws-availability-cache
```
You can follow the `fdsnws-availability-cacher` container to see the status of restriction information harvesting:

```
$ docker logs --follow fdsnws-availability-cacher
[2023-01-11 09:47:38 +0000] [0] [INFO] Getting inventory from FDSNWS-Station...
[2023-01-11 09:47:39 +0000] [0] [INFO] Harvesting 33 from https://orfeus-eu.org/fdsnws/station/1/query?level=network: 2M,3T,6A...
#...
[2023-02-15 08:31:56 +0000] [0] [INFO] Completed caching inventory from FDSNWS-Station
```
Once `fdsnws-availability-cacher` has completed, it will go down. Harvested information is stored in the Redis DB served by the `fdsnws-availability-cache` container. To rebuild the cache, simply restart the container using:

```
docker start fdsnws-availability-cacher
```
To automate the cache rebuilding process, add the following line to cron:

```
# Rebuild FDSNWS-Availability restriction information cache daily at 3:00 AM
0 3 * * * docker restart fdsnws-availability-cacher
```
It will harvest and overwrite the restriction information stored in the Redis instance.
## Materialized view
### Initial build
When the stack is initially deployed, the materialized view is not yet in place. To build it, issue the following command:
```
# Script started on 2023-02-24
$ mongosh -u USER -p PASSWORD --authenticationDatabase wfrepo --eval "daysBack=365" views/main.js
Processing WFCatalog entries using networks: '^.*$', stations: '^.*$', start: '2022-03-24', end: '2023-03-24'
completed!
```
It will go through the documents in `daily_streams` and `c_segments` from the last year, extract availability information, and store it in the `availability` materialized view.
### Daily append
To automate appending availability information, add the following line to cron:

```
0 6 * * * cd ~/ws-availability/views && mongosh -u USERNAME -p PASSWORD --authenticationDatabase wfrepo main.js > /dev/null 2>&1
```
It will go through the documents in `daily_streams` and `c_segments` from the last day, extract availability information, and append it to the `availability` materialized view. If no additional parameters are provided, the script processes data from the last day:

```
# Script started on 2023-02-24
$ mongosh -u USERNAME -p PASSWORD --authenticationDatabase wfrepo main.js
Processing WFCatalog entries using networks: '^.*$', stations: '^.*$', start: '2023-03-23', end: '2023-03-24'
completed!
```
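The default processing window can be sketched in Python. This mirrors the dates visible in the log output (a one-day window ending today when no parameters are given, `daysBack` days otherwise); the actual logic lives in `views/main.js`:

```python
from datetime import date, timedelta

def default_window(today, days_back=1):
    """Processing window from `days_back` days ago until today,
    as ISO date strings, matching the dates seen in main.js logs."""
    start = today - timedelta(days=days_back)
    return start.isoformat(), today.isoformat()

print(default_window(date(2023, 3, 24)))     # ('2023-03-23', '2023-03-24')
print(default_window(date(2023, 3, 24), 7))  # ('2023-03-17', '2023-03-24')
```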
### Back-processing
Processing can also be executed on a predefined subset of data using the `networks`, `stations`, `start`, and `end` parameters.

```
# Last week
$ mongosh -u USERNAME -p PASSWORD --authenticationDatabase wfrepo --eval "daysBack=7;" main.js
Processing WFCatalog entries using networks: '^.*$', stations: '^.*$', start: '2023-03-17', end: '2023-03-24'
completed!

# January 2023
$ mongosh -u USERNAME -p PASSWORD --authenticationDatabase wfrepo --eval "start='2023-01-01'; end='2023-01-31'" main.js
Processing WFCatalog entries using networks: '^.*$', stations: '^.*$', start: '2023-01-01', end: '2023-01-31'
completed!

# NL.HGN data between December 2022 and January 2023
$ mongosh -u USERNAME -p PASSWORD --authenticationDatabase wfrepo --eval "networks='NL'; stations='HGN'; start='2022-12-01'; end='2023-01-31'" main.js
Processing WFCatalog entries using networks: '^NL$', stations: '^HGN$', start: '2022-12-01', end: '2023-01-31'
completed!

# You can also use regular expressions for the `networks` and `stations` params
# Please refer to https://www.mongodb.com/docs/manual/reference/operator/query/regex/ for details
# Stations from the NL network matching the `G*4` template with timespan from 2023-03-01 till 2023-03-02
$ mongosh -u USERNAME -p PASSWORD --authenticationDatabase wfrepo --eval "networks='NL'; stations='G.*4'; start='2023-03-01'; end='2023-03-02'" main.js
Processing WFCatalog entries using networks: '^NL$', stations: '^G.*4$', start: '2023-03-01', end: '2023-03-02'
completed!

# Stations from the `NL` or `NA` networks with station codes `HGN` or `SABA` and timespan from 2023-03-01 till 2023-03-02
$ mongosh -u USERNAME -p PASSWORD --authenticationDatabase wfrepo --eval "networks='NL|NA'; stations='HGN|SABA'; start='2023-03-01'; end='2023-03-02'" main.js
Processing WFCatalog entries using networks: '^NL|NA$', stations: '^HGN|SABA$', start: '2023-03-01', end: '2023-03-02'
completed!

# All stations from networks `NL` and `NA` with timespan from 2023-03-01 till 2023-03-02
$ mongosh -u USERNAME -p PASSWORD --authenticationDatabase wfrepo --eval "networks='NL|NA'; start='2023-03-01'; end='2023-03-02'" main.js
Processing WFCatalog entries using networks: '^NL|NA$', stations: '^.*$', start: '2023-03-01', end: '2023-03-02'
completed!
```
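Judging from the logged patterns (`NL|NA` becomes `^NL|NA$`), the script appears to wrap the supplied value in `^...$` anchors. This behaviour is inferred from the log output, not copied from `views/main.js`, and a small sketch highlights a caveat worth knowing about alternation:

```python
import re

def anchor(pattern):
    """Wrap a user-supplied pattern in ^...$ anchors, as the main.js
    log output suggests (inferred, not copied from the script)."""
    return f"^{pattern}$"

print(anchor("G.*4"))   # ^G.*4$
print(anchor("NL|NA"))  # ^NL|NA$

# Caveat: regex alternation binds loosely, so '^NL|NA$' means
# "starts with NL" OR "ends with NA", not "exactly NL or NA".
assert re.search(anchor("NL|NA"), "NLXXX")  # starts with 'NL'
assert re.search(anchor("NL|NA"), "XXXNA")  # ends with 'NA'
# Grouping the alternation, '^(NL|NA)$', matches only the exact codes.
assert not re.search("^(NL|NA)$", "NLXXX")
```

For exact-code matching with alternation, grouping the pattern before anchoring would be the safer form.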
### Indexes
It is highly recommended to create at least the following index on the `availability` materialized view. First, log in to your MongoDB instance using `mongosh`, then execute the following commands:

```
use wfrepo;
db.availability.createIndex({ net: 1, sta: 1, loc: 1, cha: 1, ts: 1, te: 1 })
```
## Validation
Now it is time to check if everything is running (remember to change the `net` query parameter). The API is exposed on port `9001` by default; let's try to get the landing page:

```
$ curl "127.0.0.1:9001"
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8" />
    <meta name="author" content="gempa GmbH" />
    <title>FDSNWS-Availability</title>
  </head>
  <body>
    <h1>FDSNWS Availability Web Service</h1>
    <p>
      The availability web service returns detailed time span information about
      available time series data. Please refer to
      <a href="http://www.fdsn.org/webservices">http://www.fdsn.org/webservice</a>
      for a complete service description.
    </p>
    <h2>Available URLs</h2>
    <ul>
      <li><a href="query">query</a></li>
      <li><a href="extent">extent</a></li>
      <li><a href="version">version</a></li>
      <li><a href="application.wadl">application.wadl</a></li>
    </ul>
  </body>
```
A GET request to the `/extent` method:

```
$ curl "127.0.0.1:9001/extent?net=NA&start=2023-02-01"
#Network Station Location Channel Quality SampleRate Earliest Latest Updated TimeSpans Restriction
NA SABA  BHE D 40.0 2023-02-01T00:00:00.000000Z 2023-02-14T00:00:00.000000Z 2023-02-14T07:41:14Z 1 OPEN
NA SABA  BHN D 40.0 2023-02-01T00:00:00.000000Z 2023-02-14T00:00:00.000000Z 2023-02-14T07:42:07Z 1 OPEN
NA SABA  BHZ D 40.0 2023-02-01T00:00:00.000000Z 2023-02-14T00:00:00.000000Z 2023-02-14T07:41:41Z 1 OPEN
# ...
```
A GET request to the `/query` method:

```
$ curl "127.0.0.1:9001/query?net=NA&start=2023-02-01"
#Network Station Location Channel Quality SampleRate Earliest Latest
NA SABA  BHE D 40.0 2023-02-01T00:00:00.000000Z 2023-02-02T00:00:00.000000Z
NA SABA  BHE D 40.0 2023-02-02T00:00:00.000000Z 2023-02-03T00:00:00.000000Z
NA SABA  BHE D 40.0 2023-02-03T00:00:00.000000Z 2023-02-04T00:00:00.000000Z
NA SABA  BHE D 40.0 2023-02-04T00:00:00.000000Z 2023-02-05T00:00:00.000000Z
NA SABA  BHE D 40.0 2023-02-05T00:00:00.000000Z 2023-02-06T00:00:00.000000Z
NA SABA  BHE D 40.0 2023-02-06T00:00:00.000000Z 2023-02-07T00:00:00.000000Z
NA SABA  BHE D 40.0 2023-02-07T00:00:00.000000Z 2023-02-08T00:00:00.000000Z
NA SABA  BHE D 40.0 2023-02-08T00:00:00.000000Z 2023-02-09T00:00:00.000000Z
NA SABA  BHE D 40.0 2023-02-09T00:00:00.000000Z 2023-02-10T00:00:00.000000Z
NA SABA  BHE D 40.0 2023-02-10T00:00:00.000000Z 2023-02-11T00:00:00.000000Z
NA SABA  BHE D 40.0 2023-02-11T00:00:00.000000Z 2023-02-12T00:00:00.000000Z
NA SABA  BHE D 40.0 2023-02-12T00:00:00.000000Z 2023-02-13T00:00:00.000000Z
NA SABA  BHE D 40.0 2023-02-13T00:00:00.000000Z 2023-02-14T00:00:00.000000Z
```
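The plain-text responses are straightforward to post-process. The parser below is a minimal sketch using the `#`-prefixed header line for field names; it assumes every column (including Location, shown here as a hypothetical `00`) is populated, so rows with a blank location code would need fixed-width handling instead:

```python
def parse_availability(text):
    """Parse FDSNWS-Availability text output into a list of dicts,
    mapping the '#'-prefixed header fields onto each data row."""
    lines = [line for line in text.strip().splitlines() if line.strip()]
    fields = lines[0].lstrip("#").split()
    return [dict(zip(fields, line.split())) for line in lines[1:]]

sample = """\
#Network Station Location Channel Quality SampleRate Earliest Latest
NA SABA 00 BHE D 40.0 2023-02-01T00:00:00.000000Z 2023-02-02T00:00:00.000000Z
NA SABA 00 BHN D 40.0 2023-02-01T00:00:00.000000Z 2023-02-02T00:00:00.000000Z
"""

rows = parse_availability(sample)
print(len(rows))           # 2
print(rows[0]["Channel"])  # BHE
```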
## Reverse proxy example config
An example of Apache reverse proxy config:
```
# FDSNWS-Availability (Docker)
<Location /fdsnws/availability/1>
    # in order to omit CORS error
    Header add Access-Control-Allow-Origin "*"
</Location>
ProxyPass /fdsnws/availability/1 <HOST>:9001 timeout=600
ProxyPassReverse /fdsnws/availability/1 <HOST>:9001 timeout=600
```
## Performance Tuning
### Gunicorn Workers Configuration
The number of Gunicorn workers directly affects how many concurrent requests your service can handle. The default configuration uses 1 worker for maximum stability on resource-constrained servers.
#### Current Configuration (docker-compose.yml)

```yaml
command: gunicorn --bind 0.0.0.0:9001 --workers 1 start:app
```
#### Adjusting Worker Count
For servers with limited resources or thread creation issues:

```yaml
# Minimum configuration (most stable)
command: gunicorn --bind 0.0.0.0:9001 --workers 1 --timeout 600 start:app
```
For servers with moderate resources:

```yaml
# 2-3 workers (recommended for most deployments)
command: gunicorn --bind 0.0.0.0:9001 --workers 2 --timeout 600 start:app
```
For high-performance servers:

```yaml
# Formula: (2 × CPU cores) + 1
# Example for 4-core server: --workers 9
command: gunicorn --bind 0.0.0.0:9001 --workers 4 --timeout 600 start:app
```
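The sizing rule quoted in the comment above can be expressed as a one-liner; note that this section deliberately recommends staying well below this figure on resource-constrained servers:

```python
import os

def suggested_workers(cpu_cores=None):
    """Gunicorn's commonly cited sizing rule: (2 × cores) + 1.
    Falls back to os.cpu_count() when no core count is given."""
    cores = cpu_cores if cpu_cores is not None else (os.cpu_count() or 1)
    return 2 * cores + 1

print(suggested_workers(4))  # 9 (matches the 4-core example above)
print(suggested_workers(1))  # 3
```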
#### Important Notes
- Each worker is a separate process with its own memory footprint.
- More workers ≠ always better: too many workers can exhaust system resources.
- Monitor for errors after increasing workers:

  ```
  docker logs -f fdsnws-availability-api
  # Watch for "pthread_create failed" or similar errors
  ```

- Resource usage check:

  ```
  docker stats fdsnws-availability-api
  # If CPU < 80% and memory is available, you can add more workers
  ```
### MongoDB Connection Pool

The MongoDB connection pool is configured in `apps/wfcatalog_client.py`:

```python
maxPoolSize=1  # Connections per worker
```
#### How It Works
- Each Gunicorn worker has its own MongoDB client
- Total connections = workers × maxPoolSize
- Example: 2 workers × 1 pool = 2 total MongoDB connections
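This arithmetic can be captured in a purely illustrative helper, useful when checking the total against your MongoDB server's connection limits:

```python
def total_mongo_connections(workers, max_pool_size=1):
    """Upper bound on server-side connections: each Gunicorn worker is a
    separate process with its own MongoClient, so totals multiply."""
    return workers * max_pool_size

print(total_mongo_connections(2, 1))  # 2
print(total_mongo_connections(2, 5))  # 10
```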
#### When to Adjust
Keep `maxPoolSize=1` if:
- ✅ Using sync workers (default Gunicorn configuration)
- ✅ Each worker handles one request at a time
- ✅ Server has resource constraints
Increase `maxPoolSize` only if:
- Using async workers (gevent/eventlet)
- Using threading within workers
- MongoDB is a bottleneck (check with profiling)
#### Example Configurations
| Workers | maxPoolSize | Total Connections | Use Case |
|---|---|---|---|
| 1 | 1 | 1 | Minimal (default) |
| 2 | 1 | 2 | Recommended |
| 4 | 1 | 4 | High performance |
| 2 | 5 | 10 | Async workers |
### Thread Limiting (Important!)

The configuration includes thread limits to prevent `pthread_create failed` errors on restricted servers:
```yaml
environment:
  OPENBLAS_NUM_THREADS: 1
  MKL_NUM_THREADS: 1
  NUMEXPR_NUM_THREADS: 1
  OMP_NUM_THREADS: 1
```
Do not remove these unless you are certain your server can handle multiple threads per process. These settings prevent NumPy/ObsPy from spawning excessive threads.
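When running outside Docker, the same limits can be applied in-process, provided they are set before NumPy/ObsPy are imported (BLAS thread pools are sized at import time, so exporting them afterwards has no effect). A sketch:

```python
import os

# Must run before importing numpy/obspy: thread pools are sized at import time.
THREAD_LIMIT_VARS = (
    "OPENBLAS_NUM_THREADS",
    "MKL_NUM_THREADS",
    "NUMEXPR_NUM_THREADS",
    "OMP_NUM_THREADS",
)
for var in THREAD_LIMIT_VARS:
    os.environ[var] = "1"  # mirrors the docker-compose environment block

print(os.environ["OMP_NUM_THREADS"])  # 1
```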
### Troubleshooting
**Problem:** Service crashes with "pthread_create failed"

- **Solution:** Reduce workers to 1 and keep the thread limits in place.

**Problem:** Slow response times under load

- **Solution:** Increase workers (if resources allow) and monitor with `docker stats`.

**Problem:** High memory usage

- **Solution:** Reduce workers; check for memory leaks with profiling.

**Problem:** MongoDB connection errors

- **Solution:** Check total connections (workers × maxPoolSize) against MongoDB limits.
### Performance Monitoring

See `tests/performance/` for profiling and benchmarking tools:
```
# Quick performance test
bash tests/performance/quick_test.sh

# Detailed profiling
python tests/performance/profiler.py

# Load testing
locust -f tests/performance/locustfile.py --host=http://localhost:9001
```
For more details, see the Performance Analysis Plan.
## Running in development environment
- Go to the root directory.
- Copy `config.py.sample` to `config.py` and adjust it as needed.
- Create the virtual environment:

  ```
  python3 -m venv env
  ```

- Activate the virtual environment:

  ```
  source env/bin/activate
  ```

- Install the dependencies:

  ```
  pip install -r requirements.txt
  ```

- Create a Redis instance (mandatory for WFCatalog-based deployment):

  ```
  docker run -p 6379:6379 --name cache -d redis:7.0-alpine redis-server --save 20 1 --loglevel warning
  ```

- Build the cache:

  ```
  python3 cache.py
  ```

- Now you can either:
  - Run it:

    ```
    RUNMODE=test FLASK_APP=start.py flask run
    # Or with gunicorn:
    RUNMODE=test gunicorn --workers 2 --timeout 60 --bind 0.0.0.0:9001 start:app
    ```

  - Debug it in VS Code (F5) after selecting the "Launch (Flask)" config.
## RUNMODE builtin values

- `production`
- `test`
## Tests

Tests can be executed from the repository root using the following command:

```
PYTHONPATH=./apps/ python3 -m unittest discover tests/
```
## Ideas for improvements

- Move restriction information from the Redis cache directly into the `db.availability` materialized view. This would imply modifying the `views/main.js` script with code harvesting this information directly from the FDSNWS-Station instance.
- Modify the underlying RESIF code from logic based on a list of arrays to a list of objects/dicts, which is the native MongoDB response, to prevent the object/dict to array casting.
## References

This repository has been forked from [gitlab.com/resif/ws-availability](https://gitlab.com/resif/ws-availability); special thanks to our colleagues at RESIF for sharing their implementation of the FDSNWS-Availability web service. 💐