Pandas reader for the BUFR format using ecCodes.

Features with development status Alpha:

  • extracts observations from a BUFR file as a Pandas DataFrame,

  • reads BUFR 3 and 4 files with uncompressed and compressed subsets,

  • supports all modern versions of Python: 3.7, 3.6, 3.5 and PyPy3,

  • works on Linux, macOS and Windows; the ecCodes C library is the only binary dependency,

  • PyPI package with no install time build (binds via CFFI ABI mode).

Limitations:

  • no special handling of nodata values (yet),

  • no conda-forge package (yet),

  • filters only match exact values.
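
Because filters only match exact values, a range query has to be applied after loading. The following is a minimal sketch of that idea in plain pandas. It uses a hand-built DataFrame as a stand-in for the result of pdbufr.read_bufr, and the values and the pressure bounds are illustrative assumptions, not data read from a BUFR file.

```python
import pandas as pd

# Hand-built stand-in for a DataFrame returned by pdbufr.read_bufr;
# the values are illustrative, not read from a real BUFR file.
df = pd.DataFrame(
    {
        "stationNumber": [823, 823, 823, 9, 9],
        "pressure": [100000.0, 92500.0, 50000.0, 2790.0, 2000.0],
        "airTemperature": [270.1, 255.3, 240.7, 206.3, 203.1],
    }
)

# Range selection done in pandas, since read_bufr filters match exact values only.
# Series.between is inclusive on both ends by default.
mid_levels = df[df["pressure"].between(40000.0, 95000.0)]
print(mid_levels)
```

Selecting with exact-value filters first (for example on stationNumber) and narrowing by range afterwards keeps the amount of data held in memory small.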

Installation

The easiest way to install pdbufr's binary dependencies is via Conda:

$ conda install -c conda-forge eccodes

and pdbufr itself as a Python package from PyPI with:

$ pip install pdbufr

System dependencies

The Python module depends on the ECMWF ecCodes library that must be installed on the system and accessible as a shared library. Some Linux distributions ship a binary version that may be installed with the standard package manager. On Ubuntu 18.04 use the command:

$ sudo apt-get install libeccodes0

On macOS with Homebrew use:

$ brew install eccodes

As an alternative you may install the official source distribution by following the instructions at https://software.ecmwf.int/wiki/display/ECC/ecCodes+installation

You may run a simple selfcheck command to ensure that your system is set up correctly:

$ python -m pdbufr selfcheck
Found: ecCodes v2.13.1.
Your system is ready.
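
If the selfcheck fails, it can help to see whether the ecCodes shared library is visible on the default search path at all. The sketch below uses the standard library's ctypes.util.find_library for that. It assumes the library is registered under the name "eccodes", and it is not necessarily the lookup that pdbufr itself performs.

```python
import ctypes.util

# find_library returns a library name or path as a string,
# or None when nothing is found.
# Assumption: the library is registered as "eccodes"; pdbufr's own
# lookup may differ.
eccodes_lib = ctypes.util.find_library("eccodes")

if eccodes_lib is None:
    print("ecCodes shared library not found on the default search path")
else:
    print(f"ecCodes shared library found: {eccodes_lib}")
```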

Usage

First, you need a well-formed BUFR file. If you don’t have one at hand, you can download our sample file:

$ wget http://download.ecmwf.int/test-data/metview/gallery/temp.bufr

You can explore the file with ecCodes command line tools bufr_ls and bufr_dump to understand the structure and the keys/values you can use to select the observations you are interested in.

The pdbufr.read_bufr function returns a pandas.DataFrame with the requested columns. It accepts query filters on the BUFR message header, which are very fast, and query filters on the observation keys. A filter matches either a single value or any one value in a list, and multiple filters are always combined with a logical AND:

>>> import pdbufr
>>> df_all = pdbufr.read_bufr('temp.bufr', columns=('stationNumber', 'latitude', 'longitude'))
>>> df_all.head()
   stationNumber  latitude  longitude
0            907     58.47     -78.08
1            823     53.75     -73.67
2              9    -90.00       0.00
3            486     18.43     -69.88
4            165     21.98    -159.33

>>> df_one = pdbufr.read_bufr(
...     'temp.bufr',
...     columns=('stationNumber', 'latitude', 'longitude'),
...     filters={'stationNumber': 907},
... )
>>> df_one.head()
   stationNumber  latitude  longitude
0            907     58.47     -78.08

>>> df_two = pdbufr.read_bufr(
...     'temp.bufr',
...     columns=('stationNumber', 'latitude', 'longitude', 'data_datetime', 'pressure', 'airTemperature'),
...     filters={'stationNumber': [823, 9]},
... )

>>> df_two.head()
   stationNumber  latitude  longitude  pressure  airTemperature       data_datetime
0            823     53.75     -73.67  100000.0  -1.000000e+100 2008-12-08 12:00:00
1            823     53.75     -73.67   97400.0    2.567000e+02 2008-12-08 12:00:00
2            823     53.75     -73.67   93700.0    2.551000e+02 2008-12-08 12:00:00
3            823     53.75     -73.67   92500.0    2.553000e+02 2008-12-08 12:00:00
4            823     53.75     -73.67   90600.0    2.567000e+02 2008-12-08 12:00:00

>>> df_two.tail()
     stationNumber  latitude  longitude  pressure  airTemperature       data_datetime
190              9     51.77      36.17    2990.0  -1.000000e+100 2008-12-08 12:00:00
191              9     51.77      36.17    2790.0    2.063000e+02 2008-12-08 12:00:00
192              9     51.77      36.17    2170.0  -1.000000e+100 2008-12-08 12:00:00
193              9     51.77      36.17    2000.0    2.031000e+02 2008-12-08 12:00:00
194              9     51.77      36.17    1390.0    1.979000e+02 2008-12-08 12:00:00
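
Some airTemperature values in the output above are -1e+100, which appears to be the marker used for missing data; pdbufr does not treat nodata values specially yet. A minimal sketch of one way to handle this in pandas follows. It assumes that -1e+100 is the only marker to deal with and uses a hand-built DataFrame with a few rows copied from the df_two output rather than reading the file.

```python
import numpy as np
import pandas as pd

# Assumption: -1e+100 is the missing-value marker seen in the sample output.
MISSING_MARKER = -1e100

# Hand-built rows copied from the df_two output shown above.
df = pd.DataFrame(
    {
        "stationNumber": [823, 823, 9, 9],
        "pressure": [100000.0, 97400.0, 2990.0, 2790.0],
        "airTemperature": [-1e100, 256.7, -1e100, 206.3],
    }
)

# Replace the marker with NaN so that pandas treats it as missing data.
cleaned = df.replace(MISSING_MARKER, np.nan)

# Aggregations such as mean() skip NaN by default.
print(cleaned["airTemperature"].mean())
```

Without this step the marker values would dominate any statistic computed on the column.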

Contributing

The main repository is hosted on GitHub. Testing, bug reports and contributions are welcome and appreciated:

https://github.com/ecmwf/pdbufr

Please see the CONTRIBUTING.rst document for the best way to help.

Lead developer:

Main contributors:

See also the list of contributors who participated in this project.

License

Copyright 2019 European Centre for Medium-Range Weather Forecasts (ECMWF).

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at: http://www.apache.org/licenses/LICENSE-2.0. Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
