
pdal-piper

Type stubs and utilities for PDAL (Point Data Abstraction Library) and USGS 3DEP lidar download.

Overview

Type stubs for PDAL: Adds support for IntelliSense and inline documentation to the pdal-python package for better IDE support when developing pdal pipelines in PyCharm, VSCode, etc. After installing pdal-piper, your IDE should be able to recognize objects like pdal.Reader.copc() and provide descriptions of the input parameters.

USGS 3DEP utilities: Search the current USGS 3DEP airborne lidar catalog and find URLs for Entwine Point Tiles that overlap a search area.

Parallel processing: Utilities for slicing search areas into tiles and processing them in parallel.

Note: the original version of this package included its own data structures for pipelines and stages. The revised version is designed around the built-in pdal-python pipeline and stage objects.

Installation

Basic Install:

conda install -c conda-forge pdal pdal-python gdal geopandas 
pip install pdal-piper
pdal-piper-setup # this installs the pipeline.pyi file, see below

It is strongly recommended that you make use of Conda’s environment management system and install PDAL in a separate environment (i.e., not the base environment). Instructions can be found on the Conda website.

Intellisense and inline documentation support is enabled by inserting the file pipeline.pyi into the pdal-python install directory (see "stub files", PEP 484). The pipeline.pyi file will be generated and inserted automatically by running pdal-piper-setup, otherwise the file will be created on first import of pdal-piper. If needed, you can regenerate pipeline.pyi by running pdal-piper-setup or pdal_piper.skeletons.generate_skeletons(). You may also need to restart your IDE and/or regenerate indexes or clear the local cache.
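To illustrate what the generated stub enables, here is a hand-written sketch of the kind of entry pipeline.pyi contains. The real stub is generated from your installed PDAL version; the class layout and parameter list below are abbreviated and illustrative, not the actual generated signatures.

```python
# A hand-written sketch of the kind of entry the generated pipeline.pyi
# contains. The real stub is generated by pdal-piper-setup from your
# installed PDAL version; parameters here are abbreviated examples.
class Stage: ...

class Reader:
    @staticmethod
    def copc(
        filename: str,
        bounds: str = ...,    # clip window, e.g. "([xmin, xmax], [ymin, ymax])"
        requests: int = ...,  # number of parallel requests
        **kwargs,
    ) -> Stage: ...
```

With such a stub in place, typing `pdal.Reader.copc(` in your IDE surfaces the parameter names and documentation instead of an opaque `**kwargs`.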

Example

In this example, we will find public lidar data on an online server, download the data, clean it, compute canopy height statistics, and write files locally.

Find point cloud data

First we need to get some data to work with. I will show one method to pull data from an online server. First, we must define an area of interest using a bounding box [xmin, ymin, xmax, ymax].

In the first cell, I demonstrate how you can extract a bounding box from an interactive map using ipyleaflet (conda install ipyleaflet). Alternatively, you can skip this step and input a bounding box manually.

import ipyleaflet
import numpy as np

basemap = ipyleaflet.TileLayer(url='https://services.arcgisonline.com/arcgis/rest/services/World_Imagery/MapServer/tile/{z}/{y}/{x}')
m = ipyleaflet.Map(center=[39, -100], zoom=5, scroll_wheel_zoom=True, basemap=basemap)
m.add(ipyleaflet.WMSLayer(url='https://index.nationalmap.gov:443/arcgis/services/3DEPElevationIndex/MapServer/WmsServer?',
                          layers='23', opacity=0.5, name='USGS 3DEP overlay'))
m.add(ipyleaflet.LayersControl())
bbox = None
def handle_draw(target, action, geo_json):
    # Store the drawn rectangle as [xmin, ymin, xmax, ymax];
    # min/max makes this robust to the ring's vertex order.
    global bbox
    coords = geo_json['geometry']['coordinates'][0]
    xs = [c[0] for c in coords]
    ys = [c[1] for c in coords]
    bbox = [min(xs), min(ys), max(xs), max(ys)]
draw_control = ipyleaflet.DrawControl(rectangle={'shapeOptions': {'color': '#0000FF'}},
    polyline={}, polygon={}, circle={}, circlemarker={}, marker={}
)
draw_control.on_draw(handle_draw)
m.add_control(draw_control)
m

example_interactive_map.png

# Print bounding box selected in interactive map
bbox

# If you want to manually input a bounding box, uncomment the line below and edit the values
#bbox = [-111.676326, 35.316211, -111.671391, 35.320098]
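The bbox extraction in handle_draw can be exercised outside a notebook. Given a GeoJSON rectangle (the one below is hypothetical, shaped like what ipyleaflet's DrawControl emits), taking min/max over the exterior ring yields the [xmin, ymin, xmax, ymax] box regardless of vertex order:

```python
# Hypothetical GeoJSON rectangle, shaped like ipyleaflet's DrawControl output.
geo_json = {
    "geometry": {
        "type": "Polygon",
        "coordinates": [[
            [-111.676326, 35.316211],
            [-111.671391, 35.316211],
            [-111.671391, 35.320098],
            [-111.676326, 35.320098],
            [-111.676326, 35.316211],
        ]],
    }
}

coords = geo_json["geometry"]["coordinates"][0]
xs = [x for x, y in coords]
ys = [y for x, y in coords]
bbox = [min(xs), min(ys), max(xs), max(ys)]
# bbox == [-111.676326, 35.316211, -111.671391, 35.320098]
```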

Next, we can search the USGS 3DEP catalog for publicly available point clouds that overlap our area of interest using pdal_piper.USGS_3dep_Finder. USGS 3DEP data are stored in Entwine Point Tile (EPT) format, which means we can efficiently download small segments of a point cloud through a URL.

import pdal_piper
finder = pdal_piper.USGS_3dep_Finder()
finder.search_3dep(bbox,'EPSG:4326')
finder.search_result
name id pct_coverage pts_per_m2 count total_area_ha url geometry
120 AZ_Coconino_B1_2019 120 100.0 15.372670 55223690056 359232.920560 https://s3-us-west-2.amazonaws.com/usgs-lidar-... POLYGON ((-111.67633 35.3201, -111.67139 35.32...
1253 USGS_LPC_AZ_VerdeKaibab_B2_2018_LAS_2019 1253 100.0 5.324541 35728383864 671013.439139 https://s3-us-west-2.amazonaws.com/usgs-lidar-... POLYGON ((-111.67633 35.3201, -111.67139 35.32...
# Here we select the URL for the dataset in the first row. 
# Alternatively, we could use a loop and download all of the available datasets.
url = finder.select_url(0)
url
'https://s3-us-west-2.amazonaws.com/usgs-lidar-public/AZ_Coconino_B1_2019/ept.json'
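The URL points at the dataset's ept.json metadata document, which describes the point cloud's extent and octree layout; EPT readers use it to fetch only the parts that intersect a query window. A hypothetical excerpt shows the kinds of fields involved (all values invented for illustration):

```python
# Hypothetical excerpt of an ept.json metadata document (real files carry
# more fields). EPT readers use "bounds" plus the octree hierarchy to
# download only the nodes intersecting a query window.
ept_meta = {
    "bounds": [435000, 3905000, 2000, 440000, 3910000, 2500],  # xmin, ymin, zmin, xmax, ymax, zmax
    "points": 55223690056,
    "srs": {"authority": "EPSG", "horizontal": "3857"},
    "span": 256,  # voxel resolution of each octree node
}

xmin, ymin, zmin, xmax, ymax, zmax = ept_meta["bounds"]
area_m2 = (xmax - xmin) * (ymax - ymin)
```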

Define tile set

To improve computational efficiency and scalability, we can divide our area of interest into a set of tiles using a Tiler object. We specify the total extent of the tile set and the size of each tile. Notice that our extents are defined in geographic coordinates (degrees of latitude/longitude) while the tile size is given in meters, so we set convert_units=True. get_tiles() provides some options for formatting the tiles. We select the first tile, from the upper-left corner, as a test.
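The basic slicing idea can be sketched in pure Python. This is a simplified illustration, not Tiler's implementation: it ignores the unit conversion, buffering, and CRS handling that Tiler does for you.

```python
import math

def slice_extents(xmin, ymin, xmax, ymax, tile_size):
    """Divide a bounding box into a grid of tiles at most tile_size on a
    side (same units as the extents). Returns a nested list of
    (xmin, ymin, xmax, ymax) tuples. A sketch only, not Tiler itself."""
    nx = math.ceil((xmax - xmin) / tile_size)
    ny = math.ceil((ymax - ymin) / tile_size)
    tiles = []
    for i in range(nx):
        row = []
        x0 = xmin + i * tile_size
        for j in range(ny):
            y0 = ymin + j * tile_size
            # Edge tiles are clipped to the overall extents.
            row.append((x0, y0, min(x0 + tile_size, xmax), min(y0 + tile_size, ymax)))
        tiles.append(row)
    return tiles

tiles = slice_extents(0, 0, 350, 350, 100)  # a 4 x 4 grid of 100-unit tiles
```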

tiler = pdal_piper.Tiler(extents = bbox, tile_size=100, buffer=0, convert_units=True, crs='EPSG:4326')
tile_bounds = tiler.get_tiles(format_as_pdal_str=True,flatten=False)
print(type(tile_bounds))
print(tile_bounds.shape)
<class 'numpy.ndarray'>
(4, 4)
first_tile_bounds = tile_bounds[0,0]
first_tile_bounds
'([-111.676326, -111.67522509346652], [35.3191979991, 35.320098], [-9999, 9999])/EPSG:4326'
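The string above follows PDAL's bounds syntax: ([xmin, xmax], [ymin, ymax], [zmin, zmax]) with an optional /CRS suffix. A small helper (hypothetical, not part of pdal-piper's API) shows the conversion from a [xmin, ymin, xmax, ymax] box:

```python
def to_pdal_bounds(bbox, crs=None, zmin=-9999, zmax=9999):
    """Format [xmin, ymin, xmax, ymax] as a PDAL bounds string.
    Hypothetical helper for illustration; pdal-piper produces these
    strings for you via Tiler.get_tiles(format_as_pdal_str=True)."""
    xmin, ymin, xmax, ymax = bbox
    s = f"([{xmin}, {xmax}], [{ymin}, {ymax}], [{zmin}, {zmax}])"
    return f"{s}/{crs}" if crs else s

to_pdal_bounds([-111.676326, 35.316211, -111.671391, 35.320098], crs="EPSG:4326")
# -> '([-111.676326, -111.671391], [35.316211, 35.320098], [-9999, 9999])/EPSG:4326'
```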

Define processing pipeline

We need to create a processing pipeline that defines all the actions we want PDAL to execute. Each action in the pipeline is described by a 'stage'. In other workflows, the stages are combined into a JSON document, stored as a text file, and run through PDAL's command-line interface. In contrast, pdal-piper makes the experience more Pythonic by providing a Python class with built-in documentation for each stage. We use these classes to define each stage, combine the stages in a list, and pass the list to pdal.Pipeline, which formats the JSON text and passes it to PDAL for execution.
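Under the hood, a pipeline is just an ordered list of stage dictionaries serialized to JSON. The equivalent structure can be built by hand; this is a sketch of the JSON shape only (the URL is a placeholder), not how pdal-python constructs pipelines internally:

```python
import json

# A minimal pipeline expressed directly as JSON-style stage dicts.
# Each dict's "type" names the PDAL stage; the remaining keys are options.
stages = [
    {"type": "readers.ept",
     "filename": "https://example.com/ept.json",  # placeholder URL
     "bounds": "([0, 100], [0, 100])"},
    {"type": "filters.range", "limits": "Classification[0:6]"},
    {"type": "writers.copc", "filename": "out.laz"},
]
pipeline_json = json.dumps(stages)
```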

import pdal
import numpy as np

# Define processing pipeline for the first tile
pipelines = []

for xi, yi in np.ndindex(tile_bounds.shape):
    stages = [
        pdal.Reader.ept(filename=url, bounds=tile_bounds[xi, yi]),
        pdal.Filter.outlier(method='statistical', mean_k=12, multiplier=2.2),
        pdal.Filter.range(limits='Classification[0:6]'),
        pdal.Filter.hag_delaunay(),
        pdal.Writer.copc(filename=f'test_data/points_{xi}_{yi}.laz', extra_dims='all'),
        pdal.Writer.gdal(filename='test_data/canopy_metrics.tif', resolution=1,
                         dimension='HeightAboveGround', output_type='all', binmode=True)
    ]
    pipelines.append(pdal.Pipeline(stages))

# View pipeline for first tile in json formatting
pipelines[0].toJSON()
'[{"type": "readers.ept", "bounds": "([-111.676326, -111.67522509346652], [35.3191979991, 35.320098], [-9999, 9999])/EPSG:4326", "filename": "https://s3-us-west-2.amazonaws.com/usgs-lidar-public/AZ_Coconino_B1_2019/ept.json", "tag": "readers_ept1"}, {"type": "filters.outlier", "method": "statistical", "mean_k": 12, "multiplier": 2.2, "tag": "filters_outlier1"}, {"type": "filters.range", "limits": "Classification[0:6]", "tag": "filters_range1"}, {"type": "filters.hag_delaunay", "tag": "filters_hag_delaunay1"}, {"type": "writers.copc", "extra_dims": "all", "filename": "test_data/points_0_0.laz", "tag": "writers_copc1"}, {"type": "writers.gdal", "resolution": 1, "dimension": "HeightAboveGround", "output_type": "all", "binmode": true, "filename": "test_data/canopy_metrics.tif", "tag": "writers_gdal1"}]'
# Execute pipeline for first tile as a test
pipelines[0].execute()
pipelines[0].log
# If the log is empty, that is good. Otherwise, errors will show up in the log.
''

Lastly, we can run the pipelines for all tiles in the tile set. Tile bounds in the reader stage will automatically be assigned from the unique tile bounds. Filenames in the writer stages will automatically be made unique by inserting tile indices between the file basename and the file extension. Pipelines are executed in parallel processes.

Note: if Tiler.buffer > 0 and the filters.crop stage is used in the pipeline, the reader will automatically use the buffered tile extents and the crop filter the unbuffered tile extents. In this special case, the CRS of the Tiler must match the CRS of the point cloud.

# Execute pipeline for all tiles
logs = pdal_piper.execute_pipelines_parallel(pipelines)
[log for log in logs if log != '']
[]
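The fan-out pattern behind execute_pipelines_parallel can be sketched with concurrent.futures. This demo uses threads and a stand-in worker so it is self-contained; the real function runs separate processes, since pipeline execution is CPU-bound:

```python
from concurrent.futures import ThreadPoolExecutor

def run_one(pipeline):
    # Stand-in worker: a real worker would call pipeline.execute() and
    # return pipeline.log. Here 'pipeline' is just a label, and an empty
    # string mirrors the "empty log means success" convention above.
    return ""

fake_pipelines = [f"tile_{i}" for i in range(16)]
with ThreadPoolExecutor(max_workers=4) as pool:
    logs = list(pool.map(run_one, fake_pipelines))

# [log for log in logs if log != ''] is empty when every tile succeeded
```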

From here, additional analysis can be carried out with your software of choice.

