ssb-sgis

GIS Python tools used in Statistics Norway.

See documentation here.

sgis builds on the geopandas package and provides functions that make it easier to do GIS in Python. Features include network analysis, functions for exploring multiple GeoDataFrames in a layered interactive map, and vector operations such as finding k-nearest neighbours, splitting lines by points, snapping, and closing holes in polygons by size.

To install, use one of:

poetry add ssb-sgis
pip install ssb-sgis

The sgis package has the following optional dependencies:

  • bucket: For working with files stored in GCS buckets
  • torch: Use functionality from PyTorch and torchgeo
  • xarray: Use functionality from xarray and rioxarray
  • test: Packages needed for running pytest
  • all: All optional dependencies

The optional dependencies can be installed by adding them in brackets when installing, like this:

poetry add ssb-sgis[all]
pip install ssb-sgis[all]
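
After installing, one way to see which extras are usable in the current environment is to check whether their modules can be found. The sketch below uses only the standard library. The mapping from extra names to module names is an assumption based on the packages named in the list above, and the `bucket` extra is left out because this README does not name its modules. `missing_modules` is a hypothetical helper, not part of sgis.

```python
from importlib.util import find_spec

# Assumed mapping from extra name to importable top-level modules.
EXTRA_MODULES = {
    "torch": ["torch", "torchgeo"],
    "xarray": ["xarray", "rioxarray"],
    "test": ["pytest"],
}


def missing_modules(modules):
    """Return the names in `modules` that cannot be found for import."""
    return [name for name in modules if find_spec(name) is None]


for extra, modules in EXTRA_MODULES.items():
    missing = missing_modules(modules)
    status = "ok" if not missing else f"missing {', '.join(missing)}"
    print(f"{extra}: {status}")
```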

Network analysis examples

Preparing for network analysis:

import numpy as np
import pandas as pd
import sgis as sg


roads = sg.read_parquet_url(
    "https://media.githubusercontent.com/media/statisticsnorway/ssb-sgis/main/tests/testdata/roads_oslo_2022.parquet"
)

connected_roads = sg.get_connected_components(roads).loc[lambda x: x["connected"] == 1]

directed_roads = sg.make_directed_network_norway(
    connected_roads,
    dropnegative=True,
)

rules = sg.NetworkAnalysisRules(directed=True, weight="minutes")

nwa = sg.NetworkAnalysis(network=directed_roads, rules=rules)

nwa
NetworkAnalysis(
    network=Network(6364 km, percent_bidirectional=87),
    rules=NetworkAnalysisRules(weight=minutes, directed=True, search_tolerance=250, search_factor=0, split_lines=False, ...),
    log=True, detailed_log=False,
)
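
The `connected == 1` filter above keeps only the roads flagged as connected, so that isolated road segments do not end up in the network. As a rough illustration of that idea, and not the sgis implementation, the sketch below assumes the flag marks the largest connected component and reproduces it on a toy edge list in plain Python. `flag_largest_component` is a hypothetical helper.

```python
from collections import defaultdict, deque


def flag_largest_component(edges):
    """Return 1 for edges in the largest connected component, else 0.

    `edges` is a list of (source, target) node pairs, treated as undirected.
    """
    if not edges:
        return []

    neighbours = defaultdict(set)
    for source, target in edges:
        neighbours[source].add(target)
        neighbours[target].add(source)

    component_of = {}
    sizes = []
    for start in neighbours:
        if start in component_of:
            continue
        # Breadth-first search labels every node reachable from `start`.
        label = len(sizes)
        component_of[start] = label
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for nxt in neighbours[node]:
                if nxt not in component_of:
                    component_of[nxt] = label
                    queue.append(nxt)
        sizes.append(size)

    largest = max(range(len(sizes)), key=sizes.__getitem__)
    return [int(component_of[source] == largest) for source, _ in edges]


# Toy network: a three-edge chain plus one isolated edge.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("x", "y")]
print(flag_largest_component(edges))  # [1, 1, 1, 0]
```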

Fast many-to-many travel times/distances.

points = sg.read_parquet_url("https://media.githubusercontent.com/media/statisticsnorway/ssb-sgis/main/tests/testdata/points_oslo.parquet")
od = nwa.od_cost_matrix(points, points)

print(od)
        origin  destination    minutes
0            0            0   0.000000
1            0            1  13.039830
2            0            2  10.902453
3            0            3   8.297021
4            0            4  14.742294
...        ...          ...        ...
999995     999          995  11.038673
999996     999          996  17.820664
999997     999          997  10.288465
999998     999          998  14.798257
999999     999          999   0.000000

[1000000 rows x 3 columns]
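
The result is a long-format table with one row per origin-destination pair, so ordinary pandas operations apply to it. The sketch below uses a small hand-made DataFrame with the same column names (the values are made up) to find the nearest other point for each origin and to reshape the table into a wide matrix.

```python
import pandas as pd

# Made-up stand-in for the od_cost_matrix result (same column names).
od = pd.DataFrame(
    {
        "origin": [0, 0, 0, 1, 1, 1, 2, 2, 2],
        "destination": [0, 1, 2, 0, 1, 2, 0, 1, 2],
        "minutes": [0.0, 13.0, 10.9, 12.5, 0.0, 4.2, 11.1, 4.0, 0.0],
    }
)

# Drop trips from a point to itself, then keep the fastest trip per origin.
other = od[od["origin"] != od["destination"]]
nearest = other.loc[other.groupby("origin")["minutes"].idxmin()]
print(nearest)

# Wide format: one row per origin, one column per destination.
wide = od.pivot(index="origin", columns="destination", values="minutes")
print(wide)
```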

Get number of times each line segment was visited, with optional weighting.

origins = points.iloc[:100]
destinations = points.iloc[100:200]

# creating uniform weights of 10
od_pairs = pd.MultiIndex.from_product([origins.index, destinations.index])
weights = pd.DataFrame(index=od_pairs)
weights["weight"] = 10

frequencies = nwa.get_route_frequencies(origins, destinations, weight_df=weights)

# plot the results
m = sg.ThematicMap(
    sg.buff(frequencies, 15),
    column="frequency",
    black=True,
    cmap="plasma",
    title="Number of times each road was used,\nweighted * 10",
)
m.plot()
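
The weights do not have to be uniform. The sketch below builds the same kind of table as above, indexed by every origin-destination pair, and then changes the weight for a subset of pairs. The index values are made up for illustration; with real data they would come from `origins.index` and `destinations.index`.

```python
import pandas as pd

# Made-up index values standing in for origins.index and destinations.index.
origin_ids = [0, 1]
destination_ids = [100, 101, 102]

# One row per origin-destination pair, starting from a uniform weight.
od_pairs = pd.MultiIndex.from_product([origin_ids, destination_ids])
weights = pd.DataFrame(index=od_pairs)
weights["weight"] = 10

# Give trips that start at origin 1 twice the weight of the others.
from_origin_1 = weights.index.get_level_values(0) == 1
weights.loc[from_origin_1, "weight"] = 20

print(weights)
```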

Get the area that can be reached within one or more breaks.

service_areas = nwa.service_area(
    points.iloc[[0]],
    breaks=np.arange(1, 11),
)

# plot the results
m = sg.ThematicMap(
    service_areas,
    column="minutes",
    black=True,
    size=10,
    k=10,
    title="Roads that can be reached within 1 to 10 minutes",
)
m.plot()
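
Here `np.arange(1, 11)` gives the breaks 1, 2, ..., 10, so the result covers ten cumulative time limits. Conceptually, a service area is the part of the network whose cheapest path cost from the origin stays within a break. The sketch below shows that idea on a toy graph with a cost-limited Dijkstra search in plain Python. It is an illustration under that assumption, not how sgis computes service areas, and `reachable_within` is a hypothetical helper.

```python
import heapq


def reachable_within(graph, start, max_cost):
    """Return {node: cost} for nodes reachable from `start` within `max_cost`.

    `graph` maps each node to a list of (neighbour, cost) tuples.
    """
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry, a cheaper path was already found
        for neighbour, edge_cost in graph.get(node, []):
            new_cost = cost + edge_cost
            if new_cost <= max_cost and new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                heapq.heappush(queue, (new_cost, neighbour))
    return best


# Toy directed network with travel times in minutes.
graph = {
    "a": [("b", 2.0), ("c", 6.0)],
    "b": [("c", 3.0), ("d", 9.0)],
    "c": [("d", 4.0)],
}

for minutes in (3, 5, 10):
    print(minutes, sorted(reachable_within(graph, "a", minutes)))
```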

Get one or more routes per origin-destination pair.

routes = nwa.get_k_routes(
    points.iloc[[0]], points.iloc[[1]], k=4, drop_middle_percent=50
)

m = sg.ThematicMap(
    sg.buff(routes, 15),
    column="k",
    black=True,
    title="Four fastest routes from A to B",
    legend_kwargs=dict(
        title="Rank",
    ),
)
m.plot()

More network analysis examples can be found here: https://github.com/statisticsnorway/ssb-sgis/blob/main/docs/network_analysis_demo_template.md

Road data for Norway can be downloaded here: https://kartkatalog.geonorge.no/metadata/nvdb-ruteplan-nettverksdatasett/8d0f9066-34f9-4423-be12-8e8523089313

Developer information

Git LFS

The data in the testdata directory is stored with Git LFS. Make sure git-lfs is installed and that you have run the command git lfs install at least once. You only need to run this once per user account.

Dependencies

Poetry is used for dependency management. Install poetry and run the command below from the root directory to install the dependencies.

poetry install -E test --no-root

Tests

Use the following command from the root directory to run the tests:

poetry run pytest  # from root directory

Jupyter Notebooks

The files ending with _ipynb.py in the tests directory are Jupyter notebooks stored as plain Python files, using jupytext. To open them as Jupyter notebooks, right-click on them in JupyterLab and select Open With → Notebook.
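
For readers unfamiliar with jupytext, a notebook stored as a script is an ordinary Python file in which comments mark the cell boundaries. The sketch below assumes the "percent" format; which jupytext format the test files actually use is not stated here, and the cell contents are made up.

```python
# %% [markdown]
# # Example notebook
# This comment block becomes a Markdown cell.

# %%
values = [1, 2, 3]
total = sum(values)

# %%
print(total)
```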

When testing locally, start JupyterLab with this command:

poetry run jupyter lab

For VS Code there are extensions for opening a Python script as a Jupyter notebook, for example: Jupytext for Notebooks.

Code quality

Run 'ruff' on all files with safe fixes:

poetry run ruff check --fix .

Formatting

Format the code with black and isort by running the following command from the root directory:

poetry run black .
poetry run isort .

Pre-commit hooks

We are using pre-commit hooks to make sure the code is correctly formatted and consistent before committing. Use the following command from the root directory in the repo to install the pre-commit hooks:

poetry run pre-commit install

It then checks the changed files before committing. You can run the pre-commit checks on all files by using this command:

poetry run pre-commit run --all-files

Documentation

To generate the API documentation locally, run the following command from the root directory:

poetry run sphinx-build -W docs docs/_build

Then open the file docs/_build/index.html.

To check and run the docstring examples, run this command:

poetry run xdoctest --command=all ./src/sgis
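
A docstring example is a `>>>` prompt followed by the expected output. The sketch below shows the shape of such an example on a hypothetical function that is not part of sgis. It is checked here with the standard-library doctest module, which reads the same `>>>` syntax.

```python
import doctest


def minutes_to_hours(minutes):
    """Convert a travel time in minutes to hours.

    Example:
        >>> minutes_to_hours(90)
        1.5
    """
    return minutes / 60


# Prints nothing when the example in the docstring passes.
doctest.run_docstring_examples(
    minutes_to_hours, {"minutes_to_hours": minutes_to_hours}
)
```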

Contributing

Contributions are very welcome. To learn more, see the Contributor Guide.

License

Distributed under the terms of the MIT license, SSB sgis is free and open source software.

Issues

If you encounter any problems, please file an issue along with a detailed description.

Credits

This project was generated from Statistics Norway's SSB PyPI Template.
