dorchester

A tool for making dot-density maps in Python.

Caveat emptor

This is very alpha right now. Use at your own risk and evaluate any editorial usage of this library before publishing.

Installation

Install this tool using pip:

$ pip install dorchester

Usage

The main command is dorchester plot. It takes an input file, an output file, and one or more property keys from which to extract population counts.

dorchester plot --help
Usage: dorchester plot [OPTIONS] SOURCE DEST

  Generate data for a dot-density map. Input may be any GIS format readable
  by Fiona (Shapefile, GeoJSON, etc).

Options:
  -k, --key TEXT                  Property name for a population. Use multiple
                                  to map different population classes.

  -f, --format [csv|geojson|null]
                                  Output format. If not given, will guess
                                  based on output file extension.

  -m, --mode [w|a|x]              File mode for destination  [default: w]
  --fid TEXT                      Use a property key (instead of feature.id)
                                  to uniquely identify each feature

  --coerce                        Coerce properties passed in --key to
                                  integers. BE CAREFUL. This could cause
                                  incorrect results if misused.

  --progress                      Show a progress bar  [default: False]
  -m, --multiprocessing           Use multiprocessing
  --help                          Show this message and exit.

Input can be in any format readable by Fiona, such as Shapefiles and GeoJSON. The input file needs to contain both population data and boundaries. You may need to join different files together before plotting with dorchester.

Output format (--format) can be CSV or GeoJSON (more formats coming soon). For GeoJSON, the output will be a stream of newline-delimited Point features, like this:

{"type": "Feature", "geometry": {"type": "Point", "coordinates": [76, 38]}, "properties": {"group": "population", "fid": 1}}
{"type": "Feature", "geometry": {"type": "Point", "coordinates": [77, 39]}, "properties": {"group": "population", "fid": 1}}
{"type": "Feature", "geometry": {"type": "Point", "coordinates": [78, 37]}, "properties": {"group": "population", "fid": 1}}
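Because each line is a complete GeoJSON Feature, the stream can be read one line at a time without loading the whole file. Here is a minimal sketch using only the standard library, with the three sample lines above standing in for a real output file. The variable names are illustrative and not part of dorchester.

```python
import json
from collections import Counter

# The three sample lines from above, standing in for a real output file.
sample_output = """\
{"type": "Feature", "geometry": {"type": "Point", "coordinates": [76, 38]}, "properties": {"group": "population", "fid": 1}}
{"type": "Feature", "geometry": {"type": "Point", "coordinates": [77, 39]}, "properties": {"group": "population", "fid": 1}}
{"type": "Feature", "geometry": {"type": "Point", "coordinates": [78, 37]}, "properties": {"group": "population", "fid": 1}}
"""

# Tally how many points belong to each group, one line at a time.
points_per_group = Counter()
for line in sample_output.splitlines():
    if not line.strip():
        continue  # tolerate blank lines
    feature = json.loads(line)
    points_per_group[feature["properties"]["group"]] += 1

print(dict(points_per_group))
```

For a real file, iterating over the open file object in place of sample_output.splitlines() keeps memory use flat regardless of file size.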

The output files will be big, because dorchester creates a point for every individual. Massachusetts, for example, had a population of 6.631 million in 2010, and a dot-density CSV file for the state is 6,336,107 lines long and 305 MB.
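To make the point-per-person idea concrete, here is a minimal sketch of one common way to scatter points inside a polygon: rejection sampling within the bounding box. This README does not show dorchester's own sampling code, so treat this as an illustration of the general technique and not as the library's implementation. The function names and the triangle are hypothetical, and the sketch ignores holes and map projections.

```python
import random


def point_in_polygon(x, y, ring):
    """Ray-casting test: True if (x, y) lies inside the ring of (x, y) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def random_points(ring, count, rng):
    """Draw `count` random points inside `ring` by rejection sampling.

    Assumes the ring encloses a non-zero area; otherwise this never finishes.
    """
    xs = [vertex[0] for vertex in ring]
    ys = [vertex[1] for vertex in ring]
    points = []
    while len(points) < count:
        x = rng.uniform(min(xs), max(xs))
        y = rng.uniform(min(ys), max(ys))
        if point_in_polygon(x, y, ring):
            points.append((x, y))
    return points


# A hypothetical triangular "block" with a population of 12.
triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
dots = random_points(triangle, 12, random.Random(0))
print(len(dots))
```

One point per person is why output size scales directly with population.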

Each key (--key) should correspond to a property on each feature whose value is a whole number. In a block like this, use --key POP10 to extract population:

{
  "geometry": {
    "coordinates": [...],
    "type": "Polygon"
  },
  "id": "0",
  "properties": {
    "BLOCKCE": "4023",
    "BLOCKID10": "250010112004023",
    "COUNTYFP10": "001",
    "HOUSING10": 16,
    "PARTFLG": "N",
    "POP10": 12,
    "STATEFP10": "25",
    "TRACTCE10": "011200"
  },
  "type": "Feature"
}

You can pass multiple --key options to create different groups that will be layered together. This is how you would create a map showing different racial groups, for example.
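As a rough illustration of what multiple keys mean for a single feature, the sketch below pulls one count per key from a trimmed copy of the Census block shown earlier. HOUSING10 is used only because it is a second whole-number property in that example; counts_by_group is a hypothetical helper, not a dorchester function, and how dorchester labels groups in its output is not assumed here.

```python
# A trimmed copy of the Census block feature from the example above.
feature = {
    "id": "0",
    "properties": {
        "BLOCKID10": "250010112004023",
        "HOUSING10": 16,
        "POP10": 12,
    },
}


def counts_by_group(feature, keys):
    """Return {key: count} for each requested property key."""
    return {key: feature["properties"][key] for key in keys}


groups = counts_by_group(feature, ["POP10", "HOUSING10"])
total_points = sum(groups.values())
print(groups, total_points)
```

Each extra key adds its own count to the total number of points, so file size grows with every group you map.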

The --mode option controls how the output file is opened:

  • w will create or overwrite the output file
  • a will append to an existing file
  • x will try to create a new file and fail if that file already exists
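These letters match the mode strings of Python's built-in open(). Whether dorchester passes them straight through is an assumption, but the behavior described above can be reproduced with open() directly, as in this sketch that writes to a temporary directory:

```python
import os
import tempfile

# Work in a throwaway directory so nothing real is overwritten.
workdir = tempfile.mkdtemp()
dest = os.path.join(workdir, "points.txt")

# "w" creates the file, or truncates it if it already exists.
with open(dest, "w") as f:
    f.write("first line\n")

# "a" keeps the existing content and adds to the end.
with open(dest, "a") as f:
    f.write("second line\n")

# "x" refuses to touch a file that already exists.
try:
    with open(dest, "x") as f:
        f.write("never written\n")
    exclusive_failed = False
except FileExistsError:
    exclusive_failed = True

with open(dest) as f:
    lines = f.read().splitlines()

print(lines, exclusive_failed)
```

Mode x is the safe choice when a long plotting run must not clobber earlier output.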

Setting --fid will use a property key to identify each feature, instead of the feature's id field (which is often missing, or will be an index number in shapefiles). In the Census block example above, BLOCKID10 will uniquely identify this block, while id: 0 only identifies it as the first feature in its source shapefile.
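The difference can be sketched in a few lines. Here feature_identifier is a hypothetical helper written for illustration, not dorchester's API; it prefers a named property when one is given and otherwise falls back to the feature's own id.

```python
# A trimmed copy of the Census block feature from the example above.
feature = {
    "id": "0",
    "properties": {"BLOCKID10": "250010112004023", "POP10": 12},
}


def feature_identifier(feature, fid=None):
    """Return the property named by `fid` if given, else the feature's id."""
    if fid is not None:
        return feature["properties"][fid]
    return feature.get("id")  # None when the feature has no id at all


print(feature_identifier(feature))
print(feature_identifier(feature, fid="BLOCKID10"))
```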

For data sources where properties are encoded as strings, the --coerce option will recast anything passed via --key to integers. Be careful with this option, as it involves changing data. It will fail (and stop plotting) if it encounters something that can't be coerced into an integer.
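To see why caution is warranted, consider how Python's built-in int() behaves. That dorchester coerces with int() is an assumption made for this sketch, and coerce_count is a hypothetical name, but the pitfalls apply to any integer coercion.

```python
def coerce_count(value):
    """Convert a property value to int; int() raises ValueError for bad strings."""
    return int(value)


clean = coerce_count("12")     # a string of digits converts cleanly
truncated = coerce_count(12.9)  # a float is silently truncated toward zero

# Strings that are not whole numbers cannot be coerced.
failed = []
for bad in ["12.5", "N", ""]:
    try:
        coerce_count(bad)
    except ValueError:
        failed.append(bad)

print(clean, truncated, failed)
```

The silent truncation case is the dangerous one: nothing fails, but the count no longer matches the source data.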

Use the --progress flag to show a progress bar. This is off by default.

Use -m or --multiprocessing to use Python's multiprocessing module to significantly speed up point generation. This will try to use every processor on your machine instead of just one.

Putting points on a map

For small-ish areas, QGIS will render lots of points just fine. Generate points, and load the output as a delimited or GeoJSON file.

To build an interactive dot density map, you can use tippecanoe to generate an MBTiles file, which can be uploaded to Mapbox (or possibly other hosting providers). This has worked for me:

tippecanoe -zg -o points.mbtiles --drop-densest-as-needed --extend-zooms-if-still-dropping points.csv

About the name

Dorchester is the largest and most diverse neighborhood in Boston, Massachusetts, and is often referred to as Dot.

The name is also a nod to Englewood, built by the Chicago Tribune News Apps team. This is, hopefully, a worthy successor.

Development

To contribute to this tool, first check out the code. Then create a new virtual environment:

cd dorchester
python -m venv .venv
source .venv/bin/activate

Or if you are using pipenv:

pipenv shell

Now install the dependencies, including those needed for testing:

pip install -e '.[test]'

To run the tests:

pytest
