Demographic mapping based on UK ONS & census data.

demography

This package implements a simple mechanism for quickly loading demographic data based on postcodes. It is currently implemented only for the UK, and it is based on data made available by the UK's Office for National Statistics (ONS).

The data was taken from Geoportal.

To see how this package works with pandas, jump to "Playing with pandas" below.

Getting started

You can install demography with:

pip install demography

The package has one main function, get, and it works like this:

import demography

demography.get("SW1A 0AA", using="groups")

You'll get something like:

['Cosmopolitans', 'Aspiring and affluent', 'Highly-qualified quaternary workers']

These are Classification for Output Areas (OAC) groups: demographic groupings provided by ONS for specific regions. The three names are the super group, group, and sub group, from broadest to most specific. If no OAC group can be found for the full postcode, the lookup falls back to the postcode prefix (i.e. area-level demographics). If that also returns no value, the value passed as the default parameter is returned.
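The fallback order described above can be sketched as follows. This is only an illustration, not the package's implementation: the two lookup tables, the function name lookup_oac, and the choice of "everything before the space" as the prefix are all assumptions, and the "LS1" entry is a made-up value.

```python
from typing import Optional

# Hypothetical toy tables; the real package ships ONS-derived data.
# The "SW1A 0AA" value comes from the README; the "LS1" value is made up.
FULL_POSTCODE_TO_OAC = {"SW1A 0AA": "2D2"}
PREFIX_TO_OAC = {"SW1A": "2D2", "LS1": "2B1"}


def lookup_oac(postcode: str, default: Optional[str] = None) -> Optional[str]:
    """Return an OAC code, trying full postcode, then prefix, then default."""
    # 1. Try the full postcode first.
    if postcode in FULL_POSTCODE_TO_OAC:
        return FULL_POSTCODE_TO_OAC[postcode]
    # 2. Fall back to the prefix (assumed here to be the part before the space).
    prefix = postcode.split(" ")[0]
    if prefix in PREFIX_TO_OAC:
        return PREFIX_TO_OAC[prefix]
    # 3. Nothing matched: return the caller-supplied default.
    return default
```

With these toy tables, a postcode that is missing from the full table but whose prefix is known still resolves, and an unknown postcode yields whatever was passed as default.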

You can also get the group codes:

demography.get("SW1A 0AA", using="oac")

And you'd get:

2D2

If you want to access the mappings between OAC codes and the groups together, you can use:

demography.groups("uk")

To give:

{'1A1': ['Rural residents', 'Farming communities', 'Rural workers and families'], '1A2': ['Rural residents', 'Farming communities', 'Established farming communities'] ...
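The keys in this mapping appear to be hierarchical: '1A1' and '1A2' share their first two names, which suggests that the first character identifies the super group, the first two characters the group, and all three the sub group. Assuming that structure holds for every code, a small helper (split_oac_code is a hypothetical name, not part of the package) can recover the three levels:

```python
from typing import Tuple


def split_oac_code(code: str) -> Tuple[str, str, str]:
    """Split a three-character OAC code into (super group, group, sub group) codes."""
    if len(code) != 3:
        raise ValueError(f"expected a three-character OAC code, got {code!r}")
    # Each level's code is a prefix of the next, more specific one.
    return code[0], code[:2], code


supergroup, group, subgroup = split_oac_code("2D2")
print(supergroup, group, subgroup)  # prints: 2 2D 2D2
```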

Finally, it can be useful to have these groups encoded as integers:

demography.get("SW1A 0AA", using="encoded_groups")

To give:

[30, 55, 59]

To retrieve the encodings themselves, you can use:

demography.encoded_groups("uk")

Validation

As an additional benefit, you can enable validation for postcodes with:

demography.get("SW1A 0AA", using="encoded_groups", validate=True)
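The README does not say which rules validate=True applies or what happens when a postcode fails them. As a rough idea of what a format check can look like, here is a sketch using a simplified pattern. The pattern and the name looks_like_uk_postcode are assumptions for illustration; the pattern accepts the common shapes but does not cover every special case, and it says nothing about whether a postcode actually exists.

```python
import re

# Simplified shape: outward code (1-2 letters, a digit, an optional letter or digit),
# an optional space, then inward code (a digit followed by two letters).
POSTCODE_PATTERN = re.compile(r"[A-Z]{1,2}[0-9][A-Z0-9]? ?[0-9][A-Z]{2}")


def looks_like_uk_postcode(text: str) -> bool:
    """Return True if text has the general shape of a UK postcode."""
    # Normalise case and surrounding whitespace before matching the whole string.
    return POSTCODE_PATTERN.fullmatch(text.strip().upper()) is not None
```

A check like this is cheap enough to run over a whole column before doing any lookups, so malformed entries can be reported separately from postcodes that simply have no data.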

Playing with pandas

You can use demography to encode pandas.DataFrame columns pretty easily too:

import pandas as pd
import demography as dm

df = pd.read_csv("my-dataset.csv")

# Get the encoded (super group, group, sub group) triple for each postcode.
data_gen = (dm.get(code, using="encoded_groups") for code in df["postcode"])

# Build a dataframe that shares the original frame's index, so rows line up when concatenating.
dm_df = pd.DataFrame(data=data_gen, columns=["super_group", "group", "sub_group"], index=df.index)

# Horizontally concatenate the groups dataframe onto your original frame.
df = pd.concat([df, dm_df], axis=1)

Alternatively, if you only need oac11 codes, you can use:

df["demographic"] = df["postcode"].apply(lambda postcode: dm.get(postcode))

In both examples, replace "postcode" with the name of the column that holds postcodes in your own dataset.
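Real datasets often repeat the same postcode many times. One possible optimisation, sketched below in plain Python, is to look up each distinct postcode once and reuse the result. The helper lookup_unique and the stand-in fake_lookup are hypothetical names used only for this illustration; in practice you would pass the package's own get function as the lookup.

```python
from typing import Callable, Iterable, List, Optional


def lookup_unique(
    postcodes: Iterable[str],
    lookup: Callable[[str], Optional[str]],
) -> List[Optional[str]]:
    """Call lookup once per distinct postcode and return results in input order."""
    postcodes = list(postcodes)
    # Build a table from each distinct postcode to its looked-up value.
    table = {postcode: lookup(postcode) for postcode in set(postcodes)}
    return [table[postcode] for postcode in postcodes]


# Stand-in for the real lookup that records how often it is called.
calls = []


def fake_lookup(postcode: str) -> Optional[str]:
    calls.append(postcode)
    return {"SW1A 0AA": "2D2"}.get(postcode)


codes = lookup_unique(["SW1A 0AA", "SW1A 0AA", "ZZ9 9ZZ"], fake_lookup)
```

Here fake_lookup is called twice for three rows. With pandas, the same idea can be expressed by building the table from the column's unique values and passing it to Series.map.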
