

DWP Harvester

This is a data harvester for the Department for Work and Pensions (DWP) Stat-Xplore system. See the Stat-Xplore Open Data API documentation. This API is based on the SuperSTAR 9.5 Open Data API by WingArc1st.

Installation

To install from the Python Package Index (PyPI):

pip install statxplore

Authentication

The harvester authenticates against the remote server using an access token.

To use the harvester, you must register an account on Stat-Xplore. Once logged in, click the menu button (the three dots in the top-right corner) and click "Account". The string under "Open Data API Access" is your API access key (token).

The token must be passed to the harvester either on the command line or from a file (the default path is ~/configs/stat_explore.txt), as shown below:

statxplore --api_key <my_access_token>
# or
statxplore --api_key_path ~/configs/stat_explore.txt
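
For scripts that call the library directly, the key file can also be read in Python. The sketch below is a minimal illustration, assuming the file contains only the token (possibly followed by a newline). `load_api_key` is a hypothetical helper and is not part of statxplore.

```python
from pathlib import Path

# Default key file location, as given in the text above
DEFAULT_KEY_PATH = Path("~/configs/stat_explore.txt")


def load_api_key(path=DEFAULT_KEY_PATH):
    """Read an access token from a text file (hypothetical helper).

    Assumes the file holds only the token; surrounding whitespace is stripped.
    """
    token = Path(path).expanduser().read_text(encoding="utf-8").strip()
    if not token:
        raise ValueError(f"No access token found in {path}")
    return token
```

The returned string can then be supplied wherever an access token is expected, for example as the api_key argument of http_session.StatSession.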

Usage

To view the available commands and options, run the following command:

statxplore --help

Querying

To get a query specification in JSON format, visit Stat-Xplore, log in and select a data set. Choose a table and click "Open Table". You may customise the rows and columns as needed. Next, click the "Download Table" field and select "Open Data API Query (.json)". The resulting JSON file can be used to define a query as shown below.

Run the following command to execute a query and output the result to a CSV file:

statxplore -o test.csv -q queries/relative-low-income-by-year-sheffield.json

Here, -o (--output) is the output CSV file path and -q (--query) is the path to the query JSON file.

To generate the CSV headers that will result from a particular query, use the --csv (-c) flag:

statxplore -o test.csv -q queries/relative-low-income-by-year-sheffield.json -c
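
To run several saved queries in one go, the command lines can be assembled from Python. The sketch below only builds the argument lists, using the -o and -q options described above. `build_commands` and the directory layout are hypothetical, chosen for illustration.

```python
from pathlib import Path


def build_commands(query_dir, output_dir):
    """Return one statxplore argument list per query file (hypothetical helper)."""
    commands = []
    # Sort so the order of commands is predictable across platforms
    for query_path in sorted(Path(query_dir).glob("*.json")):
        # Name each CSV after its query file, e.g. foo.json -> foo.csv
        output_path = Path(output_dir) / f"{query_path.stem}.csv"
        commands.append(
            ["statxplore", "-o", str(output_path), "-q", str(query_path)]
        )
    return commands
```

Each list could be passed to subprocess.run(command, check=True). The API key options from the Authentication section would need to be appended in the same way.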

Code documentation

An authenticated HTTP session is required to communicate with the API.

from statxplore import http_session

session = http_session.StatSession(api_key='<access_token>')
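
For orientation, the sketch below shows what an authenticated request to the Open Data API might look like without the harvester, using only the standard library. It assumes that the token is sent in an `APIKey` request header and that the API lives under `https://stat-xplore.dwp.gov.uk/webapi/rest/v1/`; both should be checked against the Stat-Xplore API documentation. `build_request` is a hypothetical helper and does not describe how StatSession is implemented.

```python
import urllib.request

# Assumed base URL of the Stat-Xplore Open Data API (check the API documentation)
BASE_URL = "https://stat-xplore.dwp.gov.uk/webapi/rest/v1/"


def build_request(endpoint, api_key):
    """Prepare (but do not send) a GET request carrying the access token."""
    return urllib.request.Request(
        BASE_URL + endpoint.lstrip("/"),
        headers={"APIKey": api_key},  # assumed header name
    )


request = build_request("/schema", "<access_token>")
print(request.full_url)
```

Sending the request would be a call to urllib.request.urlopen(request), which needs a valid token and network access.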

API objects

The subclasses of objects.StatObject are thin wrappers around the API endpoints. Please refer to the API documentation.

Schema

The /schema endpoint returns information about the Stat-Xplore datasets that are available to you, and their fields and measures.

The root endpoint, /schema, returns details of all datasets and folders at the root level of Stat-Xplore.

from statxplore import objects

# List all data schemas
objects.Schema.list(session)
# Get info about a schema
objects.Schema('str:folder:fuc').get(session)
# Get the schema of a specific table
objects.Schema('str:database:UC_Monthly').get(session)

Table examples

The /table endpoint allows you to submit table queries and receive the results. The body of the request contains your query.

# Retrieve the number of people on Universal Credit broken down by month
objects.Table('str:database:UC_Monthly').run_query(session,
    measures=['str:count:UC_Monthly:V_F_UC_CASELOAD_FULL'],
    dimensions=[['str:field:UC_Monthly:F_UC_DATE:DATE_NAME']],
)
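
The keyword arguments mirror the top-level fields of an Open Data API query body. As a rough illustration (not the library's internal code), the sketch below builds the equivalent JSON document with the standard library. `build_query` is a hypothetical helper.

```python
import json


def build_query(database, measures, dimensions):
    """Assemble a table query as a JSON string (hypothetical helper)."""
    return json.dumps(
        {"database": database, "measures": measures, "dimensions": dimensions},
        indent=2,
    )


body = build_query(
    database="str:database:UC_Monthly",
    measures=["str:count:UC_Monthly:V_F_UC_CASELOAD_FULL"],
    dimensions=[["str:field:UC_Monthly:F_UC_DATE:DATE_NAME"]],
)
print(body)
```

Note that dimensions is a list of lists, as in the examples in this section.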

It's also possible to use JSON to define a query. This is useful for replicating queries generated by the Stat-Xplore graphical user interface. (In Table View, go to Download Table and select "Open Data API Query (.json)" then click Go.)

query = """{
  "database" : "str:database:DLA_In_Payment_New",
  "measures" : [
    "str:count:DLA_In_Payment_New:V_F_DLA_In_Payment_New",  
    "str:statfn:DLA_In_Payment_New:V_F_DLA_In_Payment_New:CAWKLYAMT:MEAN" ],
  "dimensions" : [
    [ "str:field:DLA_In_Payment_New:V_F_DLA_In_Payment_New:COA_CODE" ],
    [ "str:field:DLA_In_Payment_New:F_DLA_QTR_New:DATE_NAME" ]
  ]
}"""
data = objects.Table.query_json(session, query)
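
Before submitting a hand-edited query, it can help to check that it parses and has the expected top-level fields. The sketch below is a minimal check, assuming the three keys seen in the example above (database, measures, dimensions) are required. `check_query` is a hypothetical helper, not part of statxplore.

```python
import json

# Assumed from the example query; the API may accept further optional keys
REQUIRED_KEYS = ("database", "measures", "dimensions")


def check_query(query_text):
    """Parse a query JSON string and confirm the expected keys are present."""
    query = json.loads(query_text)  # raises json.JSONDecodeError on bad syntax
    missing = [key for key in REQUIRED_KEYS if key not in query]
    if missing:
        raise ValueError(f"Query is missing keys: {', '.join(missing)}")
    return query


example = '{"database": "str:database:UC_Monthly", "measures": [], "dimensions": []}'
print(check_query(example)["database"])
```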



