DWP Harvester
This is a data harvester for the Department for Work and Pensions (DWP) Stat-Xplore system. See the Stat-Xplore Open Data API documentation for details. The API is based on the SuperSTAR 9.5 Open Data API by WingArc1st.
Installation
To install from the Python Package Index (PyPI):
pip install statxplore
Authentication
The harvester authenticates against the remote server using an access token.
To use the harvester you must register an account on Stat-Xplore. Once you are logged in, click the menu button (the three dots in the top-right corner) and click "Account". The string under "Open Data API Access" is your API access key (token).
Pass this token to the harvester either on the command line or from a file (the default path is ~/configs/stat_explore.txt), as shown below:
statxplore --api_key <my_access_token>
# or
statxplore --api_key_path ~/configs/stat_explore.txt
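When using the library from a script rather than the command line, the same token file can be read in Python. The following is a minimal sketch and not part of statxplore: the helper name load_api_key is hypothetical, and it assumes the file contains only the token, optionally surrounded by whitespace.

```python
from pathlib import Path

# Default token location mentioned above.
DEFAULT_KEY_PATH = Path("~/configs/stat_explore.txt")


def load_api_key(path=DEFAULT_KEY_PATH):
    """Read an API token from a text file (hypothetical helper).

    Assumes the file holds just the token; surrounding whitespace is stripped.
    """
    key = Path(path).expanduser().read_text(encoding="utf-8").strip()
    if not key:
        raise ValueError(f"No API key found in {path}")
    return key
```

The returned string could then be supplied wherever an access token is expected, which keeps the token out of shell history and source code.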
Usage
To view the available commands and options, run the following command:
statxplore --help
Querying
To get a query specification in JSON format, visit Stat-Xplore, log in and select a dataset. Choose a table and click "Open Table". You may customise the rows and columns as needed. Next, click "Download Table" and select "Open Data API Query (.json)". The resulting JSON file can be used to define a query, as shown below.
Run the following command to execute a query and output the result to a CSV file:
statxplore -o test.csv -q queries\relative-low-income-by-year-sheffield.json
Where -o (--output) is the output CSV file path and -q (--query) is the query JSON file.
To generate the CSV headers that will result from a particular query, use the --csv (-c) flag:
statxplore -o test.csv -q queries\relative-low-income-by-year-sheffield.json -c
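Before passing a downloaded file to -q, it can be useful to confirm that it parses as JSON and has the expected top-level keys. This is a small sketch, not a feature of statxplore: the helper name check_query is hypothetical, and the list of keys is taken from the example query later in this document rather than from a formal schema.

```python
import json

# Top-level keys that appear in the example query in this document.
REQUIRED_KEYS = ("database", "measures", "dimensions")


def check_query(text):
    """Parse query JSON text and return the names of any missing keys."""
    query = json.loads(text)  # raises json.JSONDecodeError on invalid JSON
    return [key for key in REQUIRED_KEYS if key not in query]
```

An empty list means all three keys are present; it does not guarantee that the server will accept the query.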
Code documentation
An authenticated HTTP session is required to communicate with the API.
from statxplore import http_session
session = http_session.StatSession(api_key='<access_token>')
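For orientation, the sketch below shows roughly what an authenticated request looks like without the wrapper, using only the standard library. It does not describe how StatSession is implemented. The base URL and the APIKey header name are assumptions drawn from the public Stat-Xplore Open Data API documentation, and build_request is a hypothetical helper; check the API documentation before relying on either.

```python
from urllib.request import Request

# Assumed from the Stat-Xplore Open Data API documentation, not from statxplore.
BASE_URL = "https://stat-xplore.dwp.gov.uk/webapi/rest/v1"


def build_request(api_key, endpoint="schema"):
    """Build (but do not send) a GET request carrying the access token."""
    url = f"{BASE_URL}/{endpoint.lstrip('/')}"
    # The token travels in a request header rather than in the URL.
    return Request(url, headers={"APIKey": api_key}, method="GET")
```

The request object can be passed to urllib.request.urlopen to send it; nothing is sent by the function itself.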
API objects
The subclasses of objects.StatObject are thin wrappers around the API endpoints. Please refer to the API documentation for details of each endpoint.
Schema
The /schema endpoint returns information about the Stat-Xplore datasets that are available to you, and their fields and measures.
The root endpoint, /schema, returns details of all datasets and folders at the root level of Stat-Xplore.
from statxplore import objects
# List all data schemas
objects.Schema.list(session)
# Get info about a schema
objects.Schema('str:folder:fuc').get(session)
# Get the schema of a specific table
objects.Schema('str:database:UC_Monthly').get(session)
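Schema information is hierarchical (folders contain datasets, which contain fields and measures), so a recursive walk is a natural way to list identifiers. The sketch below assumes the parsed response is a dictionary with "id", "label" and an optional "children" list; that shape is an assumption, as is the helper name iter_schema_ids. The sample data is invented for illustration (including the labels) and is not real API output.

```python
def iter_schema_ids(node, depth=0):
    """Yield (depth, id, label) for a schema node and its descendants."""
    yield depth, node.get("id"), node.get("label")
    for child in node.get("children", []):
        yield from iter_schema_ids(child, depth + 1)


# Invented sample shaped like the assumed response; not real API output.
sample = {
    "id": "str:folder:root",
    "label": "Root",
    "children": [
        {
            "id": "str:folder:fuc",
            "label": "Example folder",
            "children": [
                {"id": "str:database:UC_Monthly", "label": "Example dataset"},
            ],
        },
    ],
}

for depth, node_id, label in iter_schema_ids(sample):
    print("  " * depth + f"{node_id} ({label})")
```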
Table examples
The /table endpoint allows you to submit table queries and receive the results. The body of the request contains your query.
# Retrieve the number of people on Universal Credit broken down by month
objects.Table('str:database:UC_Monthly').run_query(
    session,
    measures=['str:count:UC_Monthly:V_F_UC_CASELOAD_FULL'],
    dimensions=[['str:field:UC_Monthly:F_UC_DATE:DATE_NAME']],
)
It's also possible to use JSON to define a query. This is useful for replicating queries generated by the Stat-Xplore graphical user interface. (In Table View, go to Download Table and select "Open Data API Query (.json)" then click Go.)
query = """{
"database" : "str:database:DLA_In_Payment_New",
"measures" : [
"str:count:DLA_In_Payment_New:V_F_DLA_In_Payment_New",
"str:statfn:DLA_In_Payment_New:V_F_DLA_In_Payment_New:CAWKLYAMT:MEAN" ],
"dimensions" : [
[ "str:field:DLA_In_Payment_New:V_F_DLA_In_Payment_New:COA_CODE" ],
[ "str:field:DLA_In_Payment_New:F_DLA_QTR_New:DATE_NAME" ]
]
}"""
data = objects.Table.query_json(session, query)
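A query string like the one above can also be assembled in Python rather than written by hand, which avoids quoting mistakes. This is a minimal sketch: build_query is a hypothetical helper, not part of statxplore, and it only covers the three keys used in the example above.

```python
import json


def build_query(database, measures, dimensions):
    """Return a query JSON string with the keys used in the example above."""
    return json.dumps(
        {"database": database, "measures": measures, "dimensions": dimensions},
        indent=2,
    )


# Identifiers reused from the Universal Credit example earlier in this section.
query = build_query(
    "str:database:UC_Monthly",
    ["str:count:UC_Monthly:V_F_UC_CASELOAD_FULL"],
    [["str:field:UC_Monthly:F_UC_DATE:DATE_NAME"]],
)
```

The resulting string has the same structure as the hand-written query and could be passed to objects.Table.query_json in the same way.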