Download CSO Ireland datasets and data catalogue as Pandas dataframes.
CSO Ireland Data
==============================
Easily download data from the CSO PxStat API as Pandas dataframes.
Uses requests-cache for super fast access to cached requests and easy persistence with multiple storage backends.
Installation
To install, just use pip:
pip install cso-ireland-data
Usage
Getting started
First, set up a CSODataSession.
- By default, this is really simple.
from cso_ireland_data import CSODataSession
cso = CSODataSession()
- If you want to add caching, no problem! All the functionality of the requests-cache package is available through cached_session_params.
from datetime import timedelta
from cso_ireland_data import CSODataSession
cso = CSODataSession(
    cached_session_params={
        "use_cache_dir": True,  # Save files in the default user cache dir
        "cache_control": True,  # Use Cache-Control response headers for expiration, if available
        "expire_after": timedelta(days=1),  # Otherwise expire responses after one day
    }
)
- Stuck behind a corporate firewall that causes SSL certificate issues? Also no problem! All the functionality of the requests get() method is available through request_params.
from cso_ireland_data import CSODataSession
# Tell requests.get() it's ok not to verify SSL certificates when getting data.
# !!! Only do this if you're absolutely sure it's what you need !!!
cso = CSODataSession(request_params={"verify": False})
Getting the data catalogue
To get a catalogue (Table of Contents) of all the datasets that are available through the API, use get_toc().
NB: Requests for the ToC sometimes time out on the CSO API. If this happens, try again!
cso.get_toc()
table_id | table_name | last_updated | copyright | exceptional | frequency | earliest | latest | variables |
---|---|---|---|---|---|---|---|---|
A0101 | 1996 Population and Percentage Change 1991 and 1996 | 2020-05-01 11:00:00+00:00 | Central Statistics Office, Ireland | False | CensusYear | 1996 | 1996 | ['Province County or City'] |
A0102 | Population at Each Census Since 1841 | 2020-05-01 11:00:00+00:00 | Central Statistics Office, Ireland | False | CensusYear | 1841 | 1996 | ['Province or County', 'Sex'] |
A0103 | Population | 2020-05-01 11:00:00+00:00 | Central Statistics Office, Ireland | False | CensusYear | 1996 | 1996 | ['Province County or City', 'Sex', 'Aggregate Town or Rural Area'] |
A0104 | Population | 2020-06-03 11:00:00+00:00 | Central Statistics Office, Ireland | False | CensusYear | 1996 | 1996 | ['Sex', 'Regional Authority'] |
A0105 | 1996 Population and Percentage Change 1996 and 2002 | 2021-07-19 11:00:00+00:00 | Central Statistics Office, Ireland | False | CensusYear | 1996 | 1996 | ['Towns by Electoral Division'] |
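Because ToC requests can time out, it may help to wrap the call in a small retry helper. The sketch below is not part of cso-ireland-data: call_with_retries is a hypothetical helper, and since the exception raised by a timed-out request is not documented here, the broad default for retry_on is an assumption you may want to narrow.

```python
import time


def call_with_retries(func, attempts=3, delay_seconds=2.0, retry_on=(Exception,)):
    """Call func() until it succeeds or the attempts run out.

    retry_on is the tuple of exception types worth retrying. The broad
    default is an assumption; narrow it once you know the real type.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the last error
            time.sleep(delay_seconds)


# Hypothetical usage with a session created as shown in "Getting started":
# toc = call_with_retries(cso.get_toc, attempts=5, delay_seconds=10)
```

If caching is enabled, a successful ToC response should then be served from the cache on later calls, so the retries are only paid for once.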
Getting a table using its ID code
To get the whole contents of a particular table hosted on the PxStat API, use get_table().
You just need to know the ID code of the table, which you can look up using get_toc().
wpm29 = cso.get_table("WPM29")
wpm29.head()
 | Wholesale Price Index (Excl VAT) for Energy Products |
---|---|
('Autodiesel', '2015M01') | 96.7 |
('Autodiesel', '2015M02') | 102 |
('Autodiesel', '2015M03') | 103 |
('Autodiesel', '2015M04') | 102.9 |
('Autodiesel', '2015M05') | 104.6 |
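The index labels above pair a product name with a month code such as '2015M01'. If you need real dates, for example to plot or filter by period, the codes can be parsed with the standard library. This is a minimal sketch that assumes monthly codes always follow the YYYY"M"MM pattern seen in this output; parse_month_code is a hypothetical helper, not a function of the package.

```python
from datetime import datetime


def parse_month_code(code):
    """Convert a month code such as '2015M01' into a datetime (first of the month)."""
    return datetime.strptime(code, "%YM%m")


# Index labels copied from the WPM29 output above.
labels = [("Autodiesel", "2015M01"), ("Autodiesel", "2015M02")]
parsed = [(product, parse_month_code(code)) for product, code in labels]
print(parsed[0])
```

Tables with other frequencies (quarterly, annual, census years) would need their own format strings.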
Getting some common tables quickly
The CSODataSession class includes some useful methods to get data from commonly accessed tables quickly.
Monthly Consumer Price Index (CPI)
By default, the monthly_cpi() method returns a single column corresponding to the 'All items' headline CPI in the source table.
Also by default, this index is re-normalized to the most recent month, so the latest value is always 1. You can turn this off by setting normalize_to_most_recent to False.
simple_cpi = cso.monthly_cpi()
simple_cpi.tail()
Month | All items |
---|---|
2022-04-01 00:00:00 | 0.9725 |
2022-05-01 00:00:00 | 0.981 |
2022-06-01 00:00:00 | 0.9937 |
2022-07-01 00:00:00 | 0.9986 |
2022-08-01 00:00:00 | 1 |
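The re-normalization can be pictured as dividing every value by the value for a chosen base month. The sketch below illustrates that arithmetic with the 'All items' figures shown above; rebase is a hypothetical helper, and treating the package's normalization as a plain division by the latest value is an assumption based on the output, not a description of its internals.

```python
def rebase(series, base_period=None):
    """Divide every value by the value at base_period (default: latest period)."""
    if base_period is None:
        base_period = max(series)  # ISO date strings sort chronologically
    base_value = series[base_period]
    return {period: value / base_value for period, value in series.items()}


# 'All items' values copied from the output above.
all_items = {
    "2022-04-01": 0.9725,
    "2022-05-01": 0.981,
    "2022-06-01": 0.9937,
    "2022-07-01": 0.9986,
    "2022-08-01": 1.0,
}

latest_based = rebase(all_items)               # unchanged: August 2022 is already 1
april_based = rebase(all_items, "2022-04-01")  # April 2022 becomes 1 instead
```

With a pandas Series the same idea is a one-liner, `series / series.iloc[-1]`, provided the series is sorted by date.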
It's also possible to pass a list of commodity groups:
commodity_group_cpi = cso.monthly_cpi(
commodity_groups=[
"All items",
"Alcoholic beverages and tobacco",
"Health",
"Recreation and culture",
]
)
commodity_group_cpi.tail()
Month | All items | Alcoholic beverages and tobacco | Health | Recreation and culture |
---|---|---|---|---|
2022-04-01 00:00:00 | 0.9725 | 0.9738 | 0.9828 | 0.9945 |
2022-05-01 00:00:00 | 0.981 | 0.9937 | 0.9851 | 0.9954 |
2022-06-01 00:00:00 | 0.9937 | 0.9958 | 0.9874 | 0.9973 |
2022-07-01 00:00:00 | 0.9986 | 0.9969 | 0.9874 | 0.9991 |
2022-08-01 00:00:00 | 1 | 1 | 1 | 1 |
Live Register
Use the live_register() method to get Live Register numbers (optionally broken down by Age Group and Sex) by month. This is a long data series, starting in April 1967 and still continuing every month, so it may be convenient to specify a start and/or end date for the data returned.
The Live Register data series is based on a monthly point-in-time count of people who have active Jobseeker claims with the Department of Social Protection (DSP), and these counts are extracted from DSP's administrative computer systems on a particular day every month.
Because of this, live_register() returns three possibly useful dates for each month:
- 'Month' is the index of the data frame. It's just the last day of each calendar month.
- 'reference_date' is the date of the point-in-time count of people with active Jobseeker claims. It's the last Friday of each month before May 2015, and the last Thursday of the month from then on.
- 'extract_date' is the date on which the source administrative data was actually extracted - it's always the Sunday after the reference_date.
from datetime import datetime
live_register = cso.live_register(start=datetime(2010, 1, 1))
 | Month | Age Group | Sex | Persons on the Live Register | Persons on the Live Register (Seasonally Adjusted) | reference_date | extract_date |
---|---|---|---|---|---|---|---|
516 | 2010-04-30 00:00:00 | All ages | Both sexes | 432657 | 440800 | 2010-04-30 00:00:00 | 2010-05-02 00:00:00 |
517 | 2010-08-31 00:00:00 | All ages | Both sexes | 466923 | 444000 | 2010-08-27 00:00:00 | 2010-08-29 00:00:00 |
518 | 2010-12-31 00:00:00 | All ages | Both sexes | 437079 | 446000 | 2010-12-31 00:00:00 | 2011-01-02 00:00:00 |
519 | 2010-02-28 00:00:00 | All ages | Both sexes | 436956 | 439000 | 2010-02-26 00:00:00 | 2010-02-28 00:00:00 |
520 | 2010-01-31 00:00:00 | All ages | Both sexes | 436936 | 439400 | 2010-01-29 00:00:00 | 2010-01-31 00:00:00 |
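The three dates follow simple calendar rules, which can be reproduced with the standard library. This is a sketch of the rules as described above, not the package's own code: the function names are hypothetical, and the May 2015 switch date is taken directly from the description.

```python
import calendar
from datetime import date, timedelta

# Assumption taken from the description above: the count moved from the last
# Friday to the last Thursday of the month starting with May 2015.
SWITCH_TO_THURSDAY = date(2015, 5, 1)


def last_weekday_of_month(year, month, weekday):
    """Return the last given weekday (Monday=0 ... Sunday=6) in a month."""
    last_day = date(year, month, calendar.monthrange(year, month)[1])
    days_back = (last_day.weekday() - weekday) % 7
    return last_day - timedelta(days=days_back)


def live_register_dates(year, month):
    """Return (month_end, reference_date, extract_date) for a calendar month."""
    month_end = date(year, month, calendar.monthrange(year, month)[1])
    if date(year, month, 1) < SWITCH_TO_THURSDAY:
        weekday = calendar.FRIDAY
    else:
        weekday = calendar.THURSDAY
    reference_date = last_weekday_of_month(year, month, weekday)
    # First Sunday strictly after the reference date.
    days_to_sunday = (calendar.SUNDAY - reference_date.weekday()) % 7 or 7
    extract_date = reference_date + timedelta(days=days_to_sunday)
    return month_end, reference_date, extract_date


print(live_register_dates(2010, 8))
```

For the 2010 months in the table above, these rules give the same reference_date and extract_date values as the package output.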
Life Tables
The life_table() method by default returns a complete life table for the most recent source data vintage.
life_table = cso.life_table()
life_table.head()
 | Ix | dx | px | qx | Lx | Tx | e0x |
---|---|---|---|---|---|---|---|
('Male', 101) | 851 | 616 | 0.724494 | 0.275506 | 543 | 1544 | 1.82 |
('Male', 102) | 616 | 440 | 0.714258 | 0.285742 | 396 | 1002 | 1.63 |
('Male', 103) | 440 | 307 | 0.69799 | 0.30201 | 287 | 605 | 1.38 |
('Male', 104) | 307 | 221 | 0.71911 | 0.28089 | 197 | 319 | 1.04 |
('Male', 105) | 221 | 198 | 0.896101 | 0.103899 | 122 | 122 | 0.55 |
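If the columns follow standard life-table notation (an assumption, since the column meanings are not spelled out here), two quick sanity checks are possible on the rows above: px and qx should sum to one, and Tx at one age should roughly equal Lx at that age plus Tx at the next age. A minimal sketch using the figures shown:

```python
# Rows copied from the output above: (age, px, qx, Lx, Tx) for males.
rows = [
    (101, 0.724494, 0.275506, 543, 1544),
    (102, 0.714258, 0.285742, 396, 1002),
    (103, 0.69799, 0.30201, 287, 605),
    (104, 0.71911, 0.28089, 197, 319),
    (105, 0.896101, 0.103899, 122, 122),
]

# Survival and death probabilities at each age should sum to one.
prob_gaps = [abs(px + qx - 1.0) for _, px, qx, _, _ in rows]

# Tx at one age should roughly equal Lx at that age plus Tx at the next age
# (only roughly, because the published figures are rounded).
tx_gaps = [
    abs(tx - (big_lx + next_row[4]))
    for (_, _, _, big_lx, tx), next_row in zip(rows, rows[1:])
]

print(max(prob_gaps), max(tx_gaps))
```

Checks like these are only a plausibility test on downloaded data; they say nothing about how the CSO constructs the table.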