ukds

A Python package for working with datasets from the UK Data Service (UKDS).

Any problems? Please raise an Issue on GitHub

To install:

pip install ukds

Quick Demo

(This demonstration uses the following dataset: Gershuny, J., Sullivan, O. (2017). United Kingdom Time Use Survey, 2014-2015. Centre for Time Use Research, University of Oxford. [data collection]. UK Data Service. SN: 8128, http://doi.org/10.5255/UKDA-SN-8128-1)

The following code reads a UK Data Service .tab data file and its associated .rtf data dictionary file, and converts them to a Pandas DataFrame:

import ukds
# Pass the paths of the .tab data file and the .rtf data dictionary file
dt=ukds.DataTable(fp_tab=r'.../uktus15_household.tab',
                  fp_dd=r'.../uktus15_household_ukda_data_dictionary.rtf')
df=dt.get_dataframe()

The DataFrame looks like this:

[Screenshot of the resulting DataFrame]

User Guide

The ukds package provides two classes, DataTable and DataDictionary:

The DataTable class

The DataTable class converts a UKDS .tab data file and .rtf data dictionary file into a single Pandas DataFrame ready for further analysis.

Importing the DataTable class

from ukds import DataTable

Creating an instance of DataTable and reading in the data file and the data dictionary file

Either:

dt=DataTable()
dt.read_tab(r'.../uktus15_household.tab')
dt.read_datadictionary(r'.../uktus15_household_ukda_data_dictionary.rtf')

or:

dt=DataTable(fp_tab=r'.../uktus15_household.tab',
             fp_dd=r'.../uktus15_household_ukda_data_dictionary.rtf')

Attributes

As the files are read in, a number of attributes are populated. These are:

dt.tab				# a pandas.DataFrame object
dt.datadictionary	# a ukds.DataDictionary object

get_dataframe method

The get_dataframe method converts the information in the tab and datadictionary attributes into a new pandas DataFrame.

df=dt.get_dataframe()
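
To give a feel for what combining a data file with its data dictionary involves, here is a dependency-free sketch that does not use ukds at all. It assumes a .tab file is tab-separated text with a header row. The sample rows, the variable example_var, and the helpers read_tab_text and label_columns are made up for illustration; this is not how get_dataframe is implemented, and its actual output may look different.

```python
import csv
import io

# Made-up stand-in for the contents of a .tab file (tab-separated, header row).
SAMPLE_TAB = "serial\texample_var\n1\t10\n2\t11\n"

# Made-up stand-in for variable labels taken from a data dictionary.
# 'example_var' is hypothetical; 'serial' follows the README's own example.
SAMPLE_LABELS = {"serial": "Household number", "example_var": "Hypothetical variable"}


def read_tab_text(text):
    """Parse tab-separated text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def label_columns(rows, labels):
    """Return rows keyed by variable label, falling back to the variable name."""
    return [
        {labels.get(name, name): value for name, value in row.items()}
        for row in rows
    ]


rows = read_tab_text(SAMPLE_TAB)
labelled = label_columns(rows, SAMPLE_LABELS)
print(labelled[0])
```

Note that the csv module reads every value as a string; a pandas-based reader would normally infer numeric types instead.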

See the datatable_demo.ipynb Jupyter Notebook in the 'demo' section for more information.

The DataDictionary class

The DataDictionary class provides access to UKDS .rtf data dictionary files.

Importing the DataDictionary class

from ukds import DataDictionary

Creating an instance of DataDictionary and reading in the data dictionary file

Either:

dd=DataDictionary()
dd.read_rtf(r'.../uktus15_household_ukda_data_dictionary.rtf')

or:

dd=DataDictionary(fp_dd=r'.../uktus15_household_ukda_data_dictionary.rtf')

Attributes

As the file is read in, a number of attributes are populated. These are:

dd.rtf				# a string of the raw contents of the rtf file
dd.variablelist		# a list of dictionaries with the variable information

get_variable_dict method

Returns a dictionary with the information for a single variable. For example:

serial=dd.get_variable_dict('serial')

returns:

{'pos': '1',
 'variable': 'serial',
 'variable_label': 'Household number',
 'variable_type': 'numeric',
 'SPSS_measurement_level': 'SCALE',
 'SPSS_user_missing_values': '',
 'value_labels': ''}
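
As a rough illustration of working with a dictionary of this shape, the sketch below looks up a variable by name in a plain list of dictionaries. It assumes each entry of dd.variablelist has the keys shown above. The helper find_variable and the second list entry are hypothetical, and this is not the package's own implementation of get_variable_dict.

```python
# Stand-in for dd.variablelist. The first entry is copied from the example
# above; the second is hypothetical and only makes the list non-trivial.
variablelist = [
    {'pos': '1',
     'variable': 'serial',
     'variable_label': 'Household number',
     'variable_type': 'numeric',
     'SPSS_measurement_level': 'SCALE',
     'SPSS_user_missing_values': '',
     'value_labels': ''},
    {'pos': '2',
     'variable': 'example_var',
     'variable_label': 'Hypothetical variable',
     'variable_type': 'numeric',
     'SPSS_measurement_level': 'NOMINAL',
     'SPSS_user_missing_values': '',
     'value_labels': ''},
]


def find_variable(variablelist, name):
    """Return the first entry whose 'variable' value equals name, else None."""
    for entry in variablelist:
        if entry['variable'] == name:
            return entry
    return None


serial = find_variable(variablelist, 'serial')
# 'pos' is stored as a string in the example above, so convert before using it.
position = int(serial['pos'])
print(serial['variable_label'], position)
```

Values such as 'pos' arrive as strings, so numeric use needs an explicit conversion as shown.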

get_variable_names method

Returns a list of the variable names:

dd.get_variable_names()
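
The list of names can be useful for selecting variables. The sketch below derives names from a list of dictionaries and filters them by measurement level. It assumes entries carry the 'variable' and 'SPSS_measurement_level' keys shown in the get_variable_dict example. The sample entries other than 'serial' and the helpers variable_names and names_with_level are hypothetical and are not part of ukds.

```python
# Stand-in for dd.variablelist, trimmed to the two keys this sketch needs.
# Only 'serial' comes from the README; the other entries are hypothetical.
variablelist = [
    {'variable': 'serial', 'SPSS_measurement_level': 'SCALE'},
    {'variable': 'example_a', 'SPSS_measurement_level': 'NOMINAL'},
    {'variable': 'example_b', 'SPSS_measurement_level': 'SCALE'},
]


def variable_names(variablelist):
    """Return the variable names in their original order."""
    return [entry['variable'] for entry in variablelist]


def names_with_level(variablelist, level):
    """Return the names of variables whose measurement level equals level."""
    return [
        entry['variable']
        for entry in variablelist
        if entry['SPSS_measurement_level'] == level
    ]


print(variable_names(variablelist))
print(names_with_level(variablelist, 'SCALE'))
```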

See the datadictionary_demo.ipynb Jupyter Notebook in the 'demo' section for more examples based on this class.
