Python client for accessing the COVID County Data database

covidcountydata

Welcome to the Python client library documentation for the Covid County Data (CCD) database.

Installation

The covidcountydata Python package is available on the Python Package Index (PyPI) and can be installed with pip.

pip install covidcountydata

API keys

Our data is free and open for anyone to use, and always will be. Our team agreed that this was central to our mission when we began this project. However, we find it useful to know who our users are and how they use the data, for two reasons:

  1. It helps us focus and improve the datasets that are seeing the most use.
  2. The number of users, as measured by active API keys, is one metric that we use to show that the project is useful when we are discussing additional grant funding.

We are grateful to everyone who is willing to register for and use their API key when interacting with our data.

You can register for an API key on our website or from the Python package using the register method.

from covidcountydata import Client

c = Client()
c.register()

You will be prompted for your email address. After you enter a valid email address, we will issue an API key, store it on your machine, and automatically apply it to all future requests made from Python to our servers.

If at any time you would like to remove your API key, please delete the file ~/.covidcountydata/apikey.
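If you prefer to remove the key from Python rather than by hand, a minimal sketch might look like the following. The helper name remove_api_key is hypothetical and not part of the covidcountydata package; the only detail taken from above is the file location ~/.covidcountydata/apikey.

```python
from pathlib import Path
from typing import Optional


def remove_api_key(home: Optional[Path] = None) -> bool:
    """Delete the stored API key file; return True if a file was removed.

    Hypothetical helper, not part of covidcountydata. `home` defaults to
    the current user's home directory.
    """
    base = Path.home() if home is None else home
    key_file = base / ".covidcountydata" / "apikey"
    if key_file.is_file():
        key_file.unlink()
        return True
    return False


# Usage: remove_api_key() deletes ~/.covidcountydata/apikey if it exists.
```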

Data

Datasets

To list the datasets available in our API from the Python library, run:

from covidcountydata import Client

c = Client()
print(c.datasets)

For more information on each of these datasets, we recommend that you visit our data documentation page.

Data keys

Many of the datasets in our database are indexed by one or more common "keys". These keys are:

  • vintage: The date and time that the data was downloaded into our database. We collect this because COVID-19 data evolves rapidly; it gives us a record of when data was changed, corrected, or updated.
  • dt: The date and time that an observation corresponds to. For series like COVID tests administered this may be a daily frequency, but for others, like unemployment, it may be a weekly or monthly frequency.
  • location: A geographic identifier for the location. For the counties/states in the dataset, this variable corresponds to the Federal Information Processing Standards (FIPS) code.

Whenever two series with common keys are loaded together, they will be merged on their common keys.
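To illustrate what merging on common keys means, here is a small pure-Python sketch. It is not the library's implementation (the merge happens for you when you fetch); the function name, the column names tests_total and total_population, and all values are made up for illustration.

```python
def merge_on_common_keys(left, right, keys=("vintage", "dt", "location")):
    """Inner-join two lists of row dicts on whichever key columns both share."""
    if not left or not right:
        return []
    # Only the keys present in both tables take part in the join.
    common = [k for k in keys if k in left[0] and k in right[0]]
    # Index the right-hand rows by their values on the shared keys.
    index = {}
    for row in right:
        index.setdefault(tuple(row[k] for k in common), []).append(row)
    merged = []
    for row in left:
        for match in index.get(tuple(row[k] for k in common), []):
            merged.append({**row, **match})
    return merged


# Toy data: covid rows carry dt and location, demographics only location.
covid = [
    {"dt": "2020-07-01", "location": 48453, "tests_total": 100},
    {"dt": "2020-07-01", "location": 6037, "tests_total": 250},
]
demographics = [
    {"location": 48453, "total_population": 1000},
]

# The only shared key is location, so the join keeps the 48453 row only.
rows = merge_on_common_keys(covid, demographics)
```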

Requesting data

Requesting data using the Python client library involves three steps:

1. Create a client

To create a client, use the Client class.

from covidcountydata import Client

c = Client()

You can optionally pass in an API key if you have one (see the section on API keys).

c = Client("my api key")

If you have previously registered for an API key on your current machine, it will be loaded and used automatically for you.

In practice you should rarely need to pass the API key by hand unless you are loading the key from an environment variable or another source.
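If you do keep the key in an environment variable, a sketch along these lines could load it. The variable name COVIDCOUNTYDATA_APIKEY and the helper load_api_key are assumptions made for this example; the package itself does not define them.

```python
import os


def load_api_key(env=None, name="COVIDCOUNTYDATA_APIKEY"):
    """Return the key stored in the given environment variable, or None.

    Hypothetical helper; the variable name is an assumption.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    return key or None


# Usage (requires the covidcountydata package to be installed):
# from covidcountydata import Client
# key = load_api_key()
# c = Client(key) if key else Client()
```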

2. Build a request

Each of the datasets in the API has an associated method on the client.

To add a dataset to the current request, call the corresponding Client.dataset() method. For example, to add the covid_us dataset to the request, you would call:

c.covid_us(state="CA")

If you wanted to add another dataset, such as demographics, you would simply call that method as well.

c.demographics()

Printing the client shows what the current request looks like:

print(c)

To clear the current request, use c.reset().

Because each dataset method adds to the client's request and returns the client itself, we can chain multiple calls together. For example, rather than running the separate commands above, we could have written:

c.covid_us(state="CA").demographics()
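Chaining works because each method returns the object it was called on. The toy class below sketches that pattern; RequestSketch and its add method are hypothetical stand-ins and not the real Client.

```python
class RequestSketch:
    """Toy request builder illustrating chaining; not the real Client."""

    def __init__(self):
        self.pending = {}

    def add(self, dataset, **filters):
        # Record the dataset and its filters on the pending request.
        self.pending[dataset] = filters
        # Returning self is what makes chained calls possible.
        return self

    def reset(self):
        self.pending = {}
        return self


r = RequestSketch()
r.add("covid_us", state="CA").add("demographics")
```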

Filtering data

Each of the dataset functions has a number of filters that can be applied.

These filters allow you to select certain rows and/or columns.

For example, in the above example we had c.covid_us(state="CA"). This instructs the client to only fetch data for geographic regions that are in the state of California.

NOTE: If a filter is passed to one dataset in the request but is applicable to other datasets in the request, it will be applied to all datasets.

For example, in c.covid_us(state="CA").demographics() we only specify a state filter on the covid_us dataset, but when the data is collected it will also be applied to demographics.

We do this because all requested datasets are combined with an inner join, so rows filtered out of covid_us by the state filter would be dropped from demographics anyway.
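The effect of the inner join can be seen in a few lines of plain Python. This is only an illustration with made-up rows and column names (cases, population); the actual filtering and joining are handled for you when the data is fetched.

```python
# Made-up rows for two datasets that share the location key.
covid_us = [
    {"location": 6037, "state": "CA", "cases": 10},
    {"location": 48453, "state": "TX", "cases": 7},
]
demographics = [
    {"location": 6037, "population": 500},
    {"location": 48453, "population": 300},
]

# Apply the state filter to covid_us only.
filtered = [row for row in covid_us if row["state"] == "CA"]

# An inner join keeps only locations present on both sides, so the
# Texas row disappears from demographics as well.
kept = {row["location"] for row in filtered}
joined_demographics = [row for row in demographics if row["location"] in kept]
```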

3. Fetch the data

To fetch the data, call the fetch method from the client.

df = c.fetch()

Note that after a successful request, the client is reset so there are no "built-up" requests remaining.

Examples

We provide a few simple examples here in the README, but you can find additional examples in the examples folder.

Simple Example: Single dataset for all FIPS

The example below loads all within-county mobility data.

import covidcountydata as ccd
c = ccd.Client()

c.mobility_devices()
df = c.fetch()

Simple Example: Single dataset for single county

The example below loads just demographic information for Travis County in Texas.

Notice that we can select a particular geography by specifying its FIPS code. We can do similar things for any of the keys listed previously.

c = ccd.Client()
c.demographics(location=48453)
df = c.fetch()

Simple Example: Single dataset for all counties in a state

The example below loads just demographic information for all counties in Texas.

Notice that we can select a whole state by specifying its state FIPS code. We can do similar things for any of the keys listed previously.

c = ccd.Client()
c.demographics(state=48)
df = c.fetch()
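The state code 48 used here is the leading part of the Travis County code 48453 from the previous example: a county FIPS code is a two-digit state code followed by a three-digit county code. The helper below is a hypothetical sketch (not part of the package) that splits an integer location into those two parts.

```python
def split_county_fips(location: int):
    """Split a county FIPS code into (state_fips, county_fips) integers.

    Hypothetical helper for illustration only.
    """
    if not 0 <= location <= 99999:
        raise ValueError("county FIPS codes have at most five digits")
    # Restore any leading zero dropped by the integer form, e.g. 6037 -> "06037".
    code = f"{location:05d}"
    return int(code[:2]), int(code[2:])


travis = split_county_fips(48453)
los_angeles = split_county_fips(6037)
```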

Simple Example: COVID data since July 1, 2020 for counties in Georgia

The example below shows how to filter on both a state and a range of dates, fetching only data for July 1st onwards.

c = ccd.Client()
c.covid_us(state="GA", dt=">=2020-07-01")
df = c.fetch()
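If the start date lives in a Python date object, the filter string can be built rather than typed by hand. The helper name on_or_after is hypothetical; it only reproduces the ">=YYYY-MM-DD" form shown above.

```python
from datetime import date


def on_or_after(day: date) -> str:
    """Build a dt filter string of the form '>=YYYY-MM-DD'."""
    return f">={day.isoformat()}"


dt_filter = on_or_after(date(2020, 7, 1))

# Usage (requires the covidcountydata package to be installed):
# c.covid_us(state="GA", dt=dt_filter)
```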

Intermediate Example: COVID data since July 1, 2020 for counties in Georgia with county info

This is the same example as above, but we also add on the us_counties endpoint. This will cause the returned DataFrame to have information on the counties including county name, state name, county area, county latitude, and county longitude.

c = ccd.Client()
c.covid_us(state="GA", dt=">=2020-07-01").us_counties()
df = c.fetch()

Intermediate Example: Multiple datasets for single county

The example below loads covid and demographic data and showcases how to chain calls to multiple datasets together. It will automatically merge and return these datasets.

Note that applying a filter to any of the datasets (in this case location=6037) will apply it to all datasets.

c = ccd.Client()
(
    c
    .covid_us(location=6037)
    .demographics()
)
df = c.fetch()

Advanced Example: Multiple datasets with multiple filters and variable selection

The example below loads data from three datasets for a particular FIPS code and selects certain variables from the datasets.

c = ccd.Client()
(
    c
    .economic_snapshots(variable="GDP_All industry total")
    .covid_us(location=6037)
    .demographics(variable="Total population")
)
df = c.fetch()

There are more examples in the covidcountydata/examples.py file. We encourage you to explore them and to reach out if you have questions!
