An interface for visualizing and analysing the see19 dataset

see19

An aggregation dataset and interface for visualizing and analyzing Coronavirus Disease 2019 aka COVID19 aka C19

Dataset Last Updated June 12, 2020

May 31, 2020 Update

Upgrade to version 0.3 is complete. Please exercise caution if switching to this version as there have been a number of significant changes / additions that might impact your prior work.

SUMMARY OF UPDATES

1. Testset Graduation

Test counts and Apple mobility data have been moved into the main dataset.
- Reporting on testing continues to be inconsistent around the world. Many countries have only just begun reporting, and many report infrequently (weekly or worse). Where there are gaps in daily figures, non-linear interpolation is used to smooth them. Several key regions, including Brazil and France, have very little data.
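
As a rough illustration of what filling gaps with a non-linear interpolation can look like, here is a minimal sketch that interpolates cumulative counts in log space. The function name fill_gaps_loglinear and the sample figures are hypothetical, and the text above does not say which interpolation method see19 actually uses, so treat this as one possible approach rather than the package's own.

```python
import math

def fill_gaps_loglinear(counts):
    """Fill None gaps between known cumulative counts.

    Interpolation is linear in log(1 + count), which makes it
    non-linear (geometric) in the counts themselves.
    Leading and trailing gaps are left as None.
    """
    filled = list(counts)
    known = [i for i, value in enumerate(filled) if value is not None]
    for left, right in zip(known, known[1:]):
        log_left = math.log1p(filled[left])
        log_right = math.log1p(filled[right])
        span = right - left
        for i in range(left + 1, right):
            fraction = (i - left) / span
            filled[i] = math.expm1(log_left + fraction * (log_right - log_left))
    return filled

# Hypothetical weekly reporting: only the first and last days are known
weekly_reports = [100, None, None, None, 800]
print([round(value) for value in fill_gaps_loglinear(weekly_reports)])
```

Working in log space suits cumulative series that grow roughly exponentially between reports; a straight line between the two known points would overstate the early days of the gap.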

2. Added filter functionality
When instantiating a CaseStudy instance:

- You can now pass any of region_id, region_code, or region_name to regions/exclude_regions in a single iterable. A region_code column has been added; it is either a replica of country_code or the accepted abbreviation of the province or state, e.g. Alberta's region_code is AB.
- country_code and country_id are now also accepted in countries/exclude_countries.
- pandas Series and numpy arrays are now accepted as iterables for these filters as well.
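
The idea of matching one iterable of mixed identifiers against several columns can be sketched in plain Python as below. This is only an illustration of the behaviour described above, not see19's implementation: filter_regions is a hypothetical helper, and every sample value other than Alberta's AB code is made up.

```python
def filter_regions(rows, regions=None, exclude_regions=None):
    """Keep rows that match `regions` and do not match `exclude_regions`.

    A row matches when its region_id, region_code, or region_name
    appears in the given iterable (list, tuple, set, Series, array, ...).
    """
    keys = ("region_id", "region_code", "region_name")

    def matches(row, wanted):
        return any(row[key] in wanted for key in keys)

    wanted = set(regions) if regions is not None else None
    unwanted = set(exclude_regions) if exclude_regions is not None else set()
    return [
        row for row in rows
        if (wanted is None or matches(row, wanted)) and not matches(row, unwanted)
    ]

# Hypothetical rows; the ids and country-level codes are invented for the example
rows = [
    {"region_id": 1, "region_code": "AB", "region_name": "Alberta"},
    {"region_id": 2, "region_code": "DEU", "region_name": "Germany"},
    {"region_id": 3, "region_code": "ESP", "region_name": "Spain"},
]

# A code, an id, and a name mixed in one iterable, plus an exclusion by code
selected = filter_regions(rows, regions=["AB", 2, "Spain"], exclude_regions=["ESP"])
print([row["region_name"] for row in selected])
```

Converting the inputs to a set is what lets any iterable of hashable values work the same way, which is in the spirit of accepting pandas Series and numpy arrays alongside lists.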

3. Miscellaneous

- To access the testset via get_baseframe, set test=True.
- Added a progress bar for get_baseframe() (a couple hours I won't ever get back).
- Added styling attributes to most chart make() functions.
- Added an exception that is raised when a country_w_sub is provided as a region while country_level=False.
- When USA is filtered via countries, see19 now automatically excludes the country of Georgia. This was a major personal irritant of mine, but if you need it you can simply include Georgia in countries as well.

Latest Analysis

How Effective Is Social Distancing?

What Factors Are Correlated With COVID19 Fatality Rates?

The COVID Dragons

The Dataset

The dataset is in csv format and can be found here

You can find relevant statistics and detailed sourcing in the Guide

The Package

The see19 package is available on PyPI and can be installed as follows:

pip install see19

The package provides a helpful pandas-based interface for working with the dataset. It also provides several visualization tools.

The Guide

The Guide details data sources, structure, functionality, and visualization tools.

Purpose

"It is better to be vaguely right than exactly wrong."

- Carveth Read, Logic, Chapter 22

see19 is an early-stage attempt to aggregate various data sources and analyze their impact (together and in isolation) on the virulence of SARS-CoV-2.

Ease of use is paramount. All data from all sources have therefore been compiled into a single structure that is readily consumed and manipulated in the ubiquitous csv format.

see19 aggregates the following data:

COVID19 Data Characteristics:
- Cumulative Case, Fatality, and Testing statistics for each region on each date
- State / Provincial-level data available for
Factor Data Characteristics available for most regions include:
- Longitude / Latitude, Population, Demographic Segmentation, Density
- Climate Characteristics including temperature and UVB radiation
- Historical Health Outcomes
- Travel Popularity
- Social Distancing Implementation
- And more and counting ...

There is no single all-encompassing dataset from an unimpeachable source that will serve the needs of every user for every use case. Thus, the dataset as it stands is an ad hoc aggregation from multiple sources, with eyeball-style approximations used in some instances. But while the dataset's imperfections are numerous, they cannot blunt the power of the insights that can be gleaned from an early exploratory analysis.

In addition to the dataset, see19 is a python package that provides:

- A helpful pandas-based interface for manipulating the data
- Visualization tools in bokeh and matplotlib to compare factors across multiple dimensions

Statistical analysis is also a goal of the project and I expect to add such analysis tools as time progresses. Until then, the data is available for all.

Suggestions For Additional Data

I am always on the hunt for new additions to the dataset. If you have any suggestions, please contact me. I am especially interested in datasets that might integrate nicely with see19 in the following areas:

- German daily, state-level counts
- Russian daily, state-level counts
- Indian daily, state-level counts
- State or city level travel data
- Global commercial airline route data (plenty seems to be available, but only at a whopping price)

Quick Demo

You can very quickly use see19 to develop visuals for COVID19 analysis and presentation.

The see19 package can be installed via pip.

pip install see19

Then simply:

# Required to use Bokeh with Jupyter notebooks
from bokeh.io import output_notebook, show
output_notebook()

from see19 import get_baseframe, CaseStudy
baseframe = get_baseframe()

regions = ['Germany', 'Spain']
casestudy = CaseStudy(baseframe, regions=regions, count_categories='deaths_new_dma_per_1M')

label_offsets = {'Germany': {'x_offset': 8, 'y_offset': 8}, 'Spain': {'x_offset': 5, 'y_offset': 5}}  
p = casestudy.comp_chart.make(comp_type='multiline', label_offsets=label_offsets, width=750)

show(p)

[Figure: Bokeh multiline chart comparing Germany and Spain]

%matplotlib inline

# Select the 20 most populous regions of Brazil
regions = list(baseframe[baseframe['country'] == 'Brazil'] \
    .sort_values(by='population', ascending=False) \
    .region_name.unique())[:20]

casestudy = CaseStudy(
    baseframe, count_dma=5, 
    factors=['temp'],
    regions=regions, start_hurdle=10, start_factor='cases', lognat=True,
)
kwargs = {
    'color_factor': 'temp',
    'fs_xticks': 16, 'fs_yticks': 12, 'fs_zticks': 12,
    'fs_xlabel': 12, 'fs_ylabel': 18, 'fs_zlabel': 18,
    'title': 'Daily Deaths in Brazil as of May 2',
    'x_title': 0.499, 'y_title': 0.738, 'fs_title': 22, 'rot_title': -9.5,
    'x_colorbar': 0.09, 'y_colorbar': .225, 'h_colorbar': 20, 'w_colorbar': .01, 
    'a_colorbar': 'vertical', 'cb_labelpad': -57,
    'tight': True, 'abbreviate': 'first', 'comp_size': 10,
}
p = casestudy.comp_chart4d.make(comp_category='deaths_new_dma_per_1M', **kwargs)

[Figure: matplotlib 4D chart, "Daily Deaths in Brazil as of May 2"]
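
The count category deaths_new_dma_per_1M used in both demos reads like "new deaths, daily moving average, per 1M people", with count_dma presumably setting the averaging window. That reading is an assumption based on the name alone. Under that assumption, the metric could be derived from a cumulative series roughly as follows; new_dma_per_1m and the sample numbers are hypothetical and this is not taken from the see19 source.

```python
def new_dma_per_1m(cumulative, population, window=5):
    """Turn cumulative counts into a trailing moving average of new counts per 1M people.

    Early days use a shorter window, since fewer daily values are available.
    """
    # Day-over-day differences give the new counts for each day
    daily = [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]

    # Trailing moving average over at most `window` days
    averaged = []
    for i in range(len(daily)):
        chunk = daily[max(0, i - window + 1): i + 1]
        averaged.append(sum(chunk) / len(chunk))

    # Scale to a rate per one million people
    return [value / population * 1_000_000 for value in averaged]

# Hypothetical cumulative deaths for a region of 2 million people
series = new_dma_per_1m([0, 10, 30, 60], population=2_000_000, window=2)
print(series)
```

Normalizing by population is what makes regions of very different sizes, such as the Brazilian states in the second demo, comparable on one axis.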

Release history

- 0.4rc0 (pre-release): Aug 2, 2020
- 0.4b0 (pre-release): Aug 2, 2020
- 0.4a0 (pre-release): Aug 2, 2020
- 0.3.5 (this version): Jun 12, 2020
- 0.3.3: Jun 7, 2020
- 0.3.2: May 31, 2020
- 0.3.1: May 31, 2020
- 0.3.0: May 31, 2020
- 0.2.0: May 11, 2020
- 0.1.8: May 8, 2020
- 0.1.7: May 6, 2020
- 0.1.6: May 4, 2020
- 0.1.5: May 4, 2020
- 0.1.4: May 4, 2020
- 0.1.3: May 4, 2020
- 0.1.0: Jun 13, 2020

Download files

Source Distribution

see19-0.3.5.tar.gz (29.7 kB view details)

Uploaded Jun 12, 2020 Source

Built Distribution

see19-0.3.5-py3-none-any.whl (39.4 kB view details)

Uploaded Jun 13, 2020 Python 3

File details

Details for the file see19-0.3.5.tar.gz.

File metadata

Download URL: see19-0.3.5.tar.gz
Upload date: Jun 12, 2020
Size: 29.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/46.1.3 requests-toolbelt/0.9.1 tqdm/4.46.0 CPython/3.7.7

File hashes

Hashes for see19-0.3.5.tar.gz
Algorithm	Hash digest
SHA256	`54580274c5309c711fa7ec7f84c35da2e768c29f66283cc9c409d276c0ed16b0`
MD5	`362a18ab8591ee2d1f59328866fed9a5`
BLAKE2b-256	`af0057ecb0505ad881617e5af491d9f41e6dcf3deed6796afae7c3190811e946`

File details

Details for the file see19-0.3.5-py3-none-any.whl.

File metadata

Download URL: see19-0.3.5-py3-none-any.whl
Upload date: Jun 13, 2020
Size: 39.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/46.1.3 requests-toolbelt/0.9.1 tqdm/4.46.0 CPython/3.7.7

File hashes

Hashes for see19-0.3.5-py3-none-any.whl
Algorithm	Hash digest
SHA256	`70258daf3c7c8cadc4bc4c0cb8e6a564b911d0f8c4e7e1e5d40ac3d3dbfc3c31`
MD5	`29238b2cf392f82c768de02d949c194f`
BLAKE2b-256	`5c93875cfa46df8bb7a0a4f2842b48a7f222619f9f177874d56ad27ecb98898a`
