Data Story Pattern Analysis for LOSD
Project description
DataStoryPatternLibrary
Data Story Patterns Library is a repository of pattern analyses designed for Linked Open Statistical Data (LOSD). The story patterns were retrieved from literature research under the general subject of "data journalism".
Installation
pip install datastories
Requirements will be installed automatically with the package.
Import/Usage
import datastories.analytical as patterns
patterns.DataStoryPattern(sparqlendpointurl, jsonmetadata)
The created object allows querying the SPARQL endpoint based on the JSON metadata provided.
Patterns Description
MCounting
Measurement and Counting: arithmetical operators applied to the whole dataset, giving basic information about the data.
Attributes
MCounting(self,cube="",dims=[],meas=[],hierdims=[],count_type="raw",df=pd.DataFrame() )
Parameter | Type | Description |
---|---|---|
cube | String | Cube whose dimensions and measures will be investigated |
dims | list[String] | List of dimensions (from the cube) to include in the investigation |
meas | list[String] | List of measures (from the cube) to include in the investigation |
hierdims | dict{hierdim:{"selected_level":[value]}} | Hierarchical dimension with the selected hierarchy level to include in the investigation |
count_type | String | Type of count to perform |
df | DataFrame | DataFrame object, if the data has already been retrieved from the endpoint |
Output
Based on count_type value
Count_type | Description |
---|---|
raw | data without any analysis performed |
sum | sum across all numeric columns |
mean | mean across all numeric columns |
min | minimum values from all numeric columns |
max | maximum values from all numeric columns |
count | number of records |
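The count_type options above can be illustrated with a minimal sketch in plain Python. This is not the library's implementation: `count_sketch`, the sample `rows`, and the column names are hypothetical, and the sketch aggregates only the measures it is given by name, whereas the table above describes aggregation across all numeric columns of a DataFrame.

```python
from statistics import mean

# Hypothetical sample rows standing in for data retrieved from a cube.
rows = [
    {"region": "North", "year": "2019", "population": 120.0},
    {"region": "South", "year": "2019", "population": 80.0},
    {"region": "North", "year": "2020", "population": 130.0},
    {"region": "South", "year": "2020", "population": 90.0},
]


def count_sketch(rows, meas, count_type="raw"):
    """Illustrative stand-in for the count_type options (assumed semantics)."""
    if count_type == "raw":
        # Data without any analysis performed.
        return rows
    if count_type == "count":
        # Number of records.
        return len(rows)
    aggregators = {"sum": sum, "mean": mean, "min": min, "max": max}
    if count_type not in aggregators:
        raise ValueError(f"unknown count_type: {count_type!r}")
    aggregate = aggregators[count_type]
    # Apply the chosen aggregate to each named measure column.
    return {m: aggregate([row[m] for row in rows]) for m in meas}


total = count_sketch(rows, ["population"], "sum")
average = count_sketch(rows, ["population"], "mean")
```

Here `total` and `average` are dictionaries keyed by measure name, which loosely mirrors a one-row summary per numeric column.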
LTable
League Table: sorting the data and extracting a specific number of records.
Attributes
LTable(self,cube=[],dims=[],meas=[],hierdims=[], columns_to_order="", order_type="asc", number_of_records=20,df=pd.DataFrame())
Parameter | Type | Description |
---|---|---|
cube | String | Cube whose dimensions and measures will be investigated |
dims | list[String] | List of dimensions (from the cube) to include in the investigation |
meas | list[String] | List of measures (from the cube) to include in the investigation |
hierdims | dict{hierdim:{"selected_level":[value]}} | Hierarchical dimension with the selected hierarchy level to include in the investigation |
columns_to_order | list[String] | Set of columns to order by |
order_type | String | Type of order (asc/desc) |
number_of_records | Integer | Number of records to retrieve |
df | DataFrame | DataFrame object, if the data has already been retrieved from the endpoint |
Output
Based on order_type value
Order_type | Description |
---|---|
asc | ascending order based on columns provided in columns_to_order |
desc | descending order based on columns provided in columns_to_order |
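The league table idea (sort, then keep the first records) can be sketched in plain Python as follows. The function `league_table_sketch` and the sample data are hypothetical and only mirror the parameter names listed above; the library itself is described as working on a DataFrame, so this is an illustration of the behaviour rather than its actual code.

```python
# Hypothetical sample rows standing in for data retrieved from a cube.
rows = [
    {"region": "North", "year": "2019", "population": 120.0},
    {"region": "South", "year": "2019", "population": 80.0},
    {"region": "North", "year": "2020", "population": 130.0},
    {"region": "South", "year": "2020", "population": 90.0},
]


def league_table_sketch(rows, columns_to_order, order_type="asc",
                        number_of_records=20):
    """Sort rows by the given columns and keep the first N (assumed semantics)."""
    if order_type not in ("asc", "desc"):
        raise ValueError(f"unknown order_type: {order_type!r}")
    ordered = sorted(
        rows,
        # Sort by a tuple so several columns act as primary, secondary, ... keys.
        key=lambda row: tuple(row[column] for column in columns_to_order),
        reverse=(order_type == "desc"),
    )
    # Keep only the requested number of records.
    return ordered[:number_of_records]


# The two records with the highest population.
top_two = league_table_sketch(rows, ["population"], "desc", 2)
```

Sorting on a tuple of column values is one simple way to support several ordering columns at once; whether the library applies a single direction to all columns is an assumption here.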
InternalComparison
Internal Comparison: comparison of the numeric values associated with the textual values within one column.
Attributes
InternalComparison(self,cube="",dims=[],meas=[],hierdims=[],df=pd.DataFrame(), dim_to_compare="",meas_to_compare="",comp_type="")
Parameter | Type | Description |
---|---|---|
cube | String | Cube whose dimensions and measures will be investigated |
dims | list[String] | List of dimensions (from the cube) to include in the investigation |
meas | list[String] | List of measures (from the cube) to include in the investigation |
hierdims | dict{hierdim:{"selected_level":[value]}} | Hierarchical dimension with the selected hierarchy level to include in the investigation |
df | DataFrame | DataFrame object, if the data has already been retrieved from the endpoint |
dim_to_compare | String | Dimension whose values will be investigated |
meas_to_compare | String | Measure whose numeric values, related to dim_to_compare, will be processed |
comp_type | String | Type of comparison to perform |
Output
Regardless of the comp_type selected, the output data will have an additional column containing the numeric column meas_to_compare processed in a specific way.
Available types of comparison comp_type
Comp_type | Description |
---|---|
diffmax | difference with max value related to specific textual value |
diffmean | difference with arithmetic mean related to specific textual values |
diffmin | difference with minimum value related to specific textual value |
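One plausible reading of these comparison types is sketched below in plain Python: rows are grouped by the value of dim_to_compare, a reference value (max, mean, or min of meas_to_compare) is computed per group, and each row receives the difference between its own value and that reference. The function `internal_comparison_sketch`, the sample data, and the choice to name the new column after comp_type are all assumptions made for illustration, not the library's confirmed behaviour.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows standing in for data retrieved from a cube.
rows = [
    {"region": "North", "year": "2019", "population": 120.0},
    {"region": "South", "year": "2019", "population": 80.0},
    {"region": "North", "year": "2020", "population": 130.0},
    {"region": "South", "year": "2020", "population": 90.0},
]


def internal_comparison_sketch(rows, dim_to_compare, meas_to_compare, comp_type):
    """Add a column with the difference to a per-group reference (assumed semantics)."""
    reference_functions = {"diffmax": max, "diffmean": mean, "diffmin": min}
    if comp_type not in reference_functions:
        raise ValueError(f"unknown comp_type: {comp_type!r}")

    # Collect the measure values for each textual value of the dimension.
    groups = defaultdict(list)
    for row in rows:
        groups[row[dim_to_compare]].append(row[meas_to_compare])

    # Reference value (max, mean, or min) per textual value.
    reference = {
        value: reference_functions[comp_type](numbers)
        for value, numbers in groups.items()
    }

    # Return copies of the rows with one additional column (name is hypothetical).
    return [
        {**row, comp_type: row[meas_to_compare] - reference[row[dim_to_compare]]}
        for row in rows
    ]


compared = internal_comparison_sketch(rows, "region", "population", "diffmean")
```

With this reading, a row exactly at its group's reference value gets a difference of zero, and the sign shows whether the row lies above or below the reference.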