Data Story Pattern Analysis for LOSD

DataStoryPatternLibrary

Data Story Patterns Library is a repository of pattern analyses designed for Linked Open Statistical Data (LOSD). The story patterns were drawn from literature research on the general subject of "data journalism".

Installation

pip install datastories

Requirements are installed automatically with the package.

Import/Usage

import datastories.analytical as patterns

patterns.DataStoryPattern(sparqlendpointurl, jsonmetadata)

The created object lets you query the SPARQL endpoint based on the JSON metadata provided.

Patterns Description

MCounting

Measurement and Counting: arithmetic operators applied to the whole dataset, giving basic information about the data.

Attributes

MCounting(self,cube="",dims=[],meas=[],hierdims=[],count_type="raw",df=pd.DataFrame() )

| Parameter | Type | Description |
|---|---|---|
| cube | String | Cube whose dimensions and measures will be investigated |
| dims | list[String] | List of dimensions (from cube) to include in the analysis |
| meas | list[String] | List of measures (from cube) to include in the analysis |
| hierdims | dict{hierdim:{"selected_level":[value]}} | Hierarchical dimension with the selected hierarchy level to include in the analysis |
| count_type | String | Type of count to perform |
| df | DataFrame | DataFrame object, if the data has already been retrieved from the endpoint |

Output

The output depends on the value of count_type:

| Count_type | Description |
|---|---|
| raw | data without any analysis performed |
| sum | sum across all numeric columns |
| mean | mean across all numeric columns |
| min | minimum values from all numeric columns |
| max | maximum values from all numeric columns |
| count | number of records |
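
To make the count_type options concrete, here is a minimal sketch in plain Python. It is not the library's implementation: the function `m_counting`, the list-of-dicts row format, and the sample column names are all assumptions made for illustration, and the library itself works with pandas DataFrames.

```python
from statistics import mean


def m_counting(rows, count_type="raw"):
    """Hypothetical stand-in that mimics the documented count_type options."""
    reducers = {"sum": sum, "mean": mean, "min": min, "max": max}
    if count_type == "raw":
        return rows  # data without any analysis performed
    if count_type == "count":
        return len(rows)  # number of records
    if count_type not in reducers:
        raise ValueError(f"unknown count_type: {count_type!r}")
    # Treat a column as numeric when every value is an int or float (not bool).
    numeric_cols = [
        col
        for col in (rows[0] if rows else {})
        if all(
            isinstance(r[col], (int, float)) and not isinstance(r[col], bool)
            for r in rows
        )
    ]
    reducer = reducers[count_type]
    return {col: reducer([r[col] for r in rows]) for col in numeric_cols}


# Made-up sample data; the column names are illustrative only.
rows = [
    {"region": "North", "population": 100, "households": 40},
    {"region": "South", "population": 200, "households": 90},
    {"region": "East", "population": 150, "households": 50},
]

print(m_counting(rows, "sum"))
print(m_counting(rows, "count"))
```

The textual column `region` is skipped by every aggregate, which matches the "all numeric columns" wording in the table above.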

LTable

LeagueTable: sorting the data and extracting a specified number of records.

Attributes

LTable(self,cube=[],dims=[],meas=[],hierdims=[], columns_to_order="", order_type="asc", number_of_records=20,df=pd.DataFrame())

| Parameter | Type | Description |
|---|---|---|
| cube | String | Cube whose dimensions and measures will be investigated |
| dims | list[String] | List of dimensions (from cube) to include in the analysis |
| meas | list[String] | List of measures (from cube) to include in the analysis |
| hierdims | dict{hierdim:{"selected_level":[value]}} | Hierarchical dimension with the selected hierarchy level to include in the analysis |
| columns_to_order | list[String] | Set of columns to order by |
| order_type | String | Type of order (asc/desc) |
| number_of_records | Integer | Number of records to retrieve |
| df | DataFrame | DataFrame object, if the data has already been retrieved from the endpoint |

Output

The output depends on the value of order_type:

| Order_type | Description |
|---|---|
| asc | ascending order based on columns provided in columns_to_order |
| desc | descending order based on columns provided in columns_to_order |
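
The league-table idea (sort, then keep the first number_of_records rows) can be sketched in a few lines of plain Python. The function `league_table` and the sample data below are hypothetical and only mirror the documented parameters; they do not reproduce the library's code, which operates on pandas DataFrames.

```python
def league_table(rows, columns_to_order, order_type="asc", number_of_records=20):
    """Hypothetical stand-in: order rows by the given columns and keep the top N."""
    if order_type not in ("asc", "desc"):
        raise ValueError(f"unknown order_type: {order_type!r}")
    ordered = sorted(
        rows,
        # Sort by the listed columns in order; later columns break ties.
        key=lambda r: tuple(r[col] for col in columns_to_order),
        reverse=(order_type == "desc"),
    )
    return ordered[:number_of_records]


# Made-up sample data; the column names are illustrative only.
rows = [
    {"county": "A", "unemployed": 30},
    {"county": "B", "unemployed": 10},
    {"county": "C", "unemployed": 20},
]

top_two = league_table(rows, ["unemployed"], order_type="desc", number_of_records=2)
print([r["county"] for r in top_two])
```

With `order_type="desc"` the rows with the largest values come first, so the slice returns the two highest-ranked counties.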

InternalComparison

InternalComparison: comparison of the numeric values associated with the textual values within one column.

Attributes

InternalComparison(self,cube="",dims=[],meas=[],hierdims=[],df=pd.DataFrame(), dim_to_compare="",meas_to_compare="",comp_type="")

| Parameter | Type | Description |
|---|---|---|
| cube | String | Cube whose dimensions and measures will be investigated |
| dims | list[String] | List of dimensions (from cube) to include in the analysis |
| meas | list[String] | List of measures (from cube) to include in the analysis |
| hierdims | dict{hierdim:{"selected_level":[value]}} | Hierarchical dimension with the selected hierarchy level to include in the analysis |
| df | DataFrame | DataFrame object, if the data has already been retrieved from the endpoint |
| dim_to_compare | String | Dimension whose values will be investigated |
| meas_to_compare | String | Measure whose numeric values, related to dim_to_compare, will be processed |
| comp_type | String | Type of comparison to perform |

Output

Regardless of the comp_type selected, the output data contains an additional column holding the values of the numeric column meas_to_compare, processed according to that comp_type.

Available types of comparison (comp_type):

| Comp_type | Description |
|---|---|
| diffmax | difference with max value related to specific textual value |
| diffmean | difference with arithmetic mean related to specific textual values |
| diffmin | difference with minimum value related to specific textual value |
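
One possible reading of these comparison types is sketched below in plain Python. Several points are assumptions rather than documented behaviour: that the reference value (max, mean or min) is computed per distinct value of dim_to_compare, that the difference is taken as row value minus reference, and that the extra column is named after comp_type. The function `internal_comparison` and the sample data are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def internal_comparison(rows, dim_to_compare, meas_to_compare, comp_type):
    """Hypothetical stand-in: add a column with the difference to a group statistic."""
    stats = {"diffmax": max, "diffmean": mean, "diffmin": min}
    if comp_type not in stats:
        raise ValueError(f"unknown comp_type: {comp_type!r}")
    # Collect the measure values for each distinct value of the dimension.
    groups = defaultdict(list)
    for r in rows:
        groups[r[dim_to_compare]].append(r[meas_to_compare])
    reference = {key: stats[comp_type](values) for key, values in groups.items()}
    # Assumed sign convention: row value minus the group's reference value.
    return [
        {**r, comp_type: r[meas_to_compare] - reference[r[dim_to_compare]]}
        for r in rows
    ]


# Made-up sample data; the column names are illustrative only.
rows = [
    {"region": "North", "income": 10},
    {"region": "North", "income": 30},
    {"region": "South", "income": 20},
]

result = internal_comparison(rows, "region", "income", "diffmax")
print([r["diffmax"] for r in result])
```

Under these assumptions, a row that holds its group's maximum gets a diffmax of 0 and every other row gets a negative value.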
