PySRAG

This Python package provides tools for analyzing and processing data on Severe Acute Respiratory Syndrome (SARS; in Portuguese SRAG, "Síndrome Respiratória Aguda Grave") and other respiratory viruses. It includes functions for data preprocessing, feature engineering, and training Gradient Boosting Models (GBMs) for binary or multiclass classification.

Getting Started

These instructions will help you get started with the PySRAG package.

Prerequisites

Before you begin, ensure you have met the following requirements:

  • Python 3 installed
  • Required Python packages (you can install them using pip):
    • pandas==1.5.3
    • numpy==1.23.5
    • scikit-learn==1.2.2
    • lightgbm==4.0.0
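
Because the versions above are pinned exactly, it can help to confirm what is actually installed before running the package. The following is a minimal sketch using only the standard library; the helper name `check_pins` is hypothetical and not part of PySRAG, and the version numbers are copied from the list above.

```python
from importlib.metadata import PackageNotFoundError, version

# Pinned versions copied from the prerequisites list above.
PINNED = {
    "pandas": "1.5.3",
    "numpy": "1.23.5",
    "scikit-learn": "1.2.2",
    "lightgbm": "4.0.0",
}


def check_pins(pins):
    """Return {package: (expected, installed)} for each missing or mismatched package.

    Hypothetical helper for illustration; `installed` is None when the
    package is not installed at all.
    """
    problems = {}
    for name, expected in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        if installed != expected:
            problems[name] = (expected, installed)
    return problems


if __name__ == "__main__":
    for name, (expected, installed) in check_pins(PINNED).items():
        print(f"{name}: expected {expected}, found {installed or 'not installed'}")
```

An empty result means every pinned package is present at the listed version.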

Installation

You can install the PySRAG package using pip:

pip install PySRAG

Usage

Here's an example of how to use the PySRAG package:

from pysrag.data import SRAG
from pysrag.model import GBMTrainer

# from https://opendatasus.saude.gov.br/dataset/srag-2021-a-2024
filepath = 'https://s3.sa-east-1.amazonaws.com/ckan.saude.gov.br/SRAG/2023/INFLUD23-16-10-2023.csv' 

# Initialize the SRAG class
srag = SRAG(filepath)

# Generate training data
inputs = [
    'REGIAO_LATITUDE', 'REGIAO_LONGITUDE', 'UF_LATITUDE',
    'UF_LONGITUDE', 'LATITUDE', 'LONGITUDE', 'POPULACAO', 'IDADE_ANO',
    'ANO_SEM_SIN_PRI',
]
target = ['POS_SARS2', 'POS_FLUA', 'POS_FLUB', 'POS_VSR']
residual_viruses = ['POS_PARA1', 'POS_PARA2', 'POS_PARA3', 'POS_PARA4',
                    'POS_ADENO', 'POS_METAP', 'POS_BOCA', 'POS_RINO', 'POS_OUTROS']

X, y = srag.generate_training_data(
    objective='multiclass',
    cols_X=inputs,
    col_y=target,
    residual_viruses=residual_viruses,
)

# Train a Gradient Boosting Model
trainer = GBMTrainer(objective='multiclass', eval_metric='multi_logloss')
trainer.fit(X, y)

# Get prevalences (predicted class probabilities: one row per case, one column per class)
trainer.model.predict_proba(X)
# Example output:
array([[9.73523796e-05, 8.91182790e-04, 1.21236644e-01, 8.64260161e-01, 1.35146598e-02],
       [4.71281550e-03, 2.36337464e-05, 9.59325690e-01, 2.72306200e-02, 8.70724046e-03],
       [6.95816743e-04, 3.35154571e-05, 2.81288034e-04, 9.98876481e-01, 1.12898420e-04],
       ...,
       [4.62475587e-03, 2.82325172e-03, 3.81832162e-03, 1.39748287e-01, 8.48985384e-01],
       [4.62475587e-03, 2.82325172e-03, 3.81832162e-03, 1.39748287e-01, 8.48985384e-01],
       [1.13695780e-02, 1.17825387e-03, 1.04659501e-02, 9.74318052e-01, 2.66816576e-03]])
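
The probability matrix above can be summarized in two common ways: averaging each column to estimate the share of each class, and taking the highest-probability column per row as the predicted class. The sketch below does both in plain Python on the first three rows of the output shown above. The helper names are hypothetical and not part of PySRAG, and the mapping from column index to virus is not stated here; if `trainer.model` follows the scikit-learn classifier convention (an assumption), its `classes_` attribute would normally give that order.

```python
# First three rows of the predict_proba output shown above.
proba = [
    [9.73523796e-05, 8.91182790e-04, 1.21236644e-01, 8.64260161e-01, 1.35146598e-02],
    [4.71281550e-03, 2.36337464e-05, 9.59325690e-01, 2.72306200e-02, 8.70724046e-03],
    [6.95816743e-04, 3.35154571e-05, 2.81288034e-04, 9.98876481e-01, 1.12898420e-04],
]


def column_means(rows):
    """Average each class column over all rows (estimated class share)."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def most_likely_class(rows):
    """Index of the highest-probability column for each row."""
    return [max(range(len(row)), key=row.__getitem__) for row in rows]


prevalence = column_means(proba)   # one value per class column
labels = most_likely_class(proba)  # one column index per row
print(labels)  # [3, 2, 3]
```

With the full NumPy array returned by `predict_proba`, the same summaries are `proba.mean(axis=0)` and `proba.argmax(axis=1)`.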

Web Application

The PySRAG package includes a web application that allows users to interactively explore data related to Severe Acute Respiratory Syndrome (SARS) in Brazil. This web-based interface provides a practical way for users to visualize data without needing deep technical knowledge of Python or the underlying code.

Accessing the Web Application

To access the web application, visit:

PySRAG Web App

This link will take you to a hosted version of our application, equipped with preloaded data and features for easy exploration.

Features

The web application offers the following features:

  • Data Visualization: Interactive graphs display processed data, giving insights into the distribution of respiratory viruses.
  • Data Filtering: Users can apply filters based on city and patient age to narrow down the data and focus on specific demographics or regions.

How to Use

  1. Navigate to the Dashboard: Start on the dashboard, which provides an overview of the visualizations.
  2. Apply Filters: Use the filtering options to select specific cities or age ranges to view customized data visualizations.
  3. Explore Visualizations: Interact with the visual data representations to gain deeper insights into the trends and patterns.
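
For users who prefer to work in Python, the city and age filters described above can be approximated with a small function. This is only a sketch of the idea, not the web application's code: the `city` key and the `filter_records` helper are hypothetical, `IDADE_ANO` is the age column used in the usage example, and the records are made-up placeholders rather than real surveillance data.

```python
def filter_records(records, cities=None, min_age=None, max_age=None):
    """Keep records in the selected cities and within the inclusive age range.

    Any filter left as None is not applied.
    """
    selected = []
    for rec in records:
        if cities is not None and rec["city"] not in cities:
            continue
        age = rec["IDADE_ANO"]
        if min_age is not None and age < min_age:
            continue
        if max_age is not None and age > max_age:
            continue
        selected.append(rec)
    return selected


# Made-up illustrative records; "city" is a hypothetical field name.
records = [
    {"city": "A", "IDADE_ANO": 4},
    {"city": "B", "IDADE_ANO": 67},
    {"city": "A", "IDADE_ANO": 35},
]

# Cases from city "A" aged 12 or younger.
children_in_a = filter_records(records, cities={"A"}, max_age=12)
```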

Support

If you encounter any issues while using the web application or have suggestions for improvements, please submit an issue on our GitHub page.

This web application is designed to make the data analysis capabilities of the PySRAG package accessible to both technical and non-technical users, enhancing understanding and facilitating research on respiratory viruses.
