Socio4health is a Python package for gathering and harmonizing socio-demographic data.
Project description
socio4health
Overview
socio4health is an extraction, transformation, and loading (ETL) and AI-assisted query and visualization (AI QV) tool. It is designed to simplify the intricate process of collecting and merging data from multiple sources, focusing on sociodemographic and census datasets from Colombia, Brazil, and Peru, into a unified relational database structure, and to let you visualize or query that data using natural language.
- Seamlessly retrieve data from online data sources through web scraping, as well as from local files.
- Support various data formats, including .csv, .xlsx, .xls, .txt, .sav, and compressed files, ensuring versatility in sourcing information.
- Consolidate extracted data into pandas DataFrames.
- Consolidate transformed data into a cohesive relational database.
- Conduct precise queries and apply transformations to meet specific criteria.
- Query data using natural language input (answers range from single values to subsets).
- Create simple visualizations of data using natural language input.
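The extract, consolidate, and query steps listed above can be pictured with a small stand-alone sketch. It does not use socio4health and makes no claim about its internals; it relies only on Python's standard library, and the table name, column names, and sample rows are invented for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for a downloaded .csv file.
SAMPLE_CSV = (
    "region,year,population\n"
    "region_a,2017,100\n"
    "region_b,2017,250\n"
    "region_a,2018,110\n"
)


def extract_rows(csv_text):
    """Extract: parse CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def load_rows(conn, rows):
    """Load: write the parsed rows into a relational table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS census "
        "(region TEXT, year INTEGER, population INTEGER)"
    )
    conn.executemany(
        "INSERT INTO census (region, year, population) "
        "VALUES (:region, :year, :population)",
        rows,
    )
    conn.commit()


def total_population(conn, year):
    """Query: sum the population column for one year."""
    cur = conn.execute(
        "SELECT SUM(population) FROM census WHERE year = ?", (year,)
    )
    return cur.fetchone()[0]


conn = sqlite3.connect(":memory:")
load_rows(conn, extract_rows(SAMPLE_CSV))
print(total_population(conn, 2017))
```

The in-memory SQLite database keeps the sketch free of side effects; a real pipeline would write to a file on disk instead.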
Dependencies
Installation
You can install the latest release of the package from PyPI using pip:
# Install using pip
pip install socio4health
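After installing, it can be useful to confirm which version ended up in the active environment. The following is a minimal sketch using the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical, and the distribution name `socio4health` is taken from the release file names listed on this page.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string, or None if the distribution is missing."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints a version string if the package is installed, otherwise None.
print(installed_version("socio4health"))
```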
How to Use it
To use the socio4health package, follow these steps:
- Import the package in your Python script:
  from socio4health import Harmonizer
- Create an instance of the Harmonizer class:
  harmonizer = Harmonizer()
- Extract data from online sources and create a list of data information:
  url = 'https://www.example.com'
  depth = 0
  ext = 'csv'
  list_datainfo = harmonizer.extract(url=url, depth=depth, ext=ext)
  harmonizer = Harmonizer(list_datainfo)
- Load the data from the list of data information and merge it into a relational database:
  results = harmonizer.load()
- Import the modifier module and create an instance of the Modifier class:
  from socio4health.db.modifier import Modifier
  modifier = Modifier(db_path='../../data/output/nyctibius.db')
- Perform modifications:
  tables = modifier.get_tables()
  print(tables)
- Import the querier module and create an instance of the Querier class:
  from socio4health.db.querier import Querier
  querier = Querier(db_path='data/output/socio4health.db')
- Perform queries:
  df = querier.select(table="Estructura CHC_2017").execute()
  print(df)
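The `.db` files referenced in the steps above can also be inspected independently of the package. The sketch below assumes the output is a SQLite database, which this page does not state explicitly; the helpers `list_tables` and `preview_table` are hypothetical and are not part of socio4health. The demo builds a throwaway database with invented columns so that it runs without any prior output.

```python
import os
import sqlite3
import tempfile


def list_tables(db_path):
    """Return the names of all tables in a SQLite database file."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


def preview_table(db_path, table, limit=5):
    """Return up to `limit` rows from a table whose name may contain spaces."""
    # Quote the identifier so names such as "Estructura CHC_2017" are valid SQL.
    quoted = '"' + table.replace('"', '""') + '"'
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute(
            f"SELECT * FROM {quoted} LIMIT ?", (limit,)
        ).fetchall()
    finally:
        conn.close()


# Demo on a temporary database (assumed schema, for illustration only).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.db")
    conn = sqlite3.connect(demo_path)
    conn.execute('CREATE TABLE "Estructura CHC_2017" (id INTEGER, label TEXT)')
    conn.execute('INSERT INTO "Estructura CHC_2017" VALUES (1, ?)', ("example",))
    conn.commit()
    conn.close()
    tables = list_tables(demo_path)
    rows = preview_table(demo_path, "Estructura CHC_2017")

print(tables)
print(rows)
```

Pointing `list_tables` at an existing output file, if it is indeed SQLite, gives a quick cross-check of what `modifier.get_tables()` reports.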
Resources
Package Website
The socio4health package website includes a function reference, a model outline, and case studies using the package. The site mainly concerns the release version, but you can also find documentation for the latest development version.
Organisation Website
Harmonize is an international project that develops cost-effective and reproducible digital tools for stakeholders in hotspots affected by a changing climate in Latin America & the Caribbean (LAC), including cities, small islands, highlands, and the Amazon rainforest.
The project consists of resources and tools developed in conjunction with different teams from Brazil, Colombia, Dominican Republic, Peru and Spain.
Organizations
Authors / Contact information
Diego Irreño (developer)
Erick Lozano (developer)
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file socio4health-0.1.0.tar.gz.
File metadata
- Download URL: socio4health-0.1.0.tar.gz
- Upload date:
- Size: 29.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.12.2
File hashes
Algorithm | Hash digest
---|---
SHA256 | 970b7fc385f08b310179e9047ea90e31384001eab9a4e10f50db29d994066ef7
MD5 | 865e19a329dfafb59e514da36cf11251
BLAKE2b-256 | 8dab41070877e8edd6eda517002ba88315820b8a68ca1958664df0acd800e142
File details
Details for the file socio4health-0.1.0-py3-none-any.whl.
File metadata
- Download URL: socio4health-0.1.0-py3-none-any.whl
- Upload date:
- Size: 31.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.12.2
File hashes
Algorithm | Hash digest
---|---
SHA256 | 0e9a174fb9dd4096661f59ed7b8d07c2d1196e3de9afc2f5ef4d383d82bcb48e
MD5 | edbd380e72265e07cbfcac195dbdfa99
BLAKE2b-256 | 56f8983dbb892fa885c11a737fd15e83df17dd9ec50266fe5a986f6738e8f517
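The SHA256 digests listed for each file can be used to check a manually downloaded copy. The following is a minimal sketch with the standard library's `hashlib`; the helper names are hypothetical, the expected digests are copied from the tables on this page, and the local file path is whatever location you saved the download to.

```python
import hashlib

# SHA256 digests as published on this page for release 0.1.0.
EXPECTED_SHA256 = {
    "socio4health-0.1.0.tar.gz":
        "970b7fc385f08b310179e9047ea90e31384001eab9a4e10f50db29d994066ef7",
    "socio4health-0.1.0-py3-none-any.whl":
        "0e9a174fb9dd4096661f59ed7b8d07c2d1196e3de9afc2f5ef4d383d82bcb48e",
}


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Return True if the file's SHA256 digest equals the expected value."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_expected("socio4health-0.1.0.tar.gz", EXPECTED_SHA256["socio4health-0.1.0.tar.gz"])` should return True for an unmodified download saved in the current directory.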