Socio4health is a Python package for gathering and consolidating socio-demographic data.
socio4health
Overview
socio4health is an extraction, transformation, and loading (ETL) and classification tool that simplifies collecting and merging data from multiple sources into a harmonized dataset. It focuses on sociodemographic and census datasets from Colombia, Brazil, and Peru.
- Seamlessly retrieve data from online data sources through web scraping, as well as from local files.
- Support for various data formats, including .csv, .xlsx, .xls, .txt, .sav, fixed-width files, and geospatial files, ensuring versatility in sourcing information.
- Consolidating extracted data into a pandas (or Dask) DataFrame.
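The consolidation idea in the last bullet can be illustrated with plain pandas. This is only a minimal sketch and not how socio4health is implemented internally: the in-memory sources, column names, and values are made up for illustration, and the fixed-width layout (two 10-character columns) is an assumption.

```python
import io

import pandas as pd

# Two tiny in-memory "files" standing in for real sources (hypothetical data).
csv_source = io.StringIO("region,population\nNorth,1200\nSouth,800\n")

# Build a fixed-width table with two 10-character columns.
fwf_lines = [
    f"{'region':<10}{'population':>10}",
    f"{'East':<10}{950:>10}",
    f"{'West':<10}{400:>10}",
]
fwf_source = io.StringIO("\n".join(fwf_lines) + "\n")

# Read each source with the pandas reader that matches its format.
csv_df = pd.read_csv(csv_source)
fwf_df = pd.read_fwf(fwf_source, widths=[10, 10])

# Stack the per-source tables into one DataFrame with a fresh index.
consolidated = pd.concat([csv_df, fwf_df], ignore_index=True)
print(consolidated)
```

Both sources share the same column names here, so `pd.concat` lines them up directly; real datasets usually need column renaming or recoding before they can be stacked this way.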
Dependencies
Installation
socio4health can be installed via pip from PyPI.
# Install using pip
pip install socio4health
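To confirm that the installation succeeded, the installed version can be queried from Python with the standard library. This is a small sketch; the helper name `installed_version` is hypothetical and is not part of socio4health.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Prints a version string if socio4health is installed, otherwise None.
print(installed_version("socio4health"))
```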
How to Use it
To use the socio4health package, follow these steps:
- Import the package in your Python script:
    from socio4health import Extractor
    from socio4health import Harmonizer
- Create an instance of the Extractor class:
    extractor = Extractor()
- Extract data from online sources and create a list of data information:
    url = 'https://www.example.com'
    depth = 0
    ext = 'csv'
    list_datainfo = extractor.s4h_extract(url=url, depth=depth, ext=ext)
    harmonizer = Harmonizer()
For more detailed examples and use cases, please refer to the socio4health documentation.
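To give a feel for what the `url`, `depth`, and `ext` arguments describe, the sketch below collects links with a given extension from a single HTML page using only the standard library. It does not reproduce socio4health's scraping logic: `LinkCollector` and `find_file_links` are hypothetical names, the HTML is sample data, and reading `depth = 0` as "inspect only the starting page" is an assumption.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def find_file_links(html, base_url, ext):
    """Return absolute URLs of links in `html` that end with the extension `ext`."""
    collector = LinkCollector()
    collector.feed(html)
    suffix = "." + ext.lower().lstrip(".")
    return [
        urljoin(base_url, href)
        for href in collector.hrefs
        if href.lower().endswith(suffix)
    ]


# Sample page content (hypothetical); no network access is needed.
sample_html = """
<a href="/data/census_2020.csv">Census 2020</a>
<a href="/about.html">About</a>
<a href="https://www.example.com/files/survey.CSV">Survey</a>
"""
links = find_file_links(sample_html, "https://www.example.com", "csv")
print(links)
```

The extension match is case-insensitive and relative links are resolved against the base URL, so both `.csv` and `.CSV` files are returned as absolute URLs.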
Resources
Package Website
The socio4health package website includes the API reference, a user guide, and examples. The site mainly covers the release version, but you can also find documentation for the latest development version.
Organisation Website
Harmonize is an international project that develops cost-effective and reproducible digital tools for stakeholders in Latin America and the Caribbean (LAC) affected by a changing climate. These stakeholders include cities, small islands, highlands, and the Amazon rainforest.
The project consists of resources and tools developed in conjunction with different teams from Brazil, Colombia, Dominican Republic, Peru, and Spain.
Organizations
Authors / Contact information
Here is the contact information of authors/contributors in case users have questions or feedback.
Diego Irreño (developer)
Erick Lozano (developer)
Juan Montenegro (developer)
Ingrid Mora (documentation)
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file socio4health-1.0.7.tar.gz.
File metadata
- Download URL: socio4health-1.0.7.tar.gz
- Upload date:
- Size: 38.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d39127e754905865826a542c0a7ff293f3af153301a281a0d4ce8199f1e79e88 |
| MD5 | 406c2d54c944d5f796b3951664ad1a20 |
| BLAKE2b-256 | 6064583f929f66d7ab2afdd84105f760445b4b92b511347095523d317f1a6018 |
File details
Details for the file socio4health-1.0.7-py3-none-any.whl.
File metadata
- Download URL: socio4health-1.0.7-py3-none-any.whl
- Upload date:
- Size: 31.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4d0ee915f6eb4972ed86b50ba3d018e2bdab7f2851b760e7a17098d1e4249480 |
| MD5 | 61d66914f89b69866825cfbb9ef4a6ad |
| BLAKE2b-256 | 21e29bb009f49825eb3ba6d2fd74a191efa6a7959cb9aeab7bc6e286e9aa33e0 |