
Nyctibius is a Python package for gathering and consolidating socio-demographic data.


Nyctibius - Streamlining sociodemographic data harmonization.


Nyctibius is a Python package that streamlines gathering sociodemographic data from various sources and consolidating it into a single relational database. It lets users combine custom data sets from different sociodemographic sources so that they can work with up-to-date, comprehensive information. The result is a harmonized repository of sociodemographic data that simplifies data management and analysis across domains.

Features

  • Seamlessly retrieve data from online data sources through web scraping.
  • Effortlessly extract data from diverse sources, consolidating it into a cohesive relational database.
  • Conduct precise queries and apply transformations to meet specific criteria.
  • Effectively manage data inconsistencies and discrepancies for enhanced accuracy.
  • Support various data formats, including .csv, .xlsx, .xls, .txt, and .zip files, ensuring versatility in sourcing information.
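
To illustrate what consolidating flat files into a relational database means, here is a minimal sketch that uses only the Python standard library. It is not Nyctibius's implementation: the helper `load_csv_into_table`, the sample data, and the table and column names are all hypothetical, and SQLite is assumed as the database engine purely for illustration.

```python
import csv
import io
import sqlite3


def load_csv_into_table(conn, table_name, csv_text):
    """Create a table from CSV text and insert every row (all columns stored as TEXT)."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    # Quote identifiers so that names containing spaces remain valid SQL.
    quoted_table = '"' + table_name.replace('"', '""') + '"'
    columns = ", ".join('"' + c.replace('"', '""') + '" TEXT' for c in header)
    conn.execute(f"CREATE TABLE IF NOT EXISTS {quoted_table} ({columns})")
    placeholders = ", ".join("?" for _ in header)
    conn.executemany(f"INSERT INTO {quoted_table} VALUES ({placeholders})", reader)
    conn.commit()
    return header


# Hypothetical sample data standing in for two downloaded files.
households_csv = "household_id,region\n1,Bogota\n2,Medellin\n"
persons_csv = "person_id,household_id,age\n10,1,34\n11,1,5\n12,2,61\n"

conn = sqlite3.connect(":memory:")
load_csv_into_table(conn, "households", households_csv)
load_csv_into_table(conn, "persons", persons_csv)

# Once both files live in one database, they can be joined with plain SQL.
rows = conn.execute(
    "SELECT h.region, COUNT(*) FROM persons AS p "
    "JOIN households AS h ON p.household_id = h.household_id "
    "GROUP BY h.region ORDER BY h.region"
).fetchall()
print(rows)
```

The point of the sketch is the end state: separate files become tables in one database, so relationships between them can be queried directly.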

Installation

For full documentation, please refer to the Nyctibius documentation.

You can install the Nyctibius package using pip. The package requires Python 3.7 or higher.

pip install nyctibius
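
If you are unsure whether your interpreter meets the version requirement, a small check like the following can be run first. The helper name `meets_minimum` is hypothetical; only `sys.version_info` from the standard library is used.

```python
import sys

MIN_VERSION = (3, 7)  # minimum Python version stated in this README


def meets_minimum(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version is at least the minimum."""
    return tuple(version_info[:2]) >= minimum


if meets_minimum():
    print("Python version OK:", sys.version.split()[0])
else:
    print("Nyctibius requires Python 3.7 or higher")
```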

Usage

To use the Nyctibius package, follow these steps:

  1. Import the package in your Python script:

    from nyctibius import Harmonizer
    
  2. Create an instance of the Harmonizer class:

    harmonizer = Harmonizer()
    
  3. Extract data from online sources and create a list of data information:

    url = 'https://www.example.com'
    depth = 0
    ext = 'csv'
    list_datainfo = harmonizer.extract(url=url, depth=depth, ext=ext)
    harmonizer = Harmonizer(list_datainfo)
    
  4. Load the data from the list of data information and merge it into a relational database:

    results = harmonizer.load()
    
  5. Import the modifier module and create an instance of the Modifier class:

    from nyctibius.db.modifier import Modifier
    # Point db_path at the same database file used in the other steps; adjust it to your working directory.
    modifier = Modifier(db_path='data/output/nyctibius.db')
    
  6. Perform modifications:

    tables = modifier.get_tables()
    print(tables)
    
  7. Import the querier module and create an instance of the Querier class:

    from nyctibius.db.querier import Querier
    querier = Querier(db_path='data/output/nyctibius.db')
    
  8. Perform queries:

    df = querier.select(table="Estructura CHC_2017").execute()
    print(df)
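
The resulting database can also be inspected without Nyctibius. The sketch below assumes the `.db` file is a SQLite database, which this README does not state explicitly, so treat that as an assumption. It uses an in-memory stand-in with hypothetical columns; the helpers `list_tables` and `quote_identifier` are illustrative and not part of the package.

```python
import sqlite3


def list_tables(conn):
    """Return the names of all tables in a SQLite database, sorted alphabetically."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


def quote_identifier(name):
    """Quote a table name so that spaces or quotes in it remain valid SQL."""
    return '"' + name.replace('"', '""') + '"'


# In-memory stand-in for the real file. Assuming the output is SQLite, you would
# use sqlite3.connect('data/output/nyctibius.db') instead.
conn = sqlite3.connect(":memory:")
# Hypothetical columns; the real table structure depends on the source data.
conn.execute('CREATE TABLE "Estructura CHC_2017" (id INTEGER, valor TEXT)')
conn.executemany(
    'INSERT INTO "Estructura CHC_2017" VALUES (?, ?)', [(1, "a"), (2, "b")]
)

tables = list_tables(conn)
count = conn.execute(
    f"SELECT COUNT(*) FROM {quote_identifier(tables[0])}"
).fetchone()[0]
print(tables, count)
```

Table names that contain spaces, such as "Estructura CHC_2017", must be quoted when used in raw SQL, which is why the sketch includes a quoting helper.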
    

Supported Data Sources

The package supports the following sources:

  • Colombian microdata links from the National Administrative Department of Statistics (DANE)
  • Local files
  • Open data sources

Please note that accessing data from these organizations may require authentication or specific credentials. Make sure you have the necessary permissions before using the library.

License

The Nyctibius package is open-source and released under the MIT License. Feel free to use, modify, and distribute this library in accordance with the terms of the license.

Acknowledgements

We would like to thank the following entities for providing the data used and the financial support for the development of this package:

  • National Administrative Department of Statistics (DANE)
  • Barcelona Supercomputing Center (BSC)
  • Universidad de los Andes

Contact

For any questions, suggestions, or feedback regarding the package, please contact:

Erick Lozano, Email: es.lozano@uniandes.edu.co

Diego Irreño, Email: dirreno@unal.edu.co

Disclaimer

This library is not officially affiliated with or endorsed by any of the mentioned official organizations. The data provided by this library is sourced from publicly available information and may not always reflect the most current or accurate data. Please verify the information with the respective official sources for critical use cases.
