Commands to download datasets from the Open Data portal of the Kingdom of Saudi Arabia (KSA).
Open Data - Kingdom of Saudi Arabia
This repository contains scripts to download datasets from the Open Data portal of the Kingdom of Saudi Arabia (KSA). The main script downloads all datasets for a given organization ID and saves them locally.
Directory Structure
.
├── README.md
├── download_all_org.py
├── opendata (optional - parameter in file)
├── requirements.txt
├── open_ksa
│ ├── download_file.py
│ ├── organizations.py
│ ├── get_dataset_resources.py
│ └── get_org_resources.py
└── system-drawing.excalidraw
Functions Overview
- organizations(): Get the organization information, including the option to write to a target file as a CSV or JSON.
- get_org_resources(org_id): Retrieves the organization name, organization ID, and dataset IDs for the specified organization.
- get_dataset_resources(dataset_ids, allowed_exts=['csv', 'xlsx', 'xls'], output_dir='opendata/org_resources', verbose=False): Downloads all data resources for the specified dataset IDs.
- download_file(session, url, headers, file_path): Downloads a file from the specified URL using the provided session and headers.
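The allowed_exts parameter suggests that resources are filtered by file extension before they are downloaded. The sketch below shows one way such a filter could work, assuming each resource is identified by its download URL. The helper filter_by_extension and the example.org URLs are hypothetical and are not part of open_ksa.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def filter_by_extension(urls, allowed_exts=("csv", "xlsx", "xls")):
    """Keep only URLs whose path ends in one of the allowed extensions.

    Hypothetical helper for illustration; not the open_ksa implementation.
    """
    allowed = {ext.lower().lstrip(".") for ext in allowed_exts}
    kept = []
    for url in urls:
        # Inspect the URL path only, so a query string cannot hide the suffix.
        suffix = PurePosixPath(urlparse(url).path).suffix
        if suffix.lower().lstrip(".") in allowed:
            kept.append(url)
    return kept


urls = [
    "https://example.org/files/population.csv",
    "https://example.org/files/report.pdf",
    "https://example.org/files/budget.XLSX?version=2",
]
print(filter_by_extension(urls))
```

The comparison is case-insensitive, so a resource published as budget.XLSX is kept while report.pdf is skipped.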
Process Flow
graph TD
subgraph Initialization
A[Start] --> B[Create SSL Adapter]
B --> C[Setup Session]
end
subgraph Resource Extraction
C --> D[Extract Organization ID]
D --> E[Extract Dataset IDs]
end
subgraph Directory Setup
E --> F[Create Directory for Organization]
end
subgraph Data Download
F --> G[Download Dataset Resources]
G --> H[Save Data Locally]
end
H --> I[End]
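The extraction, directory setup, and download stages of the flow above can be sketched as a single function. This is an illustration under stated assumptions, not the code in download_all_org.py: the name download_org, the two callables it receives, and the fake stand-ins are all hypothetical, and the session and SSL adapter setup is left out so the example runs offline.

```python
import tempfile
from pathlib import Path


def download_org(org_id, list_dataset_ids, download_dataset, base_dir="opendata"):
    """Run the extraction, directory-setup, and download stages for one organization."""
    # Resource extraction: look up the dataset IDs for the organization.
    dataset_ids = list_dataset_ids(org_id)

    # Directory setup: one folder per organization, created if missing.
    org_dir = Path(base_dir) / org_id
    org_dir.mkdir(parents=True, exist_ok=True)

    # Data download: hand each dataset to the downloader and collect saved paths.
    saved = []
    for dataset_id in dataset_ids:
        saved.extend(download_dataset(dataset_id, org_dir))
    return saved


# Hypothetical stand-ins for the real network calls.
def fake_list_dataset_ids(org_id):
    return ["dataset-a", "dataset-b"]


def fake_download_dataset(dataset_id, org_dir):
    path = org_dir / f"{dataset_id}.csv"
    path.write_text("id,value\n1,42\n")
    return [path]


with tempfile.TemporaryDirectory() as tmp:
    files = download_org(
        "example-org", fake_list_dataset_ids, fake_download_dataset, base_dir=tmp
    )
    print([f.name for f in files])
```

Passing the lookup and download steps in as callables keeps the stages separate, which also makes each one easy to replace with a stub in tests.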
Usage
To run the script, first create and activate a virtual environment, then install the dependencies:
python -m venv venv
source venv/bin/activate  # on Windows: venv\Scripts\activate
pip install -r requirements.txt
Then run the main Python script:
python download_all_org.py
NOTE: To download datasets for a different organization, update the org_id parameter passed to the function in the script.
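Editing the file is the documented way to switch organizations. As a hypothetical alternative, the ID could be read from the command line with argparse from the standard library. The sketch below is not part of the current script; the argument names org_id and --output-dir are assumptions chosen for illustration.

```python
import argparse


def parse_args(argv=None):
    """Read the organization ID and output directory from the command line."""
    parser = argparse.ArgumentParser(
        description="Download all datasets for one organization."
    )
    parser.add_argument("org_id", help="organization ID on the Open Data portal")
    parser.add_argument(
        "--output-dir", default="opendata", help="directory where files are saved"
    )
    return parser.parse_args(argv)


# Parse an explicit argument list here so the sketch runs without a shell.
args = parse_args(["example-org-id", "--output-dir", "data"])
print(args.org_id, args.output_dir)
```

With something like this in place, the parsed org_id would be passed on to the download function instead of a value hard-coded in the file.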
Release Plan / To Do
- Create a set of functions to cover the entire API, including:
  - Create a function to get the list of organizations
  - Create a function to get the list of datasets for an organization
  - Create a function to get the list of resources for a dataset
  - Create a function to download a resource
  - Create a function to check the status of a download
- Create a set of unit tests for the functions
- Create a set of examples for the functions
- Create a set of documentation for the functions
- Publish the repository as a PyPI package
Contribution
The contribution process is as follows:
- Clone the repository and create a new branch
- Make your changes, following the coding style guidelines
- Create a pull request with a detailed description of your changes
- Wait for your pull request to be reviewed and approved
- Once approved, your changes will be merged and available in the main branch
When contributing to this repository, please first discuss the change you wish to make via issue, email, or any other method with the owners of this repository before making a change.
Please note that we have a code of conduct; follow it in all your interactions with the project.
File details
Details for the file open_ksa-0.1.1.dev2.tar.gz.
File metadata
- Download URL: open_ksa-0.1.1.dev2.tar.gz
- Upload date:
- Size: 7.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.1.1 CPython/3.12.7
File hashes
Algorithm | Hash digest
---|---
SHA256 | e5804204ba68cfc21b5209634ce5f6e561d251f78d86b313446707b7a6ea2956
MD5 | 03c85c1e5cf4628f7a2c072ec666b0e6
BLAKE2b-256 | 683c51cbb36714c240a01d505f5d91ae416304959abc3fbfcc7bd040d5a34345
File details
Details for the file open_ksa-0.1.1.dev2-py3-none-any.whl.
File metadata
- Download URL: open_ksa-0.1.1.dev2-py3-none-any.whl
- Upload date:
- Size: 8.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.1.1 CPython/3.12.7
File hashes
Algorithm | Hash digest
---|---
SHA256 | 9c37efb73765f479f8e832cacce5d58ba970fbdb0f77da8b8d2babec933c6fb9
MD5 | 2ec1f41afc4459ef8e8eeaa8dc399801
BLAKE2b-256 | 04afa554b6677d96c461cde8127727d87d702bc5ecd9f0bb316c296e491e5e1f
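A downloaded distribution file can be checked against the SHA256 digests listed above. The following is a minimal sketch using hashlib from the standard library; the helper names sha256_of_file and matches_published_digest are hypothetical and are not part of open_ksa or of any packaging tool.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, after downloading the source distribution, matches_published_digest("open_ksa-0.1.1.dev2.tar.gz", "e5804204ba68cfc21b5209634ce5f6e561d251f78d86b313446707b7a6ea2956") should return True if the file is intact.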