Algerian Forest Fire Prediction Model
This repository contains a machine learning project focused on predicting forest fire occurrences in Algeria using the "Algerian Forest Fires Dataset" from Kaggle. The project aims to develop a robust model that can effectively identify potential fire risks, thereby supporting proactive measures for prevention, mitigation, and resource allocation.
Project Overview
Forest fires pose a significant threat to the environment and human safety. This project aims to develop a machine learning model that accurately predicts whether a forest fire will occur, using input features derived from environmental and weather data. This is a binary classification problem: the model must learn the patterns that distinguish instances where a fire occurred ("fire") from instances where none occurred ("not fire").
Project Goals
- Data Acquisition and Preprocessing:
- Download and prepare the "Algerian Forest Fires Dataset" for analysis.
- Cleanse the data to handle missing values, inconsistencies, and outliers.
- Model Development:
- Train a machine learning model capable of predicting whether a forest fire will occur based on environmental and weather factors.
- Explore and compare different machine learning algorithms to identify the most suitable model.
- Tune hyperparameters to optimize the model's performance.
- Model Evaluation:
- Evaluate the trained model using relevant metrics (e.g., accuracy, precision, recall, F1-score, ROC AUC).
- Analyze the model's predictions and identify any potential areas for improvement.
- Pipeline Creation:
- Develop a streamlined pipeline to automate the entire process, from data ingestion to model training and evaluation, ensuring reproducibility.
- Deployment (Future Consideration):
- Explore potential deployment options to make the model readily available for operational use, such as a web application, API, or integration with existing fire management systems.
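The threshold-based metrics named under Model Evaluation (accuracy, precision, recall, F1-score) can all be derived from the four confusion-matrix counts. The following is a minimal sketch in plain Python, not the repository's own evaluation code; the names `BinaryMetrics` and `binary_metrics` are hypothetical, and the labels in the usage example are toy values. ROC AUC is left out because it needs predicted scores rather than hard labels.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("need at least one sample")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator is zero (no predicted or actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return BinaryMetrics(accuracy, precision, recall, f1)


if __name__ == "__main__":
    # Toy labels for illustration only: 1 = fire, 0 = not fire.
    y_true = [1, 1, 1, 0, 0, 1, 0, 0]
    y_pred = [1, 1, 0, 0, 1, 1, 0, 0]
    print(binary_metrics(y_true, y_pred))
```

For fire prediction, recall on the "fire" class is often the metric to watch most closely, since a missed fire is usually costlier than a false alarm.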
Getting Started
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.
Prerequisites
- Python 3.9 (or compatible)
- Conda (optional, but recommended)
Installation
- Clone the repository:
  $ git clone https://github.com/your-username/algerian-forest-fire-prediction.git
- Create a conda environment (optional):
  $ conda create -n forest-fire-env python=3.9
  This creates a Conda environment named forest-fire-env (you can choose your own name) with Python 3.9. Adjust the Python version as needed.
- Activate the environment:
  $ conda activate forest-fire-env
- Install the required dependencies:
  $ pip install -r requirements.txt
- Set up your command-line prompt for better readability (optional):
  $ export PS1="\[\033[01;32m\]\u@\h:\w\n\[\033[00m\]\$ "
Running the Project
- Set up the Kaggle API:
  - Download kaggle.json: Download your Kaggle API credentials (username and API key) from your Kaggle account and put the file in the project directory. The file has this form:
    { "username": "your_username", "key": "your_api_key" }
  - Configure Kaggle: Run these commands in your terminal to copy the kaggle.json file to the directory where the Kaggle API looks for it:
    $ mkdir -p ~/.kaggle
    $ cp kaggle.json ~/.kaggle/kaggle.json
  - Set permissions: Make sure the file is accessible only to you:
    $ chmod 600 ~/.kaggle/kaggle.json
- Configure DVC with remote storage:
  - Create a Google Drive folder: In Google Drive, create a new folder to store the project's data and model artifacts (e.g., "Algerian Forest Fire Project").
  - Obtain the folder ID: Open the new folder and copy its unique ID from the URL (the part after folders/, or after id= in older-style links).
  - Set up the DVC remote:
    $ dvc remote add -d gdrive gdrive://<your_drive_key>
    $ dvc remote default gdrive
    Replace <your_drive_key> with the Google Drive folder ID you obtained in the previous step.
- Run the project:
  $ python template.py
  $ dvc repro
These commands will execute the project's pipeline, including:
- Data Ingestion: Download the dataset directly from Kaggle using the Kaggle API you just configured.
- Preprocessing: Clean, transform, and prepare the data for model training.
- Model Training: Train a machine learning model based on the chosen algorithm and hyperparameters.
- Model Evaluation: Evaluate the trained model's performance using various metrics.
- Artifact Saving: Save the trained model, evaluation results, and other important artifacts for future use or analysis.
Together, these steps cover the full workflow, from configuring the Kaggle API to producing a trained model and its evaluation results.
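Data ingestion depends on the Kaggle credentials configured earlier, so it can help to sanity-check them before running the pipeline. The sketch below is an assumption-laden helper, not part of the repository: `check_kaggle_credentials` is a hypothetical name, and it only verifies what the setup steps describe (the file exists at `~/.kaggle/kaggle.json`, contains `username` and `key`, and is not readable by group or others on POSIX systems).

```python
import json
import os
import stat
from pathlib import Path


def check_kaggle_credentials(path=None):
    """Return a list of problems found with a kaggle.json file (empty if none)."""
    path = Path(path) if path is not None else Path.home() / ".kaggle" / "kaggle.json"
    if not path.is_file():
        return [f"{path} does not exist"]

    try:
        creds = json.loads(path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return [f"{path} is not valid JSON: {exc}"]
    if not isinstance(creds, dict):
        return [f"{path} should contain a JSON object"]

    problems = []
    for field in ("username", "key"):
        if not creds.get(field):
            problems.append(f"missing or empty field: {field}")

    # chmod 600 means no permission bits are set for group or others.
    # The check is skipped on non-POSIX systems, where these bits are not meaningful.
    if os.name == "posix":
        mode = stat.S_IMODE(path.stat().st_mode)
        if mode & (stat.S_IRWXG | stat.S_IRWXO):
            problems.append(f"permissions are {oct(mode)}; expected 0o600")
    return problems


if __name__ == "__main__":
    issues = check_kaggle_credentials()
    print("credentials look fine" if not issues else "\n".join(issues))
```

The function returns messages instead of raising, so a caller can report every problem at once.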
Dataset
This dataset contains information about forest fires in Algeria, focusing on two specific regions:
- Bejaia region: Located in the northeast of Algeria.
- Sidi Bel-abbes region: Located in the northwest of Algeria.
The dataset includes data collected between June 2012 and September 2012.
Key Features:
- Instances: 244 (122 for each region)
- Attributes: 11 attributes (features)
- Output Attribute: 1 output attribute (class)
- Classes:
- Fire: 138 instances
- Not fire: 106 instances
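The class counts above imply a mild imbalance, which sets the accuracy a trivial "always predict the majority class" model would reach. The short sketch below works this out from the stated counts; `majority_baseline` is a hypothetical helper name used only for illustration.

```python
def majority_baseline(counts):
    """Return (majority label, accuracy of always predicting it, class shares)."""
    total = sum(counts.values())
    shares = {label: n / total for label, n in counts.items()}
    majority = max(counts, key=counts.get)
    return majority, counts[majority] / total, shares


if __name__ == "__main__":
    # Class counts as stated in the dataset summary.
    counts = {"fire": 138, "not fire": 106}
    label, accuracy, shares = majority_baseline(counts)
    print(f"majority class: {label}, baseline accuracy: {accuracy:.3f}")
    print({k: round(v, 3) for k, v in shares.items()})
```

Any trained model should be judged against this baseline (about 0.566 accuracy) rather than against 0.5.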
Attributes:
- Date: The date of the observation (DD/MM/YYYY).
- Temp: Temperature in Celsius.
- RH: Relative Humidity in percentage.
- Ws: Wind speed in km/h.
- Rain: Total amount of rainfall in mm.
- FFMC: Fine Fuel Moisture Code, an index of the moisture content of surface litter and other fine fuels.
- DMC: Duff Moisture Code, an index of the moisture content of loosely compacted, decomposing organic matter (duff).
- DC: Drought Code, an index of the moisture content of deep, compact organic layers, reflecting longer-term drought.
- ISI: Initial Spread Index, an index of the expected rate of fire spread.
- BUI: Buildup Index, an index of the total amount of fuel available for combustion.
- FWI: Fire Weather Index, an index of overall fire intensity and danger.
- Classes: The output class, "fire" or "not fire" (typically encoded as 1 and 0 for modeling).
Attributes Description:
Feature | Description | Data Type |
---|---|---|
Classes | Fire or not fire (target variable) | Categorical |
month | Month of the year (1-12) | Integer |
RH | Relative humidity (%) | Integer |
Temperature | Temperature (Celsius) | Integer |
Ws | Wind speed (km/h) | Integer |
year | Year of the observation | Integer |
DC | Drought Code Index | Float |
Rain | Total amount of precipitation (mm) | Float |
DMC | Duff Moisture Code | Float |
FFMC | Fine Fuel Moisture Code | Float |
BUI | Buildup Index | Float |
ISI | Initial Spread Index | Float |
FWI | Fire Weather Index | Float |
day | Day of the month (1-31) | Integer |
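The attribute list gives a single Date field in DD/MM/YYYY form, while the table lists separate day, month, and year columns and a categorical target. One plausible way to bridge the two is sketched below; how the repository actually does this is not shown in this README, and `split_date` and `encode_class` are hypothetical names. The label normalization assumes the raw labels may carry stray whitespace or inconsistent casing.

```python
from datetime import datetime


def split_date(date_str):
    """Split a DD/MM/YYYY string into day, month and year integers."""
    parsed = datetime.strptime(date_str.strip(), "%d/%m/%Y")
    return {"day": parsed.day, "month": parsed.month, "year": parsed.year}


def encode_class(label):
    """Map 'fire' to 1 and 'not fire' to 0, ignoring case and extra whitespace."""
    normalized = " ".join(label.split()).lower()
    mapping = {"fire": 1, "not fire": 0}
    if normalized not in mapping:
        raise ValueError(f"unexpected class label: {label!r}")
    return mapping[normalized]


if __name__ == "__main__":
    # Made-up example values, not rows copied from the dataset.
    print(split_date("01/06/2012"))
    print(encode_class("not fire   "), encode_class("Fire"))
```

Raising on unknown labels, rather than silently mapping them to 0, makes data-quality problems visible during preprocessing.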
Project Structure
.
├── Notebooks
│   └── EDA_and_Feature_Engineering.ipynb
├── data
│   ├── processed
│   │   ├── X_test.npy
│   │   ├── X_train.npy
│   │   ├── cleaned_algerian_forest_fires_dataset.csv
│   │   ├── y_test.npy
│   │   └── y_train.npy
│   └── raw
│       └── Algerian_forest_fires_dataset.csv
├── src
│   ├── __init__.py
│   ├── components
│   │   ├── __init__.py
│   │   ├── data_factory.py
│   │   ├── data_ingestion.py
│   │   └── model_training.py
│   ├── exception.py
│   ├── logger.py
│   ├── pipeline
│   │   ├── __init__.py
│   │   ├── inference_pipeline.py
│   │   └── train_pipeline.py
│   └── utils.py
├── tests
│   ├── __init__.py
│   ├── integration
│   │   ├── __init__.py
│   │   └── init_test.py
│   └── unit
│       ├── __init__.py
│       └── unit_test.py
├── template.py
├── dvc.lock
├── dvc.yaml
├── params.yaml
├── requirements.txt
├── setup.py
└── README.md
Tools
Contributions
Contributions to this project are welcome! Feel free to:
- Report issues: If you encounter any bugs or have suggestions for improvement, please open an issue on the GitHub repository.
- Submit pull requests: If you'd like to contribute code, fork the repository, make your changes, and submit a pull request.
License
This project is licensed under the MIT License. See the LICENSE file for details.
Acknowledgments