A Python package for density-adaptive DBSCAN clustering
AdaptiveDBSCAN
This is a normalized form of the DBSCAN algorithm based on a varying number of neighbours. The algorithm is useful when your data has varying density patterns. For more information about the algorithm, please refer to the paper.
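The core idea, an epsilon neighbourhood that adapts to local density, can be sketched in a few lines. This is an illustrative toy under assumed scaling, not the package's actual implementation, and `adaptive_eps` is a hypothetical helper name:

```python
import numpy as np

def adaptive_eps(base_eps, local_density, min_density=1.0):
    """Illustrative only: shrink the neighbourhood radius where the local
    density is high and widen it where points are sparse."""
    return base_eps / np.sqrt(max(local_density, min_density))

# Dense region -> smaller radius; sparse region -> larger radius
r_dense = adaptive_eps(10.0, 100.0)   # 10 / sqrt(100) = 1.0
r_sparse = adaptive_eps(10.0, 4.0)    # 10 / sqrt(4)   = 5.0
```

With a fixed-eps DBSCAN, the sparse region above would either fragment into noise or force an eps so large that the dense region over-merges; varying the radius avoids that trade-off.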
Installation
For the best performance, it is recommended to create a new environment and then install the package:
conda create -n dadbscan python
To install the package, you can use pip:
pip install dadbscan
Getting Started
After installing the package, you can import the modules as follows:
from dadbscan.density import EQ_Density
from dadbscan.clustering import dbscan
------------------------------------------------------------------------------------
Phase1.
The first import creates the density map, and the second applies the Density-Adaptive DBSCAN algorithm. With the value of N defined and your database available as a CSV file, you can run the density algorithm.
Initiating the EQ_Density class:
The new version of Adaptive DBSCAN adds a filtering option that can be used as below:
filters = {"col": ["num_picks", "Lat", "Lat", "Lon", "Lon"],  # columns to filter on
           "op": ['>', '>=', '<=', '>=', '<='],  # comparison operators as strings, e.g. >, =, >=, !=
           "val": [3, 55, 70, -105, -70]}  # values to compare against; can be strings or numbers
N = 65 # number of cells like NxN
density = EQ_Density(N, data_file='YOUR_FILE_PATH', min_year=1900, max_year=2050, min_mag=1, max_mag=9, filters=filters, map_extenion_value=0.1)  # map_extenion_value extends the map frame beyond the data extent
! Remember to configure the catalogue filters (min_year, max_year, and the rest) appropriately. Setting these values correctly for your dataset is crucial; otherwise your catalogue may be partially or completely filtered out.
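The filter triples can be pictured as column/operator/value comparisons applied in sequence. The pandas sketch below is an assumption about the semantics, not the package's code; `apply_filters` and `OPS` are hypothetical names:

```python
import operator as op

import pandas as pd

# Map operator strings to comparison functions (illustrative only).
OPS = {">": op.gt, ">=": op.ge, "<": op.lt, "<=": op.le, "==": op.eq, "!=": op.ne}

def apply_filters(df, filters):
    """Keep only rows satisfying every (col, op, val) triple."""
    for col, o, val in zip(filters["col"], filters["op"], filters["val"]):
        df = df[OPS[o](df[col], val)]
    return df

cat = pd.DataFrame({"num_picks": [2, 5], "Lat": [60.0, 56.0]})
demo_filters = {"col": ["num_picks", "Lat"], "op": [">", ">="], "val": [3, 55]}
apply_filters(cat, demo_filters)  # keeps only the second row
```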
To test the program, you can download the test file from the GitHub repo and use decl_cat.csv as a database.
YOUR_FILE_PATH = 'decl_cat.csv'
! Note that your dataset must have a header like the one below (order is not important, but names are case-sensitive): Lat, Lon, (Depth, Year, Month, Mw). If you have more columns in your dataset, you do NOT need to remove them. Lat and Lon are essential.
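A quick pre-flight check of the header can save a failed run. This is a generic pandas sketch, not part of the dadbscan API; `check_catalogue` is a hypothetical helper:

```python
import pandas as pd

REQUIRED = {"Lat", "Lon"}  # mandatory and case-sensitive

def check_catalogue(path):
    """Raise early if the catalogue is missing the mandatory columns.

    Reads only the header row (nrows=0), so it is cheap on large files.
    """
    cols = set(pd.read_csv(path, nrows=0).columns)
    missing = REQUIRED - cols
    if missing:
        raise ValueError(f"catalogue is missing columns: {sorted(missing)}")
    return cols
```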
running calc_density method:
heat_matrix = density.calc_density(minimum_density=10) #setting the background value by minimum_density
In the command above, by adding minimum_density = ..., you can define the threshold for the minimum value of the density for each cell. The default value is 10.
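Conceptually, the threshold acts as a floor on each cell's background density. The NumPy sketch below illustrates that idea; the exact behaviour inside calc_density may differ:

```python
import numpy as np

# Illustrative only: cells below minimum_density are raised to the floor value.
heat = np.array([[2.0, 15.0],
                 [8.0, 40.0]])
minimum_density = 10
floored = np.maximum(heat, minimum_density)
# floored -> [[10., 15.], [10., 40.]]
```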
plotting the density map:
density.plot_density()
A useful feature is smoothing the density map, which can be done with the following method:
smoothed_heat_matrix = density.cell_smoother(apply_smooth=True)
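A plausible reading of cell smoothing is a neighbourhood average. The sketch below shows one such scheme; the package's cell_smoother may use different weights or window sizes, so treat `smooth_cell` as a hypothetical illustration:

```python
import numpy as np

def smooth_cell(heat, i, j):
    """Average a cell with its available neighbours, clipping the 3x3
    window at the matrix edges (a common smoothing scheme)."""
    i0, i1 = max(i - 1, 0), min(i + 2, heat.shape[0])
    j0, j1 = max(j - 1, 0), min(j + 2, heat.shape[1])
    return heat[i0:i1, j0:j1].mean()

heat = np.array([[10.0, 10.0],
                 [10.0, 50.0]])
smooth_cell(heat, 1, 1)  # mean of all four cells = 20.0
```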
! All the matrices are saved to the 'Results' folder in two formats, PNG and CSV.
------------------------------------------------------------------------------------
Phase2.
Now that you have the density map, you can run the Density-Adaptive DBSCAN algorithm. To do so, you need to define the following parameters:
radius = density.radius
density_file_name = "Results/den_decl_cat__65_smooth.csv"
! be careful to correctly name the density_file_name.
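Judging by the example above, the file name appears to follow the pattern Results/den_&lt;catalogue stem&gt;__&lt;N&gt;_smooth.csv. The helper below encodes that guess; `density_file_name` is a hypothetical convenience function, so verify against the actual file in your Results folder:

```python
from pathlib import Path

def density_file_name(catalogue_path, n, smoothed=True):
    """Assumed naming pattern for the density CSV written in Phase 1."""
    stem = Path(catalogue_path).stem          # 'decl_cat.csv' -> 'decl_cat'
    suffix = "_smooth" if smoothed else ""
    return f"Results/den_{stem}__{n}{suffix}.csv"

density_file_name("decl_cat.csv", 65)  # 'Results/den_decl_cat__65_smooth.csv'
```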
As shown above, the radius can be derived from the density class. Now it is time to initiate the dbscan class and run the algorithm:
clustering = dbscan(radius, density_file_name)
The step below can take a few minutes to complete...
final = clustering.clustering()
clustering.plot_clusters()
If you have a shapefile to plot in the background, you can pass it here:
clustering.plot_clusters(shape_file_address="data/ShapeFiles/World_Countries_Generalized.shp")
You can finally save the clustering results to a file with the command below:
final.to_csv("Results/R__final.csv")
When plotting the clustered data, you have some options:
plot_clusters(self, **kwargs):
"""
**kwargs:
cmap_shp: str, default="grey"
The colormap to use for the shape file in the background
cmap_scatter: str, default="turbo"
The colormap to use for the scatter plot
shp_linewidth: float, default=2
The linewidth of the shape file
save_fig: bool, default=False
Whether to save the figure or not, if so, it will be saved in the ExampleData folder
save_fig_format: str, default="pdf"
The format to save the figure in
shape_file_address: str, default=False
The address of the shape file to plot in the background, you can use the World_Countries_Generalized.shp file in the ShapeFiles folder.
shape_file_address="ShapeFiles/World_Countries_Generalized.shp"
"""
Reference
Sina Sabermahani, Andrew W. Frederiksen (2023), Improved Earthquake Clustering Using a Density-Adaptive DBSCAN Algorithm: An Example from Iran. Seismological Research Letters, doi: https://doi.org/10.1785/0220220305
License
This project is licensed under the MIT License - see the MIT License file for details.