A Python library for analyzing GPS data from fishing grounds.
Project description
Crab-Lib
Crab-Lib is a Python library designed to simplify the processing, analysis, and visualization of GPS data, particularly for fishing grounds in the Gulf of Mexico. It includes tools for cleaning data, calculating distances, detecting clusters, and generating insightful visualizations. Whether you're a fisheries scientist, data analyst, or GIS enthusiast, Crab-Lib helps you make sense of complex geospatial datasets.
Features
- Data I/O:
  - Load and save GPS data in CSV format.
  - Validate and handle missing or malformed data.
- Data Preprocessing:
  - Remove duplicates and invalid rows.
  - Filter data based on keywords in comments.
  - Standardize comment text for consistency.
- Geospatial Analysis:
  - Calculate great-circle distances using the Haversine formula.
  - Identify clusters of GPS points based on proximity.
  - Summarize datasets with averages and bounding boxes.
- Visualization:
  - Plot points, clusters, and density heatmaps.
  - Create interactive and informative visualizations.
- Utilities:
  - Convert coordinates from DMS to decimal degrees.
  - Calculate initial bearings between two GPS points.
  - Validate if a point lies within a bounding box.
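To make the great-circle distance feature concrete, here is a minimal standalone sketch of the Haversine formula. It is an illustration only, not Crab-Lib's own implementation: the function name `haversine_km`, the (latitude, longitude) argument order, and the 6371 km mean Earth radius are assumptions made for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    # Haversine of the central angle between the two points
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Distance between the two sample points used in the Utilities example
d = haversine_km(25.774, -80.19, 27.345, -82.567)
print(f"{d:.1f} km")
```

The formula treats the Earth as a sphere, so results can differ slightly from ellipsoidal (geodesic) distances.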
Table of Contents
- Installation
- Quick Start
- Detailed Examples
- API Reference
- Troubleshooting
- Contributing
- License
- Contact
Installation
Install via Source
- Clone the repository:
    git clone https://github.com/RedGloveProductions/crab-lib.git
    cd crab-lib
- Install the library:
    pip install -e .
Dependencies
Crab-Lib requires the following Python packages:
- matplotlib >= 3.0.0
For development, additional tools like pytest and black are recommended.
Quick Start
Here's a quick guide to get started with Crab-Lib:
Load Data:
    from crab_lib.io import load_csv
    data = load_csv("fishing_data.csv")
Clean Data:
    from crab_lib.preprocessing import clean_data
    cleaned_data = clean_data(data)
Analyze Data:
    from crab_lib.analysis import calculate_distances, find_clusters
    distances = calculate_distances(cleaned_data)
    clusters = find_clusters(cleaned_data, radius=100)
Visualize Data:
    from crab_lib.visualization import plot_points
    plot_points(cleaned_data)
Detailed Examples
Data I/O
Load and Save Data:
    from crab_lib.io import load_csv, save_csv
    # Load data from a CSV file
    data = load_csv("fishing_data.csv")
    # Save the data to a new file
    save_csv("cleaned_data.csv", data)
Preprocessing
Clean, Filter, and Standardize Data:
    from crab_lib.preprocessing import clean_data, filter_data, standardize_comments
    # Remove duplicates and invalid rows
    cleaned_data = clean_data(data)
    # Filter rows containing the keyword 'hotspot'
    filtered_data = filter_data(cleaned_data, "hotspot")
    # Standardize comments for consistency
    standardized_data = standardize_comments(filtered_data)
Analysis
Calculate Distances and Find Clusters:
    from crab_lib.analysis import calculate_distances, find_clusters, summarize_data
    # Calculate pairwise distances
    distances = calculate_distances(cleaned_data)
    # Identify clusters within a 50 km radius
    clusters = find_clusters(cleaned_data, radius=50)
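For readers curious how proximity-based clustering can work, the following is a rough sketch of one possible approach (single-linkage grouping by flood fill). How `find_clusters` is actually implemented is not documented here, so everything in this block is an assumption: the helper names `haversine_km` and `cluster_by_radius`, the use of (latitude, longitude) tuples, and the radius being in kilometers.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def cluster_by_radius(points, radius_km):
    """Group points so each one is within radius_km of at least one other
    member of its cluster. Returns sorted lists of indices into `points`."""
    unvisited = set(range(len(points)))
    result = []
    while unvisited:
        seed = unvisited.pop()
        members, frontier = [seed], [seed]
        while frontier:
            current = frontier.pop()
            # Collect every unvisited point close enough to the current one
            near = [i for i in unvisited
                    if haversine_km(points[current], points[i]) <= radius_km]
            for i in near:
                unvisited.remove(i)
            members.extend(near)
            frontier.extend(near)
        result.append(sorted(members))
    return sorted(result)


# Hypothetical sample points: two pairs roughly 300 km apart
points = [(25.774, -80.19), (25.80, -80.20), (27.345, -82.567), (27.35, -82.60)]
groups = cluster_by_radius(points, radius_km=50)
print(groups)  # [[0, 1], [2, 3]]
```

This sketch compares every pair of points, which is fine for small datasets but scales quadratically; a spatial index would be a natural next step for larger ones.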
    # Summarize the dataset
    summary = summarize_data(cleaned_data)
    print(summary)
Visualization
Visualize Data:
    from crab_lib.visualization import plot_points, plot_clusters, create_heatmap
    # Plot individual points
    plot_points(cleaned_data)
    # Plot clusters
    plot_clusters(clusters)
    # Create a heatmap of point density
    create_heatmap(cleaned_data)
Utilities
Convert Coordinates and Calculate Bearings:
    from crab_lib.utils import convert_coordinates, calculate_bearing, is_within_bounds
    # Convert DMS to decimal degrees (the inner double quote must be escaped)
    decimal_coord = convert_coordinates("25°46'26.5\"N")
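As an illustration of what a DMS-to-decimal conversion involves, here is a small self-contained sketch. It is not the code behind `convert_coordinates`: the function name `dms_to_decimal` and the accepted input pattern (degrees, optional minutes and seconds, then a hemisphere letter) are assumptions based on the single example string above.

```python
import re

# Assumed input shape: 25°46'26.5"N (minutes and seconds optional)
DMS_PATTERN = re.compile(
    r"""^\s*(?P<deg>\d+(?:\.\d+)?)\s*°\s*
        (?:(?P<min>\d+(?:\.\d+)?)\s*'\s*)?
        (?:(?P<sec>\d+(?:\.\d+)?)\s*"\s*)?
        (?P<hemi>[NSEW])\s*$""",
    re.VERBOSE | re.IGNORECASE,
)


def dms_to_decimal(dms):
    """Convert a degrees/minutes/seconds string to signed decimal degrees."""
    match = DMS_PATTERN.match(dms)
    if match is None:
        raise ValueError(f"Unrecognized DMS string: {dms!r}")
    degrees = float(match["deg"])
    minutes = float(match["min"] or 0)
    seconds = float(match["sec"] or 0)
    value = degrees + minutes / 60 + seconds / 3600
    # South and West are negative by convention
    return -value if match["hemi"].upper() in "SW" else value


print(round(dms_to_decimal("25°46'26.5\"N"), 6))  # 25.774028
```

Raising `ValueError` on unrecognized input mirrors the error type mentioned under Troubleshooting, though whether the library behaves this way for coordinates is not confirmed.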
    # Calculate bearing between two points
    bearing = calculate_bearing((25.774, -80.19), (27.345, -82.567))
    print("Bearing:", bearing)
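The initial bearing between two points can be sketched with the standard forward-azimuth formula on a sphere. This is a hedged illustration rather than the library's code: the name `initial_bearing`, the (latitude, longitude) tuple order, and the 0 to 360 degree output range (clockwise from north) are assumptions.

```python
import math


def initial_bearing(coord1, coord2):
    """Initial bearing in degrees (0-360, clockwise from north) from coord1
    to coord2, each given as (latitude, longitude) in decimal degrees."""
    lat1, lon1 = map(math.radians, coord1)
    lat2, lon2 = map(math.radians, coord2)
    dlon = lon2 - lon1
    # East and north components of the direction of travel
    east = math.sin(dlon) * math.cos(lat2)
    north = (math.cos(lat1) * math.sin(lat2)
             - math.sin(lat1) * math.cos(lat2) * math.cos(dlon))
    return math.degrees(math.atan2(east, north)) % 360


print(initial_bearing((0.0, 0.0), (1.0, 0.0)))   # due north
print(initial_bearing((0.0, 0.0), (0.0, -1.0)))  # due west
```

The bearing is "initial" because the heading along a great-circle path generally changes over the course of the route.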
    # Check if a point is within bounds
    bounds = (25.0, 26.0, -81.0, -79.0)
    print("Is within bounds:", is_within_bounds((25.774, -80.19), bounds))
API Reference
I/O Module
- load_csv(file_path: str) -> List[Dict[str, str]]: Load GPS data from a CSV file.
- save_csv(file_path: str, data: List[Dict[str, str]]) -> None: Save data to a CSV file.
Preprocessing Module
- clean_data(data: List[Dict[str, str]]) -> List[Dict[str, str]]: Remove duplicates and invalid rows.
- filter_data(data: List[Dict[str, str]], keyword: str) -> List[Dict[str, str]]: Filter rows by keyword.
- standardize_comments(data: List[Dict[str, str]]) -> List[Dict[str, str]]: Standardize comment text.
Analysis Module
- calculate_distances(data: List[Dict[str, float]]) -> List[Dict[str, float]]: Compute pairwise distances.
- find_clusters(data: List[Dict[str, float]], radius: float) -> List[List[Dict[str, float]]]: Detect clusters.
- summarize_data(data: List[Dict[str, float]]) -> Dict[str, float]: Summarize dataset metrics.
Visualization Module
- plot_points(data: List[Dict[str, float]]) -> None: Scatter plot of GPS points.
- plot_clusters(clusters: List[List[Dict[str, float]]]) -> None: Visualize clusters.
- create_heatmap(data: List[Dict[str, float]], bins: int) -> None: Generate a heatmap.
Utils Module
- convert_coordinates(dms: str) -> float: Convert DMS to decimal degrees.
- calculate_bearing(coord1: Tuple[float, float], coord2: Tuple[float, float]) -> float: Compute initial bearing.
- is_within_bounds(coord: Tuple[float, float], bounds: Tuple[float, float, float, float]) -> bool: Check if a point is in bounds.
Troubleshooting
- Issue: FileNotFoundError when loading a CSV.
  Solution: Verify the file path and ensure the file exists.
- Issue: ValueError when cleaning or processing data.
  Solution: Ensure your data matches the required format (x, y, comment).
Contributing
Contributions are welcome! To contribute:
- Fork the repository.
- Create a feature branch.
- Submit a pull request with a clear description of changes.
License
This project is licensed under the MIT License.
Contact
- Author: Your Name
- Email: your_email@example.com
- GitHub: RedGloveProductions
Project Links
- GitHub Repository
- Issue Tracker
Download files
Source Distribution
Built Distribution
File details
Details for the file crab_lib-1.0.0.tar.gz.
File metadata
- Download URL: crab_lib-1.0.0.tar.gz
- Upload date:
- Size: 17.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.9.20
File hashes
Algorithm | Hash digest
---|---
SHA256 | 35394ba3315d33898eea9b639129118233c672791e412151b7465d3fb68ef190
MD5 | 3ca6ae6f8bd0ffe94ba2841ab9f3b1b2
BLAKE2b-256 | 4ce4ad7e007459d66598ea0cbfc4693af439f88631c42653d114dbf4a5c216c4
File details
Details for the file crab_lib-1.0.0-py3-none-any.whl.
File metadata
- Download URL: crab_lib-1.0.0-py3-none-any.whl
- Upload date:
- Size: 15.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.9.20
File hashes
Algorithm | Hash digest
---|---
SHA256 | 49045d2c83103c37fcc5a439c53b0460d282a7922714c1406dd5a2888758b2d9
MD5 | 396bd0a3103ff40b23f42a4e5a105e1c
BLAKE2b-256 | 29c1caebda1576b17629fc704b198e49624b7c3c30c66389a0eb9db21c88f107