FluvialGen
A Python package for generating synthetic river networks and datasets.
Installation
You can install FluvialGen using pip:
pip install fluvialgen
Or install from source:
git clone https://github.com/joseenriqueruiznavarro/FluvialGen.git
cd FluvialGen
pip install -e .
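To confirm that either install route worked, one option is to query the installed distribution's metadata from Python. This is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of FluvialGen.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string if FluvialGen is installed, otherwise None.
print(installed_version("fluvialgen"))
```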
Requirements
- Python >= 3.8
- NumPy
- Pandas
- SciPy
- Matplotlib
- GeoPandas
- Shapely
- Rasterio
- tqdm
Integration with River Models
MovingWindowBatcher
This class provides a way to process data in overlapping windows with batching support:
from river import compose, linear_model, preprocessing, optim, metrics
from fluvialgen.movingwindow_generator import MovingWindowBatcher
from river import datasets
# Create a River pipeline
model = compose.Select('clouds', 'humidity', 'pressure', 'temperature', 'wind')
model |= preprocessing.StandardScaler()
model |= linear_model.LinearRegression(optimizer=optim.SGD(0.001))
# Initialize metrics
metric = metrics.MAE()
# Create the dataset and batcher
dataset = datasets.Bikes()
batcher = MovingWindowBatcher(
    dataset=dataset,
    instance_size=2,  # Size of each window
    batch_size=2,  # Number of instances per batch
    n_instances=1000
)
# Train the model
try:
    # Process batches and train the model
    for X, y in batcher:
        # Train on each instance in the batch
        for i in range(len(X)):
            x = X.iloc[i]
            target = y.iloc[i]
            model.learn_one(x, target)
        # Make predictions and update metrics
        for i in range(len(X)):
            x = X.iloc[i]
            target = y.iloc[i]
            y_pred = model.predict_one(x)
            metric.update(target, y_pred)
    print(f"Final MAE: {metric}")
finally:
    # Clean up
    batcher.stop()
Using CSV files
You can stream CSV files directly with the provided CSV generators, which inherit from the same base class as the other generators.
CSVDatasetGenerator
Basic CSV streaming with timing control:
from fluvialgen import CSVDatasetGenerator
# Stream data from CSV
csv_stream = CSVDatasetGenerator(
    filepath="data.csv",
    target_column="value",
    feature_columns=["moment", "c1", "c2"],
    parse_dates=["moment"],
    stream_period=1000,  # 1 second between elements
    n_instances=1000
)
try:
    for x, y in csv_stream:
        print(f"Features: {x}, Target: {y}")
finally:
    csv_stream.stop()
CSVPastForecastBatcher
Create past-forecast windows directly from CSV:
from fluvialgen import CSVPastForecastBatcher
# Create past-forecast batches from CSV
batcher = CSVPastForecastBatcher(
    filepath="data.csv",
    target_column="value",
    feature_columns=["moment", "c1", "c2"],
    parse_dates=["moment"],
    past_size=3,  # Number of past instances to include
    forecast_size=1,  # Use data 1 position ahead of past window
    n_instances=1000
)
try:
    for X_past, y_forecast in batcher:
        print(f"Past data shape: {X_past.shape}")
        print(f"Forecast value: {y_forecast}")
finally:
    batcher.stop()
PastForecastBatcher
This class pairs a window of past feature rows with a single target value taken at a forecast position:
from river import compose, linear_model, preprocessing, optim, metrics
from fluvialgen.past_forecast_batcher import PastForecastBatcher
from river import datasets
# Create a River pipeline
model = compose.Select('clouds', 'humidity', 'pressure', 'temperature', 'wind')
model |= preprocessing.StandardScaler()
model |= linear_model.LinearRegression(optimizer=optim.SGD(0.001))
# Initialize metrics
metric = metrics.MAE()
# Create the dataset and batcher
dataset = datasets.Bikes()
batcher = PastForecastBatcher(
    dataset=dataset,
    past_size=3,  # Number of past instances to include
    forecast_size=1,  # Use data 1 position ahead of past window
    n_instances=1000
)
# Evaluate the model on each past/forecast pair
try:
    for X_past, y_forecast in batcher:
        # Iterate over the past data
        for i in range(len(X_past)):
            x = X_past.iloc[i]
            # Note: y_forecast is a single value, not a Series, so there is
            # no per-row target to pass to model.learn_one here.
            # You would need to use your own past y values or another data source
        # Make prediction for the forecast position
        forecast_features = X_past.iloc[-1]  # Use last feature vector for prediction
        y_pred = model.predict_one(forecast_features)
        metric.update(y_forecast, y_pred)
    print(f"Final MAE: {metric}")
finally:
    # Clean up
    batcher.stop()
Data Structure
MovingWindowBatcher
For each batch, MovingWindowBatcher returns:
- X: DataFrame with all instances in the batch
- y: Series with all targets in the batch
For example, with instance_size=2 and batch_size=2:
- First batch:
  - X = DataFrame with [x1, x2, x2, x3]
  - y = Series with [y1, y2, y2, y3]
- Second batch:
  - X = DataFrame with [x2, x3, x3, x4]
  - y = Series with [y2, y3, y3, y4]
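The batch layout above can be reproduced with plain lists. The following is a minimal sketch, not FluvialGen's implementation: the function name `moving_window_batches` is hypothetical, and the assumption that both windows and batches advance by one position is inferred only from the example above.

```python
def moving_window_batches(items, instance_size, batch_size):
    """Yield flattened batches of overlapping windows (illustrative only)."""
    # Every contiguous window of `instance_size` items, sliding by one.
    windows = [
        items[i:i + instance_size]
        for i in range(len(items) - instance_size + 1)
    ]
    # Every run of `batch_size` consecutive windows, also sliding by one
    # (an assumption), concatenated into a single flat list.
    for j in range(len(windows) - batch_size + 1):
        yield [item for window in windows[j:j + batch_size] for item in window]


xs = ["x1", "x2", "x3", "x4"]
batches = list(moving_window_batches(xs, instance_size=2, batch_size=2))
print(batches[0])  # ['x1', 'x2', 'x2', 'x3']
print(batches[1])  # ['x2', 'x3', 'x3', 'x4']
```

Applying the same function to the targets gives the matching y layout, since X and y are windowed identically in the example.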
PastForecastBatcher & CSVPastForecastBatcher
For each instance, both PastForecastBatcher and CSVPastForecastBatcher return:
- X_past: DataFrame with past feature data
- y_forecast: Single value representing the target at the forecast position
In the examples below, the forecast position is the offset past_size + forecast_size from the start of the window, counting from 0, so forecast_size=0 selects the target immediately after the past window.
For example, with past_size=3 and forecast_size=0:
- First instance:
  - X_past = DataFrame with [x1, x2, x3]
  - y_forecast = y4 (value at past_size + forecast_size position)
- Second instance:
  - X_past = DataFrame with [x2, x3, x4]
  - y_forecast = y5 (value at past_size + forecast_size position)
With past_size=3 and forecast_size=1:
- First instance:
  - X_past = DataFrame with [x1, x2, x3]
  - y_forecast = y5 (value at past_size + forecast_size position)
- Second instance:
  - X_past = DataFrame with [x2, x3, x4]
  - y_forecast = y6 (value at past_size + forecast_size position)
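The indexing in these examples can be checked with a few lines of plain Python. This is a sketch under the assumption that the window slides by one position per instance, as the examples suggest; `past_forecast_pairs` is a hypothetical helper, not part of the FluvialGen API.

```python
def past_forecast_pairs(xs, ys, past_size, forecast_size):
    """Yield (past window, forecast target) pairs (illustrative only)."""
    # A window may start wherever its forecast index still falls inside ys.
    last_start = len(xs) - past_size - forecast_size
    for start in range(last_start):
        x_past = xs[start:start + past_size]
        # Target at offset past_size + forecast_size from the window start.
        y_forecast = ys[start + past_size + forecast_size]
        yield x_past, y_forecast


xs = [f"x{i}" for i in range(1, 7)]
ys = [f"y{i}" for i in range(1, 7)]
pairs0 = list(past_forecast_pairs(xs, ys, past_size=3, forecast_size=0))
pairs1 = list(past_forecast_pairs(xs, ys, past_size=3, forecast_size=1))
print(pairs0[0])  # (['x1', 'x2', 'x3'], 'y4')
print(pairs1[0])  # (['x1', 'x2', 'x3'], 'y5')
```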
CSVDatasetGenerator
For each element, CSVDatasetGenerator returns:
- x: Dictionary with feature values
- y: Single target value
The generator automatically handles:
- CSV parsing with pandas
- Date parsing for specified columns
- Feature selection (automatic or manual)
- Timing control via BaseGenerator
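To make the four responsibilities listed above concrete, here is a rough standard-library stand-in that yields the same kind of (x, y) pairs. It is a sketch, not FluvialGen's code: `stream_csv` is a hypothetical function, it uses the `csv` module rather than pandas to stay dependency-free, and it assumes `stream_period` is in milliseconds, as the usage comment "1 second between elements" for a value of 1000 suggests.

```python
import csv
import io
import time
from datetime import datetime


def stream_csv(handle, target_column, feature_columns=None,
               parse_dates=(), stream_period=0, n_instances=None):
    """Yield (x, y) pairs from CSV text; a stand-in, not FluvialGen's code."""
    reader = csv.DictReader(handle)
    for count, row in enumerate(reader):
        if n_instances is not None and count >= n_instances:
            break
        # Automatic feature selection: every column except the target.
        columns = feature_columns or [c for c in row if c != target_column]
        x = {}
        for column in columns:
            value = row[column]
            # Date columns become datetime objects, the rest become floats.
            x[column] = (datetime.fromisoformat(value)
                         if column in parse_dates else float(value))
        yield x, float(row[target_column])
        # Timing control: pause `stream_period` milliseconds between elements.
        if stream_period:
            time.sleep(stream_period / 1000)


sample = io.StringIO(
    "moment,c1,c2,value\n"
    "2024-01-01 00:00:00,1.0,2.0,10.0\n"
    "2024-01-01 01:00:00,3.0,4.0,20.0\n"
)
rows = list(stream_csv(sample, target_column="value", parse_dates=["moment"]))
for x, y in rows:
    print(x, y)
```

Each x is a plain dictionary keyed by column name, which is the shape River's `learn_one` and `predict_one` methods accept.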