Skip to main content

A Python package for generating synthetic river networks and datasets

Project description

FluvialGen

A Python package for generating synthetic river networks and datasets.

Installation

You can install FluvialGen using pip:

pip install fluvialgen

Or install from source:

git clone https://github.com/joseenriqueruiznavarro/FluvialGen.git
cd FluvialGen
pip install -e .

Requirements

  • Python >= 3.8
  • NumPy
  • Pandas
  • SciPy
  • Matplotlib
  • GeoPandas
  • Shapely
  • Rasterio
  • tqdm

Integration with River Models

MovingWindowBatcher

This class provides a way to process data in overlapping windows with batching support:

from river import compose, linear_model, preprocessing, optim, metrics
from fluvialgen.movingwindow_generator import MovingWindowBatcher
from river import datasets

# Create a River pipeline
model = compose.Select('clouds', 'humidity', 'pressure', 'temperature', 'wind')
model |= preprocessing.StandardScaler()
model |= linear_model.LinearRegression(optimizer=optim.SGD(0.001))

# Initialize metrics
metric = metrics.MAE()

# Create the dataset and batcher
dataset = datasets.Bikes()
batcher = MovingWindowBatcher(
    dataset=dataset,
    instance_size=2,  # Size of each window
    batch_size=2,     # Number of instances per batch
    n_instances=1000
)

# Train the model
try:
    # Process batches and train the model
    for X, y in batcher:
        # Train on each instance in the batch
        for i in range(len(X)):
            x = X.iloc[i]
            target = y.iloc[i]
            model.learn_one(x, target)
            
        # Make predictions and update metrics
        for i in range(len(X)):
            x = X.iloc[i]
            target = y.iloc[i]
            y_pred = model.predict_one(x)
            metric.update(target, y_pred)
            
    print(f"Final MAE: {metric}")

finally:
    # Clean up
    batcher.stop()

PastForecastBatcher

This class provides a way to process data with past and forecast windows:

from river import compose, linear_model, preprocessing, optim, metrics
from fluvialgen.past_forecast_batcher import PastForecastBatcher
from river import datasets

# Create a River pipeline
model = compose.Select('clouds', 'humidity', 'pressure', 'temperature', 'wind')
model |= preprocessing.StandardScaler()
model |= linear_model.LinearRegression(optimizer=optim.SGD(0.001))

# Initialize metrics
metric = metrics.MAE()

# Create the dataset and batcher
dataset = datasets.Bikes()
batcher = PastForecastBatcher(
    dataset=dataset,
    past_size=3,      # Number of past instances to include
    forecast_size=1,  # Number of future instances to include
    n_instances=1000
)

# Train the model
try:
    # Process instances and train the model
    for X_past, y_past, current_x, current_y in batcher:
        # Train on past data
        for i in range(len(X_past)):
            x = X_past.iloc[i]
            target = y_past.iloc[i]
            model.learn_one(x, target)
            
        # Make prediction for current moment
        y_pred = model.predict_one(current_x)
        metric.update(current_y, y_pred)
            
    print(f"Final MAE: {metric}")

finally:
    # Clean up
    batcher.stop()

Data Structure

MovingWindowBatcher

For each batch, MovingWindowBatcher returns:

  • X: DataFrame with all instances in the batch
  • y: Series with all targets in the batch

For example, with instance_size=2 and batch_size=2:

  • First batch:
    • X = DataFrame with [x1,x2,x2,x3]
    • y = Series with [y1,y2,y2,y3]
  • Second batch:
    • X = DataFrame with [x2,x3,x3,x4]
    • y = Series with [y2,y3,y3,y4]

PastForecastBatcher

For each instance, PastForecastBatcher returns:

  • X_past: DataFrame with past data
  • y_past: Series with past targets
  • current_x: Dictionary with current moment data
  • current_y: Float with current moment target

For example, with past_size=3 and forecast_size=1:

  • First instance:
    • X_past = DataFrame with [x1,x2,x3]
    • y_past = Series with [y1,y2,y3]
    • current_x = x5
    • current_y = y5
  • Second instance:
    • X_past = DataFrame with [x2,x3,x4]
    • y_past = Series with [y2,y3,y4]
    • current_x = x6
    • current_y = y6

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

fluvialgen-0.5.0.tar.gz (8.8 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

fluvialgen-0.5.0-py3-none-any.whl (10.3 kB view details)

Uploaded Python 3

File details

Details for the file fluvialgen-0.5.0.tar.gz.

File metadata

  • Download URL: fluvialgen-0.5.0.tar.gz
  • Upload date:
  • Size: 8.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.2

File hashes

Hashes for fluvialgen-0.5.0.tar.gz
Algorithm Hash digest
SHA256 00b7cf6455e186441a2a481c605d2677a79d804778fbbe3d2d4dddd0cb60e2ba
MD5 344e12b70c87dd308b5f044a9d316cf3
BLAKE2b-256 22a0e10a59d86aae358aa9f5cf19cca553a16b5971a07ebf888591eba7cbff91

See more details on using hashes here.

File details

Details for the file fluvialgen-0.5.0-py3-none-any.whl.

File metadata

  • Download URL: fluvialgen-0.5.0-py3-none-any.whl
  • Upload date:
  • Size: 10.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.2

File hashes

Hashes for fluvialgen-0.5.0-py3-none-any.whl
Algorithm Hash digest
SHA256 1d4ba5b04d00e6fd66749d60d3ab76852cd5332e837bfa5e318fbd6edcaf7785
MD5 ad3f649ba5c9f582cddd876bcaf7d136
BLAKE2b-256 ef9bafc76151221f289cb476b3203adc52594995b457bd05aa6100c8e809745c

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page