TLPyTools
TransLink's Python Tools - A comprehensive toolkit for transportation modeling and forecasting developed by the TransLink Forecasting Team.
Overview
TLPyTools provides a suite of utilities and tools designed to support various aspects of transportation modeling workflows, from data management and processing to cloud synchronization and model orchestration. Built specifically for the TransLink Forecasting Team's modeling needs.
Installation
Note: TLPyTools requires Python 3.10 or higher. ActivitySim and PopulationSim are available only for development installations via uv sync --group activitysim.
Using uv (Recommended)
TLPyTools uses uv for fast and reliable dependency management:
# Install uv if you haven't already
pip install uv
# Clone and install the package
git clone https://github.com/TransLinkForecasting/tlpytools.git
cd tlpytools
# Install core package only
uv sync
# Install with ORCA orchestrator support
uv sync --extra orca
# Install with full development environment (includes GIS tools, visualization, etc.)
uv sync --extra dev
# Install multiple extras (common combinations)
uv sync --extra dev --extra orca
# For development with ActivitySim and PopulationSim (git dependencies available via uv)
# Note: ActivitySim/PopulationSim are only available for development, not through PyPI
uv sync --extra dev --extra orca --group activitysim
Using pip (Alternative)
You can still use pip for installation, but ActivitySim and PopulationSim are not available as extras due to PyPI restrictions on git dependencies:
# Core package only
pip install tlpytools
# With ORCA orchestrator support
pip install tlpytools[orca]
# With full development environment
pip install tlpytools[dev]
# Multiple extras
pip install tlpytools[dev,orca]
# Development installation (without ActivitySim - use uv for that)
git clone https://github.com/TransLinkForecasting/tlpytools.git
cd tlpytools
pip install -e .[dev,orca]
Core Modules
Data Management (tlpytools.data)
Utilities for data processing and manipulation:
- DataFrame Operations: Enhanced pandas functionality for transportation data
- Spatial Data Support: Optional GIS operations (requires geopandas)
- Data Validation: Tools for checking data integrity and consistency
from tlpytools.data import read_spatial_data, validate_dataframe
# Load spatial data (if geopandas available)
gdf = read_spatial_data("zones.shp")
# Validate data structure
is_valid = validate_dataframe(df, required_columns=['zone_id', 'households'])
Data Storage (tlpytools.data_store)
Comprehensive data storage and retrieval functionality:
- Multiple Backends: Support for various storage formats
- Metadata Management: Automatic tracking of data lineage
- Version Control: Built-in data versioning capabilities
from tlpytools.data_store import DataStore
store = DataStore("my_project")
store.save_data(df, "travel_times", metadata={"source": "model_run_1"})
retrieved_df = store.load_data("travel_times")
SQL Server Integration (tlpytools.sql_server)
Tools for working with SQL Server databases:
- Connection Management: Simplified database connections
- Query Utilities: Helper functions for common operations
- Bulk Operations: Efficient data loading and extraction
from tlpytools.sql_server import SQLServerConnection
with SQLServerConnection("server_name", "database_name") as conn:
    df = conn.query("SELECT * FROM travel_data WHERE year = 2023")
    conn.bulk_insert(new_data, "staging_table")
Cloud Storage (tlpytools.adls_server)
Azure Data Lake Storage integration:
- File Synchronization: Upload/download with conflict resolution
- Batch Operations: Efficient handling of large datasets
- Authentication: Secure connection management
from tlpytools.adls_server import adls_util
# Upload files to cloud storage
adls_util.upload_files(local_path="data/", remote_path="project/data/")
# Download with pattern matching
adls_util.download_files(remote_pattern="outputs/*.csv", local_path="results/")
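The conflict-resolution behaviour mentioned above can be illustrated with a small, library-agnostic sketch based on file timestamps (the `files_to_upload` helper and the shape of the remote-metadata dict are illustrative assumptions, not part of tlpytools):

```python
import os

def local_is_newer(local_path: str, remote_mtime: float) -> bool:
    """Timestamp-based conflict resolution: only overwrite the remote
    copy when the local copy was modified more recently."""
    return os.path.getmtime(local_path) > remote_mtime

def files_to_upload(local_dir: str, remote_mtimes: dict) -> list:
    """Return the local files that should be uploaded, given a mapping
    of remote file name -> remote modification time."""
    uploads = []
    for name in os.listdir(local_dir):
        path = os.path.join(local_dir, name)
        # Files unknown to the remote side are always uploaded.
        if name not in remote_mtimes or local_is_newer(path, remote_mtimes[name]):
            uploads.append(name)
    return uploads
```

The same comparison generalizes to any metadata the storage backend exposes (ETags, content hashes), but timestamps are the simplest policy to reason about.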
Configuration Management (tlpytools.config)
Centralized configuration handling:
- YAML Support: Human-readable configuration files
- Environment Variables: Runtime configuration overrides
- Validation: Schema validation for configuration files
from tlpytools.config import load_config, validate_config
config = load_config("model_config.yaml")
if validate_config(config, schema="model_schema.json"):
    print("Configuration is valid")
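The environment-variable override described above follows a common pattern, sketched here generically (the `apply_env_overrides` helper and the `TLPY_` prefix are illustrative assumptions, not the tlpytools API):

```python
import os

def apply_env_overrides(config: dict, prefix: str = "TLPY_") -> dict:
    """Overlay environment variables onto a loaded configuration.

    An environment variable like TLPY_LOG_LEVEL=DEBUG overrides the
    config key "log_level". Keys absent from the environment keep
    their file-based values.
    """
    out = dict(config)
    for key, value in os.environ.items():
        if key.startswith(prefix):
            # Strip the prefix and lower-case to match YAML keys.
            out[key[len(prefix):].lower()] = value
    return out
```

This keeps the YAML file as the single source of defaults while letting CI pipelines or cloud batch jobs adjust behaviour without editing files.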
Logging (tlpytools.log)
Enhanced logging capabilities:
- Structured Logging: Consistent log formatting across projects
- Multiple Outputs: Console and file logging with different levels
- Performance Tracking: Built-in timing and profiling support
from tlpytools.log import setup_logger
logger = setup_logger("my_model", log_file="model.log")
logger.info("Starting model run")
logger.performance("Model completed", execution_time=120.5)
ORCA Model Orchestration
TLPyTools includes the ORCA (Orchestrated Regional Comprehensive Analysis) transportation model orchestrator as an optional component.
Quick Start with ORCA
# Install with ORCA support
pip install tlpytools[orca]
# Initialize a new model databank
python -m tlpytools.orca --action initialize_databank --databank db_example
# Run the complete model workflow
python -m tlpytools.orca --action run_models --databank db_example
ORCA Features
- Multi-Model Coordination: Orchestrates ActivitySim, commercial vehicle models, and traffic assignment
- Cloud Integration: Automatic synchronization with Azure Data Lake Storage
- State Management: Resume interrupted model runs from any point
- Configurable Workflows: YAML-based configuration for complex modeling pipelines
For detailed ORCA documentation, see README_ORCA.md.
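The resume-from-any-point behaviour described under state management can be illustrated with a minimal checkpointing sketch (the `run_workflow` function and JSON state file below are a generic illustration, not ORCA's actual implementation):

```python
import json
from pathlib import Path

def run_workflow(steps, checkpoint_file="databank/.state.json"):
    """Run named steps in order, recording each completed step so an
    interrupted run can resume where it left off.

    `steps` is a list of (name, callable) pairs.
    """
    path = Path(checkpoint_file)
    done = set(json.loads(path.read_text())) if path.exists() else set()
    for name, func in steps:
        if name in done:
            continue  # completed in a previous run; skip on resume
        func()
        done.add(name)
        # Persist state after every step so a crash loses at most one step.
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(json.dumps(sorted(done)))
```

Writing the checkpoint after each step, rather than at the end, is what makes resumption from an arbitrary interruption point possible.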
Key Features
🔧 Modular Design
- Independent modules with optional dependencies
- Use only what you need without heavy dependency chains
- Clear separation of concerns for easier maintenance
🚀 Performance Optimized
- Efficient data processing with pandas and NumPy
- Chunked operations for large datasets
- Optional performance monitoring and profiling
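The chunked-operations pattern referenced above looks like this in plain pandas (the `chunked_total` helper is an illustrative example, not a tlpytools function):

```python
import pandas as pd

def chunked_total(csv_path: str, column: str, chunksize: int = 100_000) -> float:
    """Aggregate a column from a large CSV without loading the whole
    file into memory: pd.read_csv with chunksize yields DataFrames of
    at most `chunksize` rows at a time."""
    total = 0.0
    for chunk in pd.read_csv(csv_path, chunksize=chunksize):
        total += chunk[column].sum()
    return total
```

The same iterate-and-accumulate shape applies to groupby-style aggregations, as long as the reduction can be merged across chunks.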
☁️ Cloud Ready
- Native Azure integration for data storage and processing
- Secure authentication and connection management
- Efficient file synchronization with conflict resolution
🔄 Production Ready
- Comprehensive error handling and logging
- State management for long-running processes
- Configurable retry logic and timeout handling
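Configurable retry logic with exponential backoff can be sketched as follows (illustrative only; tlpytools' actual retry and timeout handling may differ):

```python
import time

def with_retries(func, attempts=3, base_delay=1.0):
    """Call func, retrying on any exception with exponential backoff.

    Raises the last exception if all attempts fail; otherwise returns
    func's result.
    """
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts; surface the failure
            time.sleep(base_delay * (2 ** attempt))
```

Making `attempts` and `base_delay` parameters (or config keys) is what the "configurable" in the bullet above refers to: transient cloud or database errors are absorbed, while persistent failures still surface.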
🧪 Testing Support
- Built-in validation utilities
- Mock objects for testing cloud operations
- Comprehensive test suite included
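Mocking a cloud operation in a unit test can look like this (the `sync_results` function is a hypothetical example; the mock stands in for a client such as tlpytools.adls_server.adls_util, using only the standard library's unittest.mock):

```python
from unittest import mock

def sync_results(uploader, local_path: str, remote_path: str) -> bool:
    """Upload model outputs; returns True on success. The uploader is
    injected so tests can substitute a mock for the real client."""
    uploader.upload_files(local_path=local_path, remote_path=remote_path)
    return True

def test_sync_results_uses_uploader():
    # A Mock records calls, letting the test verify behaviour without
    # touching any real cloud storage.
    fake = mock.Mock()
    assert sync_results(fake, "out/", "project/out/")
    fake.upload_files.assert_called_once_with(
        local_path="out/", remote_path="project/out/"
    )
```

Dependency injection (passing the client in, rather than importing it inside the function) is what makes this kind of test possible without network access.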
Usage Examples
Basic Data Pipeline
from tlpytools.data import process_survey_data
from tlpytools.data_store import DataStore
from tlpytools.log import setup_logger
# Setup logging
logger = setup_logger("data_pipeline")
# Process survey data
processed_data = process_survey_data("survey_2023.csv")
logger.info(f"Processed {len(processed_data)} survey records")
# Store results
store = DataStore("survey_analysis")
store.save_data(processed_data, "processed_survey_2023")
Cloud Synchronization Workflow
from tlpytools.adls_server import adls_util
from tlpytools.config import load_config
# Load cloud configuration
config = load_config("cloud_config.yaml")
# Sync local results to cloud
adls_util.upload_directory(
    local_path="model_outputs/",
    remote_path=f"projects/{config['project_name']}/outputs/",
    conflict_resolution="timestamp"
)
SQL Server Integration
from tlpytools.sql_server import SQLServerConnection
from tlpytools.data import validate_dataframe
# Connect and validate data
with SQLServerConnection("prod_server", "transport_db") as conn:
    # Load reference data
    zones = conn.query("SELECT * FROM zones WHERE active = 1")
    # Validate structure
    if validate_dataframe(zones, required_columns=['zone_id', 'area_type']):
        print(f"Loaded {len(zones)} valid zones")
Dependencies
Core Dependencies
- pandas>=1.1 - Data manipulation and analysis
- numpy>=1.18 - Numerical computing
- sqlalchemy>=1.4 - SQL toolkit and ORM
- pyodbc>=4.0 - SQL Server connectivity
- pyyaml>=5.4 - YAML configuration files
- azure-core>=1.34 - Azure SDK core functionality
- azure-identity>=1.23 - Azure authentication
- azure-storage-blob>=12.24 - Azure Blob Storage
- azure-storage-file-datalake>=12.18 - Azure Data Lake Storage
Optional Dependencies
ORCA Module (uv sync --extra orca or pip install tlpytools[orca]):
- psutil>=5.8.0 - System monitoring and performance tracking
- unittest-xml-reporting>=3.2.0 - Enhanced test reporting
ActivitySim Module (Development only - via uv sync --group activitysim):
- activitysim - TransLink's customized ActivitySim (from GitHub, not available on PyPI)
- populationsim - Synthetic population generation tool (from GitHub, not available on PyPI)
Note: ActivitySim and PopulationSim are only available through uv sync --group activitysim for development installations, due to PyPI restrictions on git dependencies. They are not available as pip extras.
Development Environment (uv sync --extra dev or pip install tlpytools[dev]):
Geospatial Analysis Tools:
- geopandas>=0.13.0 - Geospatial data manipulation
- GDAL>=3.6.0 - Geospatial data abstraction library
- Shapely>=2.0.0 - Geometric operations
- Fiona>=1.9.0 - Vector data I/O
- pyproj>=3.4.0 - Cartographic projections
- Rtree>=1.0.0 - Spatial indexing
- Cartopy>=0.21.0 - Cartographic projections for matplotlib
- contextily>=1.5.0 - Web map tiles for matplotlib
- folium>=0.14.0 - Interactive maps
Visualization and Dashboard Tools:
- plotly>=5.17.0 - Interactive plotting
- dash>=2.14.0 - Web application framework
- dash-extensions>=1.0.0 - Additional Dash components
- dash-leaflet>=0.1.0 - Leaflet maps for Dash
- panel>=1.3.0 - High-level dashboard framework
Development and Code Quality:
- black>=23.0.0 - Code formatting
- ruff>=0.1.0 - Fast Python linter
- pytest>=7.4.0 - Testing framework
- pytest-cov>=4.1.0 - Coverage reporting
- mypy>=1.5.0 - Static type checking
- pre-commit>=3.4.0 - Git pre-commit hooks
Other Utilities:
- polyline>=2.0.0 - Polyline encoding/decoding
- jupyter>=1.0.0 - Jupyter ecosystem
- ipykernel>=6.25.0 - IPython kernel for Jupyter
Manual GDAL Installation
GDAL is not included in the default dependencies due to compilation complexity, but can be added manually for advanced geospatial operations:
# Add GDAL to your project
uv add "GDAL>=3.6.0"
Prerequisites for successful GDAL compilation:
Windows:
- Install Visual Studio Build Tools or Visual Studio Community
- Ensure C++ build tools are included
- Add Visual Studio tools to your PATH environment variable
Linux (Ubuntu/Debian):
# Install build dependencies
sudo apt-get update
sudo apt-get install build-essential libgdal-dev gdal-bin
Linux (RHEL/CentOS/Fedora):
# Install build dependencies
sudo dnf install gcc-c++ gdal-devel gdal
# or for older systems: sudo yum install gcc-c++ gdal-devel gdal
Note: GDAL compilation can be time-consuming and may fail on some systems due to missing system libraries. If you encounter issues, consider using system package managers or Docker environments for more reliable installations.
Configuration
TLPyTools uses YAML configuration files for most components. Example configuration:
# tlpytools_config.yaml
data_store:
backend: "local"
base_path: "data/"
versioning: true
cloud:
provider: "azure"
storage_account: "your_account"
container: "your_container"
logging:
level: "INFO"
file_output: true
console_output: true
Load configuration in your code:
from tlpytools.config import load_config
config = load_config("tlpytools_config.yaml")
Error Handling
TLPyTools provides graceful error handling for optional dependencies:
# This works even if geopandas is not installed
import pandas as pd
from tlpytools.data import read_spatial_data
try:
    gdf = read_spatial_data("zones.shp")
except ImportError as e:
    print(f"Spatial operations not available: {e}")
    # Fall back to regular CSV reading
    df = pd.read_csv("zones.csv")
Testing
Run the test suite:
# Using uv (recommended)
uv run pytest
# Run all tests with coverage
uv run pytest --cov=tlpytools
# Run specific module tests
uv run pytest tests/test_data.py
# Run ORCA tests specifically
uv run pytest src/tlpytools/orca/tests/
# Using pip (alternative)
python -m pytest
# Run with coverage
python -m pytest --cov=tlpytools
Contributing
We welcome contributions to TLPyTools! Please follow these guidelines:
- Fork the repository and create a feature branch
- Install development dependencies: pip install -e .[dev]
- Write tests for new functionality
- Follow code style guidelines (run black and ruff)
- Update documentation as needed
- Submit a pull request with a clear description
Development Setup
Using uv (Recommended)
# Install uv if not already installed
pip install uv
# Clone the repository
git clone https://github.com/TransLinkForecasting/tlpytools.git
cd tlpytools
# Quick setup using Makefile
make dev-setup
# Or manual setup:
# Install full development environment (includes GIS tools, visualization, etc.)
uv sync --extra dev --extra orca
# Note: ActivitySim and PopulationSim can be added with --group activitysim
# when working in a development setup (they're not PyPI extras but are uv dependency groups)
# Activate the virtual environment
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Run tests
uv run pytest
# Run code formatting
uv run black src/
uv run ruff check src/
# Run type checking
uv run mypy src/
Common Development Tasks
# Using the provided Makefile (recommended)
make help # Show all available commands
make install-all # Install all dependencies
make test # Run tests
make test-cov # Run tests with coverage
make lint # Check code style
make format # Format code
make type-check # Run type checking
make check-all # Run all quality checks
Using pip (Alternative)
# Clone the repository
git clone https://github.com/TransLinkForecasting/tlpytools.git
cd tlpytools
# Create development environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install in development mode (without ActivitySim - use uv for ActivitySim)
pip install -e .[orca,dev]
# Run tests
python -m pytest
Support
- Documentation: Comprehensive module documentation available
- Examples: See the examples/ directory for usage examples
- Issues: Report bugs and feature requests on GitHub Issues
- Email: Contact the TransLink Forecasting Team at forecasting@translink.ca
License
This project is proprietary software developed by TransLink. All rights reserved.
Version History
- 0.1.9:
- Add package-level .env file support to simplify setup and provide more flexibility
- 0.1.8:
- Set Python version to 3.10 to align with activitysim
- Add Azure Batch API caller part of orca
- Add unified logger for better maintainability
- Improve Azure credential handling to allow for differences between local testing and cloud production
- Fix minor bugs with workflows and release pipelines
- 0.1.7:
- Migration to uv for dependency management
- Fixed PyPI deployment: Moved ActivitySim and PopulationSim to uv dependency groups (no longer PyPI extras)
- ActivitySim now available via uv sync --group activitysim for development only
- Comprehensive dev dependencies for geospatial analysis tools (GDAL, Shapely, GeoPandas, etc.)
- Added visualization tools (Plotly, Dash, Panel, Folium)
- Enhanced development tooling (Black, Ruff, MyPy, Pre-commit)
- ORCA namespace reorganization, improved modularity
- 0.1.6.1: Add data store support for RESUME_AFTER functionality, add ADLS and Azure SQL Server support
- 0.1.6.0: Enhanced cloud synchronization, performance monitoring
- 0.1.5.x: Core module stabilization, testing improvements
- 0.1.4.x: Initial SQL Server integration, configuration management
- 0.1.3.x: Data storage utilities, logging enhancements
- 0.1.2.x: Cloud storage integration, ADLS support
- 0.1.1.x: Core data processing utilities
- 0.1.0.x: Initial release with basic functionality
File details
Details for the file tlpytools-0.1.9.tar.gz.
File metadata
- Download URL: tlpytools-0.1.9.tar.gz
- Upload date:
- Size: 405.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4b492797a529e5290644a1b035245bf066de7bc31f2d5bae410abfcf7c10384f |
| MD5 | 5b91e6663c608b113c330a59f87dc8c7 |
| BLAKE2b-256 | 6c26e85c3a1a5d4c68dd940fb8cee9bd6e9bde7d70dc1b0390919acc212e519a |
File details
Details for the file tlpytools-0.1.9-py3-none-any.whl.
File metadata
- Download URL: tlpytools-0.1.9-py3-none-any.whl
- Upload date:
- Size: 87.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 606ca940e16290793141ea734666cfccdd61b161ab931fa5863ee5213042a439 |
| MD5 | 0ffb588bb2d8e1b6f44258c0014ea118 |
| BLAKE2b-256 | 3dbaad8dc8da7565f9b5ab5d597900237aa9b3420b05daf2f3a59672ed2c57b5 |