STAC Sample Maker
Generate synthetic STAC (SpatioTemporal Asset Catalog) items with realistic metadata for testing, development, and demonstration purposes.
Features
- 🌍 Complete STAC compliance: Generates valid STAC v1.1.0 items
- 🔧 All stable extensions: Supports EO, File, Projection, View, SAR, Scientific, and Landsat extensions
- 📝 Template-based generation: Create items matching existing STAC item structures
- ✅ Schema validation: Optional validation using stac-validator
- 🎯 Realistic data: Uses Faker to generate believable synthetic metadata
- 📊 Common metadata: Includes STAC common metadata fields (license, providers, etc.)
- 🔄 Flexible output: NDJSON format with stdout or file output
- 🎲 Reproducible: Seed support for consistent outputs
Installation
Basic Installation
pip install stac-sample-maker
With Validation Support
pip install stac-sample-maker[validation]
Development Installation
git clone https://github.com/username/stac-sample-maker.git
cd stac-sample-maker
pip install -e ".[dev,validation]"
Quick Start
Command Line Usage
Generate 5 synthetic STAC items:
stac-sample-maker --num-items 5
Generate items with validation:
stac-sample-maker --num-items 3 --validate
Generate items from a template:
stac-sample-maker --template my-stac-item.json --num-items 10
Save to a file:
stac-sample-maker --num-items 5 --output samples.ndjson
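Because the output is NDJSON (one JSON object per line), a saved file can be read back with the standard library alone. The following is a minimal sketch, assuming each line holds one complete item. The `read_ndjson` helper and the inline sample text are illustrative and not part of the package:

```python
import io
import json
from typing import Any, Dict, Iterator, TextIO


def read_ndjson(stream: TextIO) -> Iterator[Dict[str, Any]]:
    """Yield one parsed JSON object per non-empty line."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# Hand-written stand-in for the contents of samples.ndjson (not real output).
sample_text = (
    '{"type": "Feature", "id": "item-1", "properties": {}}\n'
    '{"type": "Feature", "id": "item-2", "properties": {}}\n'
)

items = list(read_ndjson(io.StringIO(sample_text)))
item_ids = [item["id"] for item in items]
print(item_ids)
```

For a real file, pass an open file handle instead of the `io.StringIO` object, for example `with open("samples.ndjson") as f: items = list(read_ndjson(f))`.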
Library Usage
from stac_sample_maker import generate_stac_items
# Generate basic items
items = generate_stac_items(num_items=5)
# Generate with specific parameters
items = generate_stac_items(
    num_items=10,
    start_date="2020-01-01T00:00:00Z",
    end_date="2023-12-31T23:59:59Z",
    bbox=[-122.5, 37.7, -122.3, 37.8],  # San Francisco area
    seed=42,  # For reproducible results
    validate=True,  # Requires stac-validator
)
# Generate from template
from stac_sample_maker import generate_stac_items_from_template
items = generate_stac_items_from_template(
    template_path="example-item.json",
    num_items=5,
)
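Since the library functions return a list of dictionaries, the result can be wrapped in a GeoJSON FeatureCollection for tools that expect a single document. This is a sketch under that assumption. `to_feature_collection` is a hypothetical helper, and the stand-in items below are hand-written rather than real generator output:

```python
import json
from typing import Any, Dict, List


def to_feature_collection(items: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Wrap item dictionaries in a GeoJSON FeatureCollection."""
    return {"type": "FeatureCollection", "features": list(items)}


# Stand-in items; in practice these would come from generate_stac_items().
stand_in_items = [
    {"type": "Feature", "id": "a", "geometry": None, "properties": {}},
    {"type": "Feature", "id": "b", "geometry": None, "properties": {}},
]

collection = to_feature_collection(stand_in_items)
collection_json = json.dumps(collection, indent=2)
print(len(collection["features"]))
```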
Command Line Interface
stac-sample-maker [OPTIONS]
Options
| Option | Description | Default |
|---|---|---|
| --num-items N | Number of STAC items to generate | 1 |
| --template PATH | Path to template STAC item JSON file | None |
| --output PATH | Output file path (NDJSON format) | stdout |
| --start DATE | Start of datetime range (ISO 8601) | None |
| --end DATE | End of datetime range (ISO 8601) | None |
| --interval-percent FLOAT | Fraction of items using intervals (0-1) | 0.2 |
| --bbox MINX MINY MAXX MAXY | Bounding box for geometry | None |
| --seed INT | Random seed for reproducibility | None |
| --validate | Validate items against STAC schema | False |
Examples
# Generate 100 items for 2023 with 50% using time intervals
stac-sample-maker --num-items 100 \
--start "2023-01-01T00:00:00Z" \
--end "2023-12-31T23:59:59Z" \
--interval-percent 0.5
# Generate items within a bounding box (San Francisco)
stac-sample-maker --num-items 20 \
--bbox -122.5 37.7 -122.3 37.8
# Generate reproducible items with validation
stac-sample-maker --num-items 10 \
--seed 42 \
--validate \
--output validated-items.ndjson
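To check what share of a generated batch uses time intervals, the items can be inspected after the fact. The sketch below assumes interval items carry `start_datetime` and `end_datetime` in their properties, as STAC common metadata prescribes. The helper names and the stand-in items are hypothetical, and the observed share in a real batch may differ from the configured value:

```python
from typing import Any, Dict, List


def uses_interval(item: Dict[str, Any]) -> bool:
    """True when the item carries a start/end range rather than one instant."""
    props = item.get("properties", {})
    return "start_datetime" in props and "end_datetime" in props


def interval_fraction(items: List[Dict[str, Any]]) -> float:
    """Fraction of items that use a datetime interval (0.0 for an empty list)."""
    if not items:
        return 0.0
    return sum(uses_interval(item) for item in items) / len(items)


# Hand-written stand-ins, not real generator output.
stand_in_items = [
    {"properties": {"datetime": "2023-03-01T10:00:00Z"}},
    {
        "properties": {
            "datetime": None,
            "start_datetime": "2023-04-01T00:00:00Z",
            "end_datetime": "2023-04-02T00:00:00Z",
        }
    },
    {"properties": {"datetime": "2023-05-01T10:00:00Z"}},
    {"properties": {"datetime": "2023-06-01T10:00:00Z"}},
]

fraction = interval_fraction(stand_in_items)
print(fraction)
```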
Library API
Core Functions
generate_stac_items()
Generate synthetic STAC items with all extensions.
def generate_stac_items(
    num_items: int,
    start_date: Optional[str] = None,
    end_date: Optional[str] = None,
    interval_percent: float = 0.2,
    bbox: Optional[Tuple[float, float, float, float]] = None,
    seed: Optional[int] = None,
    extensions: Optional[List[str]] = None,
    validate: bool = False,
) -> List[dict]:
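The `seed` and `bbox` parameters follow a familiar pattern: a seeded random generator produces the same values on every run, and coordinates are drawn from inside the box. The following sketch illustrates that idea with the standard library only. It is not the package's implementation (which uses Faker), and `sample_points` is a hypothetical function:

```python
import random
from typing import List, Optional, Tuple


def sample_points(
    count: int,
    bbox: Tuple[float, float, float, float],
    seed: Optional[int] = None,
) -> List[Tuple[float, float]]:
    """Draw (lon, lat) points uniformly inside bbox = (minx, miny, maxx, maxy)."""
    rng = random.Random(seed)  # private generator; global random state is untouched
    minx, miny, maxx, maxy = bbox
    return [(rng.uniform(minx, maxx), rng.uniform(miny, maxy)) for _ in range(count)]


sf_bbox = (-122.5, 37.7, -122.3, 37.8)
first = sample_points(3, sf_bbox, seed=42)
second = sample_points(3, sf_bbox, seed=42)
print(first == second)
```

Using a private `random.Random` instance rather than the module-level functions keeps the result independent of any other code that draws random numbers.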
generate_stac_items_from_template()
Generate STAC items matching a template structure.
def generate_stac_items_from_template(
    template_path: str,
    num_items: int,
    start_date: Optional[str] = None,
    end_date: Optional[str] = None,
    bbox: Optional[Tuple[float, float, float, float]] = None,
    seed: Optional[int] = None,
    validate: bool = False,
) -> List[Dict[str, Any]]:
validate_stac_item()
Validate a STAC item against JSON schema.
def validate_stac_item(
    item: Dict[str, Any],
    strict: bool = True,
) -> bool:
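When stac-validator is not installed, a lightweight structural check can still catch obvious mistakes. The sketch below tests only the required top-level fields of a STAC item and the datetime rules. It is not a substitute for schema validation, and `find_structural_problems` is a hypothetical helper, not part of this package:

```python
from typing import Any, Dict, List

REQUIRED_TOP_LEVEL = (
    "type", "stac_version", "id", "geometry", "properties", "links", "assets",
)


def find_structural_problems(item: Dict[str, Any]) -> List[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = [f"missing field: {name}" for name in REQUIRED_TOP_LEVEL if name not in item]
    if item.get("type") != "Feature":
        problems.append("type must be 'Feature'")
    if item.get("geometry") is not None and "bbox" not in item:
        problems.append("bbox is required when geometry is not null")
    props = item.get("properties") or {}
    if "datetime" not in props:
        problems.append("properties.datetime is required (it may be null)")
    elif props["datetime"] is None and not (
        "start_datetime" in props and "end_datetime" in props
    ):
        problems.append("null datetime requires start_datetime and end_datetime")
    return problems


good_item = {
    "type": "Feature",
    "stac_version": "1.1.0",
    "id": "x",
    "geometry": None,
    "properties": {"datetime": "2023-01-01T00:00:00Z"},
    "links": [],
    "assets": {},
}
bad_item = {
    "type": "Feature",
    "id": "y",
    "geometry": {"type": "Point", "coordinates": [0, 0]},
    "properties": {},
}

good_problems = find_structural_problems(good_item)
bad_problems = find_structural_problems(bad_item)
print(good_problems)
print(bad_problems)
```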
Supported Extensions
- EO (Electro-Optical): Cloud cover, snow cover, spectral bands
- File: File size, checksums, header information
- Projection: EPSG codes, transforms, shapes, bounding boxes
- View: Viewing angles, sun position, off-nadir angles
- SAR: Radar-specific metadata (frequency, polarization, etc.)
- Scientific: DOI citations, publications
- Landsat: Landsat-specific metadata (WRS path/row, etc.)
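Each of these extensions namespaces its fields with a prefix (for example `eo:cloud_cover` or `view:off_nadir`), so the extensions an item uses can be inferred from its field names. The following is a rough sketch of that idea. The `extensions_in_use` helper and the sample item are illustrative assumptions, and the authoritative list remains the item's `stac_extensions` array:

```python
from typing import Any, Dict, Set

# Field-name prefixes for the extensions listed above.
EXTENSION_PREFIXES = {
    "eo": "EO",
    "file": "File",
    "proj": "Projection",
    "view": "View",
    "sar": "SAR",
    "sci": "Scientific",
    "landsat": "Landsat",
}


def extensions_in_use(item: Dict[str, Any]) -> Set[str]:
    """Collect extension names from prefixed property and asset fields."""
    field_names = set(item.get("properties", {}))
    for asset in item.get("assets", {}).values():
        field_names.update(asset)
    found = set()
    for name in field_names:
        prefix, sep, _ = name.partition(":")
        if sep and prefix in EXTENSION_PREFIXES:
            found.add(EXTENSION_PREFIXES[prefix])
    return found


# Hand-written sample, not real generator output.
sample_item = {
    "properties": {
        "datetime": "2023-01-01T00:00:00Z",
        "eo:cloud_cover": 12.5,
        "view:off_nadir": 3.0,
    },
    "assets": {"data": {"href": "https://example.com/data.tif", "file:size": 1024}},
}

used = extensions_in_use(sample_item)
print(sorted(used))
```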
Template Mode
Template mode analyzes an existing STAC item and generates new items with the same structure:
- Extensions: Matches the exact extensions used
- Properties: Generates new values for the same property fields
- Assets: Creates assets with the same keys and structure
- Temporal: Preserves instant vs. interval temporal patterns
Example template workflow:
import json
from stac_sample_maker import generate_stac_items, generate_stac_items_from_template
# Save one generated item as a template
items = generate_stac_items(num_items=1)
with open("template.json", "w") as f:
    json.dump(items[0], f)
# Generate more items matching the template
similar_items = generate_stac_items_from_template(
    template_path="template.json",
    num_items=100,
)
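The structural traits that template mode preserves (extensions, property fields, asset keys, temporal pattern) can be read directly from a template item. The sketch below shows one way to summarize them. `summarize_template` is a hypothetical helper that does not reflect the package's internal analysis, and the template shown is hand-written:

```python
from typing import Any, Dict


def summarize_template(template: Dict[str, Any]) -> Dict[str, Any]:
    """Describe the structure a generated item would need to match."""
    props = template.get("properties", {})
    is_interval = "start_datetime" in props and "end_datetime" in props
    return {
        "extensions": sorted(template.get("stac_extensions", [])),
        "property_keys": sorted(props),
        "asset_keys": sorted(template.get("assets", {})),
        "temporal_mode": "interval" if is_interval else "instant",
    }


# Hand-written template for illustration only.
example_template = {
    "stac_extensions": ["https://stac-extensions.github.io/eo/v1.1.0/schema.json"],
    "properties": {"datetime": "2023-01-01T00:00:00Z", "eo:cloud_cover": 10},
    "assets": {"thumbnail": {"href": "thumb.png"}, "data": {"href": "data.tif"}},
}

summary = summarize_template(example_template)
print(summary["temporal_mode"], summary["asset_keys"])
```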
STAC Compliance
Generated items are fully compliant with:
- STAC Specification v1.1.0
- All stable STAC extensions
- STAC Common Metadata (license, providers, platform, etc.)
- GeoJSON standards for geometry
- ISO 8601 for datetime formatting
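Two of these conventions are easy to show concretely: a bounding box expressed as a closed GeoJSON polygon ring, and a UTC timestamp in the `YYYY-MM-DDTHH:MM:SSZ` form used throughout this README. This is a sketch only. `bbox_to_polygon` and `format_utc` are hypothetical helpers, not functions exported by the package:

```python
from datetime import datetime, timezone
from typing import Any, Dict, Sequence


def bbox_to_polygon(bbox: Sequence[float]) -> Dict[str, Any]:
    """Build a GeoJSON Polygon from (minx, miny, maxx, maxy).

    The ring runs counter-clockwise and repeats its first position at the end.
    """
    minx, miny, maxx, maxy = bbox
    ring = [[minx, miny], [maxx, miny], [maxx, maxy], [minx, maxy], [minx, miny]]
    return {"type": "Polygon", "coordinates": [ring]}


def format_utc(moment: datetime) -> str:
    """Format a timezone-aware datetime as a UTC timestamp ending in 'Z'."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


polygon = bbox_to_polygon([-122.5, 37.7, -122.3, 37.8])
timestamp = format_utc(datetime(2023, 1, 1, tzinfo=timezone.utc))
print(polygon["coordinates"][0][0], timestamp)
```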
Development
Setup
git clone https://github.com/username/stac-sample-maker.git
cd stac-sample-maker
pip install -e ".[dev,validation]"
pre-commit install
Running Tests
# Run all tests
pytest
# Run with coverage
pytest --cov=stac_sample_maker --cov-report=html
# Run specific tests
pytest tests/test_generator.py -v
Code Quality
# Format code
ruff format
# Lint code
ruff check
# Type checking
mypy stac_sample_maker
Pre-commit Hooks
The project uses pre-commit hooks for code quality:
pre-commit run --all-files
Contributing
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Make your changes
- Run tests and linting (pytest && ruff check && mypy stac_sample_maker)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- STAC Specification for the standard
- Faker for synthetic data generation
- stac-validator for validation support
Related Projects
- PySTAC - Python library for working with STAC
- STAC Validator - Validation tools
- STAC Browser - Browse STAC catalogs
- STAC Index - Discover STAC resources