Juniper Data
Dataset generation and management service for the Juniper ecosystem.
Overview
Juniper Data provides a centralized service for generating, storing, and serving datasets used by the Juniper neural network projects. It supports several dataset types, including the classic two-spiral classification problem.
Ecosystem Compatibility
This service is part of the Juniper ecosystem. Verified compatible versions:
| juniper-data | juniper-cascor | juniper-canopy | data-client | cascor-client | cascor-worker |
|---|---|---|---|---|---|
| 0.4.x | 0.3.x | 0.2.x | >=0.3.1 | >=0.1.0 | >=0.1.0 |
For full-stack Docker deployment and integration tests, see juniper-deploy.
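The compatibility table above can be checked programmatically. The following is a minimal sketch, not part of the package: the `satisfies` helper is hypothetical, it assumes plain numeric versions such as `0.4.2` (no pre-release suffixes), and it reads `0.4.x` as "any patch release of 0.4".

```python
def parse(version: str) -> tuple[int, ...]:
    """Turn '0.4.2' into (0, 4, 2). Assumes purely numeric components."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version: str, spec: str) -> bool:
    """Check a version against a table entry such as '0.4.x' or '>=0.3.1'."""
    if spec.startswith(">="):
        return parse(version) >= parse(spec[2:])
    if spec.endswith(".x"):
        prefix = parse(spec[:-2])
        return parse(version)[: len(prefix)] == prefix
    return parse(version) == parse(spec)


# Entries copied from the table above.
print(satisfies("0.4.2", "0.4.x"))
print(satisfies("0.3.0", ">=0.3.1"))
```

For anything beyond this simple table, a dedicated version-parsing library is a safer choice than hand-rolled tuple comparison.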
Architecture
JuniperData is the foundational data layer of the Juniper ecosystem. JuniperCascor and juniper-canopy both call JuniperData to generate and retrieve datasets.
┌─────────────────────┐     REST+WS      ┌──────────────────────┐
│   juniper-canopy    │ ◄──────────────► │    JuniperCascor     │
│      Dashboard      │                  │     Training Svc     │
│      Port 8050      │                  │      Port 8200       │
└──────────┬──────────┘                  └──────────┬───────────┘
           │ REST                                   │ REST
           ▼                                        ▼
┌───────────────────────────────────────────────────────────────┐
│                JuniperData ◄── (this service)                 │
│                  Dataset Service · Port 8100                  │
└───────────────────────────────────────────────────────────────┘
Data contract: datasets are served as NPZ archives with keys X_train, y_train, X_test, y_test, X_full, y_full (all float32).
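A consumer may want to verify a payload against this contract before using it. The sketch below is illustrative only: `load_dataset` is a hypothetical helper, and the stand-in archive is built locally with arbitrary shapes so the example runs without the service.

```python
import io

import numpy as np

# Keys named in the data contract above.
EXPECTED_KEYS = ("X_train", "y_train", "X_test", "y_test", "X_full", "y_full")


def load_dataset(npz_bytes: bytes) -> dict:
    """Load an NPZ payload and check its keys and dtypes against the contract."""
    with np.load(io.BytesIO(npz_bytes)) as archive:
        arrays = {key: archive[key] for key in archive.files}

    missing = [key for key in EXPECTED_KEYS if key not in arrays]
    if missing:
        raise ValueError(f"NPZ archive is missing keys: {missing}")
    for key in EXPECTED_KEYS:
        if arrays[key].dtype != np.float32:
            raise TypeError(f"{key} has dtype {arrays[key].dtype}, expected float32")
    return arrays


# Stand-in payload with arbitrary shapes (assumption: real shapes depend on the dataset).
rng = np.random.default_rng(0)
X = rng.random((10, 2), dtype=np.float32)
y = rng.random((10, 1), dtype=np.float32)
buffer = io.BytesIO()
np.savez(buffer, X_train=X[:8], y_train=y[:8], X_test=X[8:], y_test=y[8:], X_full=X, y_full=y)

dataset = load_dataset(buffer.getvalue())
print({key: value.shape for key, value in dataset.items()})
```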
Related Services
| Service | Relationship | Environment Variable |
|---|---|---|
| juniper-cascor | Consumes JuniperData for training datasets | JUNIPER_DATA_URL=http://localhost:8100 |
| juniper-canopy | Consumes JuniperData for visualization data | JUNIPER_DATA_URL=http://localhost:8100 |
| juniper-data-client | PyPI client library for this service | pip install juniper-data-client |
Service Configuration
| Variable | Default | Description |
|---|---|---|
| JUNIPER_DATA_HOST | 0.0.0.0 | Listen address |
| JUNIPER_DATA_PORT | 8100 | Service port |
| JUNIPER_DATA_LOG_LEVEL | INFO | Log verbosity |
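To illustrate how these variables and defaults fit together, here is a small sketch that reads them from the environment. `ServiceSettings` is a hypothetical class written for this example; the service's actual settings code may look different.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceSettings:
    # Defaults copied from the table above.
    host: str = "0.0.0.0"
    port: int = 8100
    log_level: str = "INFO"

    @classmethod
    def from_env(cls, env=None) -> "ServiceSettings":
        """Build settings from a mapping (os.environ by default), falling back to defaults."""
        env = os.environ if env is None else env
        return cls(
            host=env.get("JUNIPER_DATA_HOST", cls.host),
            port=int(env.get("JUNIPER_DATA_PORT", cls.port)),
            log_level=env.get("JUNIPER_DATA_LOG_LEVEL", cls.log_level).upper(),
        )


settings = ServiceSettings.from_env({"JUNIPER_DATA_PORT": "9000"})
print(settings)
```

Passing a plain dictionary instead of `os.environ` keeps the sketch easy to test without touching the real environment.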
Docker Deployment
# Full stack with all three services:
git clone https://github.com/pcalnon/juniper-deploy.git
cd juniper-deploy && docker compose up --build
Installation
Basic Installation
pip install -e .
With API Support
pip install -e ".[api]"
Development Installation
pip install -e ".[dev]"
Full Installation
pip install -e ".[all]"
Quick Start
Generate a Spiral Dataset
from juniper_data.generators.spiral import SpiralGenerator
generator = SpiralGenerator()
dataset = generator.generate(n_points=100, n_spirals=2, noise=0.1)
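For readers unfamiliar with the two-spiral problem, the sketch below shows one common way to build such a dataset with NumPy. It is not the `SpiralGenerator` implementation: the `two_spirals` function, its parameters, and the choice of `n_points` per spiral are assumptions made for illustration.

```python
import numpy as np


def two_spirals(n_points: int = 100, noise: float = 0.1, seed: int = 0):
    """Return (X, y) with n_points per spiral and labels 0.0 and 1.0."""
    rng = np.random.default_rng(seed)
    # Taking the square root of a uniform sample spreads points roughly evenly along the arm.
    theta = np.sqrt(rng.random(n_points)) * 3 * np.pi
    arm = np.column_stack((theta * np.cos(theta), theta * np.sin(theta)))
    # The second spiral is the first one reflected through the origin.
    X = np.vstack((arm, -arm)) + rng.normal(scale=noise, size=(2 * n_points, 2))
    y = np.concatenate((np.zeros(n_points), np.ones(n_points)))
    return X.astype(np.float32), y.astype(np.float32)


X, y = two_spirals(n_points=100, noise=0.1)
print(X.shape, y.shape)
```

The two interleaved arms are not linearly separable, which is why the problem is a popular benchmark for small neural networks.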
Start the API Server
uvicorn juniper_data.api.app:app --reload
API Endpoints
| Endpoint | Method | Description |
|---|---|---|
| /v1/health | GET | Health check endpoint |
| /v1/datasets | GET | List available datasets |
| /v1/datasets/{id} | GET | Get a specific dataset |
| /v1/generators/spiral | POST | Generate a new spiral dataset |
| /v1/generators/spiral/config | GET | Get spiral generator configuration |
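The endpoints above can be exercised with the standard library alone. The sketch below only builds the requests; `build_request` is a hypothetical helper, and the POST body is an assumption whose field names mirror the `generate()` call in Quick Start rather than a documented schema.

```python
import json
import os
import urllib.request
from typing import Optional

# JUNIPER_DATA_URL is the variable listed under Related Services.
BASE_URL = os.environ.get("JUNIPER_DATA_URL", "http://localhost:8100").rstrip("/")


def build_request(path: str, payload: Optional[dict] = None) -> urllib.request.Request:
    """Build a GET request, or a JSON POST request when a payload is given."""
    if payload is None:
        return urllib.request.Request(f"{BASE_URL}{path}", method="GET")
    return urllib.request.Request(
        f"{BASE_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


health = build_request("/v1/health")
# Hypothetical request body: the real schema may use different field names.
create = build_request("/v1/generators/spiral", {"n_points": 100, "n_spirals": 2, "noise": 0.1})

# With the server running, a request can be sent like this:
# with urllib.request.urlopen(health, timeout=5) as response:
#     print(response.status, response.read())
```

For regular use, the juniper-data-client package listed under Related Services is likely the more convenient route.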
Project Structure
JuniperData/
├── juniper_data/
│ ├── core/ # Core functionality and base classes
│ ├── generators/ # Dataset generators
│ │ └── spiral/ # Spiral dataset generator
│ ├── storage/ # Dataset persistence layer
│ └── api/ # FastAPI application
│ └── routes/ # API route handlers
├── tests/
│ ├── unit/ # Unit tests
│ └── integration/ # Integration tests
├── pyproject.toml # Project configuration
└── README.md # This file
Development
Running Tests
pytest
Running Tests with Coverage
pytest --cov=juniper_data --cov-report=html
Code Formatting
black juniper_data tests
isort juniper_data tests
Type Checking
mypy juniper_data
Juniper Ecosystem
| Repository | Description |
|---|---|
| juniper-data | Dataset generation service (this repo) |
| juniper-cascor | CasCor neural network training service |
| juniper-canopy | Real-time monitoring dashboard |
| juniper-data-client | PyPI: juniper-data-client |
| juniper-cascor-client | PyPI: juniper-cascor-client |
| juniper-cascor-worker | PyPI: juniper-cascor-worker |
License
MIT License - Copyright (c) 2024-2026 Paul Calnon