OpenPeople
Dataset of synthetic people for testing generative image/video pipelines at scale.
What is OpenPeople?
OpenPeople provides a curated set of synthetic "people" designed for testing generative image and video pipelines. All people in this library are entirely synthetic. They do not represent real individuals (any similarity is purely coincidental).
Why use synthetic people?
- Safe testing: Test your pipelines without using real people's images
- Consistent subjects: Stable, reproducible characters across runs
- Prompt provenance: Every image includes the prompts used to generate it
- Diversity by design: Curated dataset covers various demographics
How was this dataset created?
Each synthetic person is generated through a multi-stage pipeline that ensures visual consistency across all assets.
1. Character Definition
Each person starts with a structured metadata.json file defining their characteristics, which are randomly generated to ensure diversity.
Demographics are randomized across:
- Age ranges: 18-25, 25-35, 35-45, 45-55, 55-65, 65+
- Sex: male, female
- Ethnicity: Mixed heritage with weighted percentages (e.g., "14% Western Asian, 50% European, 36% African")
- Skin tone: Fitzpatrick scale I–VI
Visible traits include randomized:
- Hair: texture, length, color, style (e.g., "Very long, thick, naturally wavy curls")
- Body type: slim, athletic, average, curvy, large
- Height: short, average, tall
Additional traits add unique details like jawline, eyebrows, hairline, and other distinguishing features.
{
  "person_id": "P050",
  "character_details": {
    "demographics": {
      "age_range": "18-25",
      "sex": "male",
      "ethnicity": "14% Western Asian, 50% European, 36% African",
      "skin_tone": "Fitzpatrick III"
    },
    "visible_traits": {
      "hair": { "description": "Very long, thick, naturally wavy curls cascading over the shoulders." },
      "body_type": "large",
      "height": "average"
    },
    "additional_traits": {
      "hairline": "Center-parted hair with a neat hairline",
      "hands": "Long, slender fingers resting gently on the chin."
    }
  }
}
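The randomization described in this step can be sketched roughly as follows. This is an illustration only, not the project's actual generator: the function names (`random_ethnicity_mix`, `random_character`), the pool of heritage labels, and the fixed three-part mix are assumptions. Only the field names and the allowed values are taken from the lists and example in this section, and the free-text traits (hair, additional traits) are left out.

```python
import json
import random

AGE_RANGES = ["18-25", "25-35", "35-45", "45-55", "55-65", "65+"]
SEXES = ["male", "female"]
SKIN_TONES = ["I", "II", "III", "IV", "V", "VI"]  # Fitzpatrick scale
BODY_TYPES = ["slim", "athletic", "average", "curvy", "large"]
HEIGHTS = ["short", "average", "tall"]
# Assumption: the README shows only these three labels, in a single example.
HERITAGES = ["Western Asian", "European", "African"]


def random_ethnicity_mix(rng, parts=3):
    """Return a string such as '14% Western Asian, 50% European, 36% African'."""
    labels = rng.sample(HERITAGES, k=parts)
    # parts-1 distinct cut points in 1..99 give shares that are each >= 1
    # and that sum to exactly 100.
    cuts = sorted(rng.sample(range(1, 100), k=parts - 1))
    bounds = [0, *cuts, 100]
    shares = [hi - lo for lo, hi in zip(bounds, bounds[1:])]
    return ", ".join(f"{share}% {label}" for share, label in zip(shares, labels))


def random_character(person_id, seed=None):
    """Build a metadata dict shaped like the example metadata.json."""
    rng = random.Random(seed)
    return {
        "person_id": person_id,
        "character_details": {
            "demographics": {
                "age_range": rng.choice(AGE_RANGES),
                "sex": rng.choice(SEXES),
                "ethnicity": random_ethnicity_mix(rng),
                "skin_tone": f"Fitzpatrick {rng.choice(SKIN_TONES)}",
            },
            "visible_traits": {
                "body_type": rng.choice(BODY_TYPES),
                "height": rng.choice(HEIGHTS),
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(random_character("P050", seed=7), indent=2))
```

Passing a seed makes the sketch reproducible, which is convenient when a test needs the same character definition on every run.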
2. Multi-Stage Image Generation
Images are generated in a specific order, where each stage uses previous outputs as reference to maintain identity consistency:
┌─────────────────┐
│ Character JSON  │
└────────┬────────┘
         ▼
┌─────────────────┐
│     Preview     │  Temp full-body preview
└────────┬────────┘
         ▼
┌─────────────────┐
│ Studio Portrait │  Professional headshot
└────────┬────────┘
         ▼
┌─────────────────┐
│ Studio Posture  │──────────────┐  Full-body studio shot
└────────┬────────┘              │
         ▼                       ▼
┌─────────────────┐     ┌─────────────────┐
│ Character Sheet │     │ Emotions Sheet  │  Reference sheets
└────────┬────────┘     └────────┬────────┘
         │                       │
         └───────────┬───────────┘
                     ▼
          ┌─────────────────────┐
          │   Amateur Photos    │  Casual photos
          └─────────────────────┘
3. Reference Chaining
The key to consistency is reference chaining: each generation stage receives images from previous stages.
- Studio Portrait: Uses the preview image to establish facial features
- Studio Posture: Uses the portrait to maintain face consistency in the full-body shot
- Character Sheet: Combines portrait + posture with a layout template
- Emotions Sheet: Uses portrait with an emotions grid template
- Amateur Photos: Use the character/emotions sheets as reference for varied contexts
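One way to picture reference chaining is as a small dependency graph that is walked in order. The sketch below is not the project's pipeline code: `STAGE_REFERENCES`, `generate_stage`, and `run_pipeline` are hypothetical names, the generator is a placeholder that returns a string instead of an image, and the layout and grid templates are omitted. The edges follow the list above.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each stage maps to the earlier stages whose outputs it receives as references.
STAGE_REFERENCES = {
    "preview": ["character_json"],
    "studio_portrait": ["preview"],
    "studio_posture": ["studio_portrait"],
    "character_sheet": ["studio_portrait", "studio_posture"],
    "emotions_sheet": ["studio_portrait"],
    "amateur_photos": ["character_sheet", "emotions_sheet"],
}


def generate_stage(stage, references):
    """Placeholder for a real image-generation call (hypothetical)."""
    return f"<{stage} from {', '.join(sorted(references))}>"


def run_pipeline(character_json):
    """Run every stage after the stages it depends on and collect the outputs."""
    outputs = {"character_json": character_json}
    for stage in TopologicalSorter(STAGE_REFERENCES).static_order():
        if stage in outputs:  # the character JSON is an input, not a stage
            continue
        references = {name: outputs[name] for name in STAGE_REFERENCES[stage]}
        outputs[stage] = generate_stage(stage, references)
    return outputs


if __name__ == "__main__":
    for name, value in run_pipeline({"person_id": "P050"}).items():
        print(name, "->", value)
```

Keeping the references in one table makes it easy to check that no stage is generated before the images it depends on exist.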
4. Generation Model
All images in the curated dataset are generated using Gemini 3 Pro image generation with carefully crafted prompts for each asset type. The prompts emphasize analytical photography, visible imperfections, and consistent styling.
Installation
pip install openpeople
Quick Start
import openpeople
# List all curated people
people = openpeople.curated.list()
print(f"Found {len(people)} synthetic people")
# Get a specific person
person = openpeople.curated.get("P001")
# Get a random person (non-deterministic)
random_person = openpeople.curated.random()
# Get a random person (deterministic with seed)
seeded_person = openpeople.curated.random(seed=42)
# Sample multiple people without replacement
sample = openpeople.curated.sample(n=3, seed=1234)
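The difference between seeded and unseeded calls can be illustrated with the standard library alone. This sketch does not show how openpeople implements random() or sample(); it assumes only that a fixed seed fixes the selection. The helper name `sample_ids` and the ID list are made up for the example.

```python
import random

# Hypothetical IDs P001..P010; the real dataset may contain different ones.
PERSON_IDS = [f"P{n:03d}" for n in range(1, 11)]


def sample_ids(ids, n, seed=None):
    """Pick n distinct IDs; the same seed always yields the same selection."""
    rng = random.Random(seed)  # seed=None lets Python seed from the system
    return rng.sample(ids, k=n)  # sampling without replacement


if __name__ == "__main__":
    print(sample_ids(PERSON_IDS, 3, seed=1234))
    print(sample_ids(PERSON_IDS, 3, seed=1234))  # same selection as the line above
    print(sample_ids(PERSON_IDS, 3))  # may differ between runs
```

Pinning a seed in test fixtures keeps the chosen subjects stable between runs, which matches the "consistent subjects" goal stated earlier.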
Working with Assets
Each person comes with multiple image assets:
| Asset Key | Description |
|---|---|
| character_sheet | Full-body multi-angle reference sheet |
| emotions_sheet | Facial expressions grid |
| studio_selfie | Portrait in studio setting |
| studio_posture | Full body in studio setting |
| amateur_selfie | Portrait in casual setting |
| amateur_posture | Full body in casual setting |
person = openpeople.curated.get("P001")
# Get path to an asset
path = person.asset_path("studio_selfie")
print(path)
# Load image directly (requires openpeople[images])
image = person.load_image("studio_selfie")
image.show()
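When a test needs every asset for one person, the keys from the table can be looped over. The helper below is a sketch rather than part of the package: `collect_asset_paths` and `FakePerson` are hypothetical, and the only call assumed from openpeople is `asset_path(key)` as shown above. The fake object exists so the example runs without the package installed; its paths and file extension are invented.

```python
ASSET_KEYS = [
    "character_sheet",
    "emotions_sheet",
    "studio_selfie",
    "studio_posture",
    "amateur_selfie",
    "amateur_posture",
]


def collect_asset_paths(person, keys=ASSET_KEYS):
    """Map each asset key to the path returned by person.asset_path(key)."""
    # str() keeps the result uniform whether asset_path returns str or Path.
    return {key: str(person.asset_path(key)) for key in keys}


class FakePerson:
    """Stand-in for a curated person, used only to make this sketch runnable."""

    def asset_path(self, key):
        return f"/tmp/P001/{key}.png"  # made-up location and extension


if __name__ == "__main__":
    for key, path in collect_asset_paths(FakePerson()).items():
        print(f"{key}: {path}")
```

With the package installed, the object returned by openpeople.curated.get("P001") could be passed in place of `FakePerson()`.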
⚠️ Disclaimer
Please verify that the dataset is suitable for your use case before using it.
These synthetic people are provided for testing and development purposes only. OpenPeople contains:
- No names or personal identifiers
- No real individuals
- No data suitable for identification purposes
Users are responsible for ensuring their use complies with applicable laws and ethical guidelines.
License
MIT License - see LICENSE for details.
Contributing
Contributions are welcome! Please ensure:
- All synthetic people remain entirely fictional
- No names or real-person references are included
- All prompts used are documented
- Tests pass before submitting PRs
Built with 💜 by Prompt Haus