OpenPeople

Dataset of synthetic people for testing generative image/video pipelines at scale.

License: MIT | Python: 3.9+

What is OpenPeople?

OpenPeople provides a curated set of synthetic "people" designed for testing generative image and video pipelines. All people in this library are entirely synthetic. They do not represent real individuals (any similarity is purely coincidental).

Why use synthetic people?

  • Safe testing: Test your pipelines without using real people's images
  • Consistent subjects: Stable, reproducible characters across runs
  • Prompt provenance: Every image includes the prompts used to generate it
  • Diversity by design: The curated dataset spans a range of ages, sexes, ethnicities, and skin tones

How was this dataset created?

Each synthetic person is generated through a multi-stage pipeline that ensures visual consistency across all assets.

1. Character Definition

Each person starts with a structured metadata.json that defines their characteristics. The values are randomly generated to ensure diversity.

Demographics are randomized across:

  • Age ranges: 18-25, 25-35, 35-45, 45-55, 55-65, 65+
  • Sex: male, female
  • Ethnicity: Mixed heritage with weighted percentages (e.g., "14% Western Asian, 50% European, 36% African")
  • Skin tone: Fitzpatrick scale I–VI

Visible traits include randomized:

  • Hair: texture, length, color, style (e.g., "Very long, thick, naturally wavy curls")
  • Body type: slim, athletic, average, curvy, large
  • Height: short, average, tall

Additional traits add unique details such as jawline, eyebrows, hairline, and other distinguishing features. For example:

{
  "person_id": "P050",
  "character_details": {
    "demographics": {
      "age_range": "18-25",
      "sex": "male",
      "ethnicity": "14% Western Asian, 50% European, 36% African",
      "skin_tone": "Fitzpatrick III"
    },
    "visible_traits": {
      "hair": { "description": "Very long, thick, naturally wavy curls cascading over the shoulders." },
      "body_type": "large",
      "height": "average"
    },
    "additional_traits": {
      "hairline": "Center-parted hair with a neat hairline",
      "hands": "Long, slender fingers resting gently on the chin."
    }
  }
}
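The randomized character definition described above can be sketched in a few lines of Python. This is only an illustration, not the project's actual generator: the function names `random_ethnicity` and `random_character` are hypothetical, the region list contains only the three labels that appear in the example, and free-text traits such as the hair description are left out.

```python
import random

# Option lists copied from the bullets above.
AGE_RANGES = ["18-25", "25-35", "35-45", "45-55", "55-65", "65+"]
SEXES = ["male", "female"]
SKIN_TONES = ["Fitzpatrick " + n for n in ("I", "II", "III", "IV", "V", "VI")]
BODY_TYPES = ["slim", "athletic", "average", "curvy", "large"]
HEIGHTS = ["short", "average", "tall"]
# Assumption: only the three labels from the example are listed; extend as needed.
REGIONS = ["Western Asian", "European", "African"]


def random_ethnicity(rng, k=3):
    """Return k region labels with whole-number shares that sum to 100."""
    regions = rng.sample(REGIONS, k)
    # k-1 distinct cut points in 1..99 split 100 into k shares of at least 1.
    cuts = sorted(rng.sample(range(1, 100), k - 1))
    shares = [hi - lo for lo, hi in zip([0] + cuts, cuts + [100])]
    return ", ".join(f"{share}% {region}" for share, region in zip(shares, regions))


def random_character(person_id, seed):
    """Build a metadata-shaped dict; the same seed gives the same result."""
    rng = random.Random(seed)
    return {
        "person_id": person_id,
        "character_details": {
            "demographics": {
                "age_range": rng.choice(AGE_RANGES),
                "sex": rng.choice(SEXES),
                "ethnicity": random_ethnicity(rng),
                "skin_tone": rng.choice(SKIN_TONES),
            },
            "visible_traits": {
                "body_type": rng.choice(BODY_TYPES),
                "height": rng.choice(HEIGHTS),
            },
        },
    }


character = random_character("P050", seed=42)
print(character["character_details"]["demographics"])
```

Seeding a private `random.Random` instance keeps each character reproducible without touching the global random state.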

2. Multi-Stage Image Generation

Images are generated in a specific order, where each stage uses previous outputs as reference to maintain identity consistency:

┌─────────────────┐
│ Character JSON  │
└────────┬────────┘
         ▼
┌─────────────────┐
│    Preview      │  Temporary full-body preview
└────────┬────────┘
         ▼
┌─────────────────┐
│ Studio Portrait │  Professional headshot
└────────┬────────┘
         ▼
┌─────────────────┐
│ Studio Posture  │──────────────┐  Full-body studio shot
└────────┬────────┘              │
         ▼                       ▼
┌─────────────────┐     ┌─────────────────┐
│ Character Sheet │     │ Emotions Sheet  │  Reference sheets
└────────┬────────┘     └────────┬────────┘
         │                       │
         └───────────┬───────────┘
                     ▼
         ┌─────────────────────┐
         │   Amateur Photos    │  Casual photos
         └─────────────────────┘

3. Reference Chaining

The key to consistency is reference chaining: each generation stage receives images from previous stages.

  • Studio Portrait: Uses the preview image to establish facial features
  • Studio Posture: Uses the portrait to maintain face consistency in full-body shot
  • Character Sheet: Combines portrait + posture with a layout template
  • Emotions Sheet: Uses portrait with an emotions grid template
  • Amateur Photos: Uses the character and emotions sheets as references for varied contexts
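One way to picture reference chaining is as a small dependency graph in which each stage lists the earlier stages whose images it receives. The sketch below encodes the list above and derives a valid generation order with the standard library's `graphlib`. The stage keys, `run_pipeline`, and the `generate` callback are hypothetical names for illustration; the project's real pipeline code is not shown here.

```python
from graphlib import TopologicalSorter

# Each stage maps to the earlier stages it uses as references (from the list above).
REFERENCES = {
    "preview": set(),
    "studio_portrait": {"preview"},
    "studio_posture": {"studio_portrait"},
    "character_sheet": {"studio_portrait", "studio_posture"},
    "emotions_sheet": {"studio_portrait"},
    "amateur_photos": {"character_sheet", "emotions_sheet"},
}


def generation_order(references):
    """Return an order in which every stage comes after its references."""
    return list(TopologicalSorter(references).static_order())


def run_pipeline(references, generate):
    """Call generate(stage, refs) for each stage, passing earlier outputs."""
    outputs = {}
    for stage in generation_order(references):
        refs = {name: outputs[name] for name in sorted(references[stage])}
        outputs[stage] = generate(stage, refs)
    return outputs


def fake_generate(stage, refs):
    # Stand-in for a real image model call: records which references were used.
    return f"{stage}(refs={sorted(refs)})"


outputs = run_pipeline(REFERENCES, fake_generate)
print(outputs["character_sheet"])
# character_sheet(refs=['studio_portrait', 'studio_posture'])
```

Keeping the graph as data makes it easy to check that no stage asks for a reference that has not been generated yet.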

4. Generation Model

All images in the curated dataset are generated using Gemini 3 Pro image generation with carefully crafted prompts for each asset type. The prompts emphasize analytical photography, visible imperfections, and consistent styling.
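As a rough illustration of how a per-asset prompt could be assembled from the character metadata, here is a minimal sketch. The project's actual prompts are not reproduced in this README, so everything here is an assumption: `build_prompt`, `ASSET_FRAMING`, and the prompt wording are hypothetical, and no image model is called.

```python
# Style emphasis quoted from the paragraph above.
STYLE_NOTES = "analytical photography, visible imperfections, consistent styling"

# Hypothetical framing text per asset type, borrowed from this README's asset descriptions.
ASSET_FRAMING = {
    "studio_selfie": "Portrait in studio setting",
    "studio_posture": "Full body in studio setting",
}


def build_prompt(metadata, asset_key):
    """Join the asset framing, character traits, and style notes into one prompt."""
    details = metadata["character_details"]
    demo = details["demographics"]
    traits = details["visible_traits"]
    parts = [
        ASSET_FRAMING[asset_key],
        f"{demo['sex']}, age {demo['age_range']}",
        f"ethnicity: {demo['ethnicity']}",
        f"skin tone: {demo['skin_tone']}",
        f"hair: {traits['hair']['description']}",
        f"body type: {traits['body_type']}, height: {traits['height']}",
        f"style: {STYLE_NOTES}",
    ]
    # Strip trailing periods so the joined sentences do not end in "..".
    return ". ".join(part.rstrip(".") for part in parts) + "."


# Trimmed copy of the P050 example from this README.
example = {
    "person_id": "P050",
    "character_details": {
        "demographics": {
            "age_range": "18-25",
            "sex": "male",
            "ethnicity": "14% Western Asian, 50% European, 36% African",
            "skin_tone": "Fitzpatrick III",
        },
        "visible_traits": {
            "hair": {"description": "Very long, thick, naturally wavy curls cascading over the shoulders."},
            "body_type": "large",
            "height": "average",
        },
    },
}

print(build_prompt(example, "studio_selfie"))
```

Building prompts from the same metadata for every asset type is one way to keep the description of a person identical across stages.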

Installation

pip install openpeople

Quick Start

import openpeople

# List all curated people
people = openpeople.curated.list()
print(f"Found {len(people)} synthetic people")

# Get a specific person
person = openpeople.curated.get("P001")

# Get a random person (non-deterministic)
random_person = openpeople.curated.random()

# Get a random person (deterministic with seed)
seeded_person = openpeople.curated.random(seed=42)

# Sample multiple people without replacement
sample = openpeople.curated.sample(n=3, seed=1234)
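To make "deterministic with seed" and "without replacement" concrete, the following stand-in reproduces both properties using only the standard library. It is not how openpeople implements `random()` or `sample()`; `seeded_sample` and the ID list are hypothetical and exist only to show the behavior you can expect from a seeded call.

```python
import random


def seeded_sample(ids, n, seed=None):
    """Pick n distinct IDs; a fixed seed makes the choice repeatable."""
    rng = random.Random(seed)  # seed=None draws from system entropy or the clock
    return rng.sample(list(ids), n)


ids = [f"P{i:03d}" for i in range(1, 11)]  # hypothetical IDs P001..P010

first = seeded_sample(ids, n=3, seed=1234)
second = seeded_sample(ids, n=3, seed=1234)

print(first == second)       # True: same seed, same sample
print(len(set(first)) == 3)  # True: no ID is repeated
```

In a test suite, passing a fixed seed is what lets a failing case be rerun against exactly the same people.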

Working with Assets

Each person comes with multiple image assets:

Asset Key        Description
character_sheet  Full-body multi-angle reference sheet
emotions_sheet   Facial expressions grid
studio_selfie    Portrait in studio setting
studio_posture   Full body in studio setting
amateur_selfie   Portrait in casual setting
amateur_posture  Full body in casual setting

person = openpeople.curated.get("P001")

# Get path to an asset
path = person.asset_path("studio_selfie")
print(path)

# Load image directly (requires the images extra: pip install "openpeople[images]")
image = person.load_image("studio_selfie")
image.show()
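When a pipeline needs every asset for a person, a small helper can gather the paths in one pass. This sketch assumes only what the snippet above shows, namely that a person object has an `asset_path(key)` method accepting the documented asset keys; `collect_asset_paths` is a hypothetical helper, not part of the library.

```python
# Asset keys as documented in this README.
ASSET_KEYS = (
    "character_sheet",
    "emotions_sheet",
    "studio_selfie",
    "studio_posture",
    "amateur_selfie",
    "amateur_posture",
)


def collect_asset_paths(person, keys=ASSET_KEYS):
    """Map each asset key to whatever person.asset_path(key) returns."""
    return {key: person.asset_path(key) for key in keys}


# Usage with the library installed (left commented so the sketch runs without it):
# import openpeople
# paths = collect_asset_paths(openpeople.curated.get("P001"))
# for key, path in paths.items():
#     print(key, path)
```

The helper only relies on the `asset_path` method, so it also works with a stub object in unit tests.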

⚠️ Disclaimer

Verify that the dataset is suitable for your use case before using it.

These synthetic people are provided for testing and development purposes only. OpenPeople contains:

  • No names or personal identifiers
  • No real individuals
  • No data suitable for identification purposes

Users are responsible for ensuring their use complies with applicable laws and ethical guidelines.

License

MIT License - see LICENSE for details.

Contributing

Contributions are welcome! Please ensure that:

  1. All synthetic people remain entirely fictional
  2. No names or real-person references are included
  3. All prompts used are documented
  4. Tests pass before you submit a PR

Built with 💜 by Prompt Haus

Download files

Source Distribution

openpeople-0.1.2.tar.gz (34.2 MB)

Built Distribution

openpeople-0.1.2-py3-none-any.whl (34.2 MB, Python 3)
