A lightweight Python library for generating realistic temporary datasets

TempDataset

A lightweight Python library for generating realistic temporary datasets for testing and development. The core functionality has no third-party dependencies and works with just the Python standard library.

Features

Lightweight: Zero dependencies for core functionality
Multiple Formats: Generate CSV, JSON, or in-memory datasets
Realistic Data: Built-in datasets with realistic patterns
Extensible: Easy to add custom dataset types
Memory Efficient: Optimized for large dataset generation
Python 3.7+: Compatible with modern Python versions

Quick Start

Installation

pip install tempdataset

To add optional Faker support, install the faker extra (quoted so that shells such as zsh do not expand the brackets):

pip install "tempdataset[faker]"

Basic Usage

import tempdataset

# Generate 1000 rows of sales data
data = tempdataset.create_dataset('sales', 1000)
print(data.head())

# Save directly to CSV
tempdataset.create_dataset('sales.csv', 500)

# Save directly to JSON
tempdataset.create_dataset('sales.json', 500)

# Read data back
csv_data = tempdataset.read_csv('sales.csv')
json_data = tempdataset.read_json('sales.json')
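
One practical difference between the two file formats is worth seeing in code. The sketch below uses only the standard library csv and json modules, not TempDataset itself; the helper names and the sample rows are hypothetical. It shows that a CSV round trip returns every value as a string, while JSON keeps numbers as numbers. Whether TempDataset converts types when reading CSV is not stated here, so treat this as background rather than a description of its behavior.

```python
import csv
import json
import tempfile
from pathlib import Path

# Hypothetical rows standing in for generated data; column names are illustrative.
rows = [
    {"order_id": 1, "customer_name": "Ada", "amount": 120.5},
    {"order_id": 2, "customer_name": "Grace", "amount": 75.0},
]


def rows_to_csv(path, rows):
    """Write a list of dicts to CSV, using the first row's keys as the header."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)


def rows_from_csv(path):
    """Read a CSV file into a list of dicts; every value comes back as a string."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def rows_to_json(path, rows):
    """Write a list of dicts as a JSON array."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(rows, f, indent=2)


def rows_from_json(path):
    """Read a JSON array of objects back into a list of dicts."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "sales.csv"
    json_path = Path(tmp) / "sales.json"
    rows_to_csv(csv_path, rows)
    rows_to_json(json_path, rows)
    csv_rows = rows_from_csv(csv_path)
    json_rows = rows_from_json(json_path)

print(type(csv_rows[0]["amount"]).__name__)   # str: CSV carries no type information
print(type(json_rows[0]["amount"]).__name__)  # float: JSON preserves numbers
```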

Available Datasets

Sales Dataset

Generates realistic sales transaction data with:

Transaction IDs
Customer information
Product details
Sales amounts and quantities
Timestamps
Geographic data

import tempdataset

# Generate sales data
sales_data = tempdataset.create_dataset('sales', 1000)

# Access data
print(f"Generated {len(sales_data)} rows")
print(f"Columns: {sales_data.columns}")
print(f"Memory usage: {sales_data.memory_usage()}")
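
To make the field list above concrete, here is a minimal standard-library sketch of how sales-like rows could be produced. It is not TempDataset's generator: the function name, the column names, the product catalogue, and the value ranges are all assumptions chosen for illustration.

```python
import random
from datetime import datetime, timedelta

# Hypothetical lookup tables; none of these values come from TempDataset.
PRODUCTS = {"Laptop": 899.00, "Headphones": 59.90, "Desk Lamp": 24.50}
REGIONS = ["North", "South", "East", "West"]
CUSTOMERS = ["Ada Lovelace", "Grace Hopper", "Alan Turing", "Edsger Dijkstra"]


def make_sales_rows(n, seed=0):
    """Return n sales-like rows as dicts, reproducible for a given seed."""
    rng = random.Random(seed)  # private RNG, so global random state is untouched
    start = datetime(2025, 1, 1)
    rows = []
    for i in range(1, n + 1):
        product = rng.choice(list(PRODUCTS))
        quantity = rng.randint(1, 5)
        # Pick a timestamp somewhere within the year 2025 (525,600 minutes).
        offset = timedelta(minutes=rng.randint(0, 525_599))
        rows.append({
            "transaction_id": f"TXN-{i:06d}",
            "customer_name": rng.choice(CUSTOMERS),
            "product": product,
            "quantity": quantity,
            "amount": round(PRODUCTS[product] * quantity, 2),
            "date": (start + offset).isoformat(),
            "region": rng.choice(REGIONS),
        })
    return rows


sample = make_sales_rows(3)
print(sample[0]["transaction_id"])  # TXN-000001
```

Seeding a private random.Random instance keeps test fixtures reproducible, which is usually what matters most for generated test data.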

Advanced Usage

Working with TempDataFrame

import tempdataset

data = tempdataset.create_dataset('sales', 1000)

# Basic operations
print(data.head(10))          # First 10 rows
print(data.tail(5))           # Last 5 rows
print(data.describe())        # Statistical summary
print(data.info())            # Data info

# Filtering and selection
filtered = data.filter(lambda row: row['amount'] > 100)
selected = data.select(['customer_name', 'amount', 'date'])

# Export options
data.to_csv('output.csv')
data.to_json('output.json')
data.to_dict()                # Convert to dictionary
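
The filter and select calls above follow a common row-oriented pattern: filter takes a predicate over one row, select takes a list of column names. The sketch below shows one way such methods could work over a list of dicts. MiniFrame is a hypothetical stand-in written for this example, not the library's TempDataFrame, and the column-oriented shape returned by to_dict is an assumption.

```python
class MiniFrame:
    """Hypothetical row-oriented table; a stand-in, not TempDataFrame."""

    def __init__(self, rows):
        # Copy each row so later changes to the input do not leak in.
        self.rows = [dict(row) for row in rows]

    def __len__(self):
        return len(self.rows)

    @property
    def columns(self):
        """Column names, taken from the first row."""
        return list(self.rows[0]) if self.rows else []

    def head(self, n=5):
        return MiniFrame(self.rows[:n])

    def tail(self, n=5):
        # rows[-0:] would return everything, so handle n == 0 explicitly.
        return MiniFrame(self.rows[-n:] if n > 0 else [])

    def filter(self, func):
        """Keep only the rows for which func(row) is truthy."""
        return MiniFrame(row for row in self.rows if func(row))

    def select(self, columns):
        """Keep only the named columns, in the order given."""
        return MiniFrame({c: row[c] for c in columns} for row in self.rows)

    def to_dict(self):
        """Column-oriented dict: {column: [values, ...]}."""
        return {c: [row[c] for row in self.rows] for c in self.columns}


frame = MiniFrame([
    {"customer_name": "Ada", "amount": 120.5, "date": "2025-01-03"},
    {"customer_name": "Grace", "amount": 75.0, "date": "2025-01-04"},
    {"customer_name": "Alan", "amount": 310.0, "date": "2025-01-05"},
])

filtered = frame.filter(lambda row: row["amount"] > 100)
selected = filtered.select(["customer_name", "amount"])
print(selected.to_dict())
# {'customer_name': ['Ada', 'Alan'], 'amount': [120.5, 310.0]}
```

Each method returns a new MiniFrame, so calls can be chained without mutating the original data.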

Performance Monitoring

import tempdataset

# Generate data
data = tempdataset.create_dataset('sales', 10000)

# Check performance stats
stats = tempdataset.get_performance_stats()
print(f"Generation time: {stats['generation_time']:.2f}s")
print(f"Memory usage: {stats['memory_usage']:.2f}MB")

# Reset stats for next operation
tempdataset.reset_performance_stats()
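
If you want to cross-check numbers like these independently, the standard library can measure elapsed time and peak allocation around any call. The sketch below is an assumption-laden illustration: the measure helper and build_rows are hypothetical, and the result keys merely mirror the names used above. How TempDataset computes its own stats is not documented here.

```python
import time
import tracemalloc


def measure(func, *args, **kwargs):
    """Run func and return (result, stats) with elapsed seconds and peak MB."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()  # (current, peak) in bytes
    finally:
        tracemalloc.stop()
    stats = {
        "generation_time": elapsed,             # seconds
        "memory_usage": peak / (1024 * 1024),   # peak traced allocation in MB
    }
    return result, stats


def build_rows(n):
    """Hypothetical workload standing in for dataset generation."""
    return [{"id": i, "amount": i * 1.5} for i in range(n)]


rows, stats = measure(build_rows, 10_000)
print(f"Generation time: {stats['generation_time']:.2f}s")
print(f"Memory usage: {stats['memory_usage']:.2f}MB")
```

Note that tracemalloc adds overhead of its own, so timings taken while tracing are slower than an untraced run.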

Development

Setting up Development Environment

# Clone the repository
git clone https://github.com/dot-css/TempDataset.git
cd TempDataset

# Install development dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Run tests with coverage
pytest --cov=tempdataset

# Run performance benchmarks
pytest .benchmarks/

Running Tests

# Run all tests
pytest

# Run specific test categories
pytest -m "not slow"          # Skip slow tests
pytest -m integration         # Only integration tests
pytest -m performance         # Only performance tests

# Run with coverage report
pytest --cov=tempdataset --cov-report=html

Code Quality

# Format code
black tempdataset tests

# Lint code
flake8 tempdataset tests

# Type checking
mypy tempdataset

API Reference

Core Functions

`create_dataset(dataset_type, rows=500)`

Generate a temporary dataset in memory, or generate one and save it to a file.

Parameters:

dataset_type (str): Dataset type ('sales') or filename ('sales.csv', 'sales.json')
rows (int): Number of rows to generate (default: 500)

Returns:

TempDataFrame containing the generated data (also saves to file if filename provided)
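
Because dataset_type doubles as either a dataset name or a filename, something has to tell the two apart. The sketch below shows one plausible way to do that with pathlib. The parse_target helper, the SUPPORTED_FORMATS table, and the error for unknown extensions are assumptions made for this example, not the library's actual logic.

```python
from pathlib import Path

# Hypothetical mapping from file extension to output format.
SUPPORTED_FORMATS = {".csv": "csv", ".json": "json"}


def parse_target(dataset_type):
    """Split 'sales.csv' into ('sales', 'csv'); plain 'sales' gives ('sales', None)."""
    path = Path(dataset_type)
    suffix = path.suffix.lower()
    if not suffix:
        # No extension: treat the argument as a dataset name, keep data in memory.
        return dataset_type, None
    if suffix not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported file extension: {suffix!r}")
    # The stem of the filename names the dataset to generate.
    return path.stem, SUPPORTED_FORMATS[suffix]


print(parse_target("sales"))       # ('sales', None)
print(parse_target("sales.csv"))   # ('sales', 'csv')
print(parse_target("sales.json"))  # ('sales', 'json')
```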

`read_csv(filename)`

Read CSV file into TempDataFrame.

`read_json(filename)`

Read JSON file into TempDataFrame.

TempDataFrame Methods

head(n=5): Get first n rows
tail(n=5): Get last n rows
describe(): Statistical summary
info(): Dataset information
filter(func): Filter rows by function
select(columns): Select specific columns
to_csv(filename): Export to CSV
to_json(filename): Export to JSON
to_dict(): Convert to dictionary

Contributing

We welcome contributions! Please see our Contributing Guide for details.

Development Workflow

Fork the repository
Create a feature branch
Make your changes
Add tests for new functionality
Run the test suite
Submit a pull request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Changelog

See CHANGELOG.md for a detailed history of changes.

Support

Documentation: https://tempdataset.readthedocs.io/
Issue Tracker: https://github.com/dot-css/TempDataset/issues
Discussions: https://github.com/dot-css/TempDataset/discussions

Acknowledgments

Built with love for the Python testing community
Inspired by the need for lightweight, dependency-free test data generation
Thanks to all contributors who help make this project better!

Release history

0.2.0: Aug 12, 2025
0.1.2: Aug 10, 2025
0.1.1: Aug 9, 2025
0.1.0: Aug 8, 2025 (this version)

Download files

Source Distribution

tempdataset-0.1.0.tar.gz (35.3 kB), uploaded Aug 8, 2025

Built Distribution

tempdataset-0.1.0-py3-none-any.whl (63.3 kB), uploaded Aug 8, 2025, Python 3

File details

Details for the file tempdataset-0.1.0.tar.gz.

File metadata

Download URL: tempdataset-0.1.0.tar.gz
Upload date: Aug 8, 2025
Size: 35.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.4

File hashes

Hashes for tempdataset-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`dc01ed85a78283772843be0474fde7390e239f87d7eded97b7175160cad57328`
MD5	`b68cf1ae75040a87a4a2c8d1c72bf6d3`
BLAKE2b-256	`dc315103230c1bbf114763793fd506b10b3e4d4f59a641a63734399043fea4dd`
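
A downloaded file can be checked against the SHA256 digest above with the standard library hashlib module. The helper names below are hypothetical, and the local path in the usage note is an assumption about where you saved the file.

```python
import hashlib
import hmac


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare the file's digest with the published one, ignoring case and whitespace."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

For example, matches("tempdataset-0.1.0.tar.gz", "dc01ed85a78283772843be0474fde7390e239f87d7eded97b7175160cad57328") should return True for an intact copy of the source distribution saved in the current directory.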


File details

Details for the file tempdataset-0.1.0-py3-none-any.whl.

File metadata

Download URL: tempdataset-0.1.0-py3-none-any.whl
Upload date: Aug 8, 2025
Size: 63.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.4

File hashes

Hashes for tempdataset-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a490132f08960285728180117ab988ead3e12252df59c197f5adbe71e4ccef15`
MD5	`dda05d91b1b5702ebeaee573ae3bac99`
BLAKE2b-256	`8b8a0d6b08e28be483b06535da55d96cabbf187d7721f394ecf1aa72ab87cfbf`
