Fetch sample data like images, videos, code, GIFs, text, and JSON files.
smpldta: Sample Data Fetcher
smpldta is a Python library that fetches and generates real sample data such as images, videos, GIFs, code, JSON, text files, and PDFs, in any quantity you specify.
It helps developers, testers, and data scientists quickly build test environments, validate pipelines, or create dummy data for demos, machine learning, or automation.
Features
- Download real images from the web with size/dimension constraints
- Fetch videos in multiple formats such as mp4, mkv, flv, and 3gp
- Get animated GIFs from Giphy
- Generate PDFs with size limits
- Create structured JSON files with a schema
- Generate random text files with word/size limits
- Generate code files in Python, Java, JavaScript, C, etc.
smpldta
smpldta is a Python-based utility that generates sample data files for testing and prototyping. It supports fetching images, videos, code snippets, text, JSON, and PDFs, organized in a clear directory structure.
Installation
pip install smpldta
fetch_images(config)
# One entry per image format: file count, size bounds, and dimensions.
config = {
    "jpg": {
        "count": 3,
        "min_size": "5kb",
        "max_size": "500kb",
        "height": 400,
        "width": 400
    },
    "jpeg": {
        "count": 2,
        "min_size": "10kb",
        "max_size": "1mb",
        "height": 600,
        "width": 300
    }
}
# subdir names the folder the images are saved in.
fetcher.fetch_images(config, subdir="name_of_the_folder")
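The size limits above are written as strings such as "5kb" and "1mb". How smpldta interprets them is not stated here, so the following is only a minimal sketch of one way to sanity-check a config before passing it on. The helpers `parse_size` and `check_size_bounds` are hypothetical and not part of smpldta, and the sketch assumes binary units (1 kb = 1024 bytes).

```python
import re

# Hypothetical helpers, not part of smpldta. Assumes 1 kb = 1024 bytes.
_UNITS = {"b": 1, "kb": 1024, "mb": 1024 ** 2, "gb": 1024 ** 3}


def parse_size(text: str) -> int:
    """Convert a size string such as '500kb' into a number of bytes."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([a-zA-Z]+)\s*", text)
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = match.groups()
    unit = unit.lower()
    if unit not in _UNITS:
        raise ValueError(f"unknown unit in size: {text!r}")
    return int(float(number) * _UNITS[unit])


def check_size_bounds(config: dict) -> None:
    """Raise ValueError if any format has min_size greater than max_size."""
    for fmt, opts in config.items():
        low = parse_size(opts.get("min_size", "0b"))
        high = parse_size(opts["max_size"])
        if low > high:
            raise ValueError(f"{fmt}: min_size exceeds max_size")
```

Calling `check_size_bounds(config)` on the image config above returns without error, while a config whose `min_size` is larger than its `max_size` raises `ValueError`.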
fetch_videos(config)
# One entry per video format: file count and maximum file size.
config = {
    "mp4": {
        "count": 3,
        "max_size": "20mb"
    },
    "3gp": {
        "count": 2,
        "max_size": "10mb"
    }
}
fetcher.fetch_videos(config)
fetch_gifs(count)
fetcher.fetch_gifs(count=5)
fetch_code(config)
# Number of code files per language.
config = {
    "python": 2,
    "java": 2,
    "c": 1,
    "cpp": 1,
    "javascript": 1,
    "typescript": 1
}
fetcher.fetch_code(config)
fetch_text(config)
# File count, word-count range, and maximum file size.
config = {
    "count": 4,
    "min_words": 100,
    "max_words": 1000,
    "max_size": "200kb"
}
fetcher.fetch_text(config)
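To make the interaction between the word limits and the size limit concrete, here is a rough sketch of how a text file body could be produced under such constraints. It is not smpldta's implementation: the function `make_text`, the word list, and the choice to trim trailing words when the byte budget is exceeded are all assumptions made for illustration.

```python
import random

# Illustrative vocabulary; smpldta's actual text source is not documented here.
WORDS = ["lorem", "ipsum", "dolor", "sit", "amet", "consectetur",
         "adipiscing", "elit", "sed", "do", "tempor", "magna"]


def make_text(min_words: int, max_words: int, max_bytes: int, rng=None) -> str:
    """Build random text with a word count in [min_words, max_words].

    If the UTF-8 encoding exceeds max_bytes, trailing words are dropped,
    so a very small byte budget can push the result below min_words.
    """
    rng = rng or random.Random()
    target = rng.randint(min_words, max_words)
    words = [rng.choice(WORDS) for _ in range(target)]
    text = " ".join(words)
    # Trim from the end until the encoded text fits the byte budget.
    while words and len(text.encode("utf-8")) > max_bytes:
        words.pop()
        text = " ".join(words)
    return text
```

With the config above, `make_text(100, 1000, 200 * 1024)` would stay well under the size limit, because 1000 short words occupy only a few kilobytes.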
fetch_json(config)
# schema maps each field name to a value type.
config = {
    "schema": {
        "id": "uuid",
        "name": "str",
        "email": "email",
        "age": "int",
        "joined": "date",
        "score": "float"
    },
    "min_data_per_file": 5,
    "max_data_per_file": 15,
    "count": 5
}
fetcher.fetch_json(config)
fetch_pdfs(config)
# File count and size bounds for the generated PDFs.
config = {
    "count": 3,
    "min_size": "100kb",
    "max_size": "500kb"
}
fetcher.fetch_pdfs(config)
Output Structure
output/
├── images/
├── videos/
├── gifs/
├── code/
├── text/
├── json/
└── pdfs/
Each data type is saved in its own subfolder with unique filenames.
To save files under a different folder name, pass the subdir argument:
subdir="name_of_the_folder"
Example:
fetcher.fetch_images(config, subdir="name_of_the_folder")
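After a run it can be useful to confirm that each subfolder received the expected number of files. The sketch below walks the output tree with `pathlib`; it assumes the default root folder is named `output` as in the layout above, and `summarize_output` is a hypothetical helper, not a smpldta function.

```python
from pathlib import Path


def summarize_output(root: str = "output") -> dict:
    """Return {subfolder name: (file count, total bytes)} for each subfolder of root."""
    summary = {}
    root_path = Path(root)
    if not root_path.is_dir():
        return summary  # nothing has been generated yet
    for sub in sorted(p for p in root_path.iterdir() if p.is_dir()):
        files = [f for f in sub.rglob("*") if f.is_file()]
        summary[sub.name] = (len(files), sum(f.stat().st_size for f in files))
    return summary
```

In a test, the counts returned by `summarize_output()` can be compared against the `count` values in the configs that were used.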
Why Use smpldta?
- Eliminates the need to manually source or generate test data
- Supports a variety of formats and customization options
- Is well suited to pipelines, automated tests, and demos
License
MIT License
Author: Parteek