Fetch sample data like images, videos, code, GIFs, text, and JSON files.

📦 smpldta: Sample Data Fetcher

smpldta is a Python library to fetch and generate real sample data like images, videos, GIFs, code, JSON, text files, and PDFs, in any quantity you specify.

It helps developers, testers, and data scientists quickly build test environments, validate pipelines, or create dummy data for demos, machine learning, or automation.


🚀 Features

  • 📸 Download real images from the web with size/dimension constraints
  • 🎥 Fetch videos in multiple formats like mp4, mkv, flv, 3gp
  • 🎞️ Get animated GIFs from Giphy
  • 📄 Generate PDFs with size limits
  • 📝 Create structured JSON files with a schema
  • 💬 Generate random text files with word/size limits
  • 💻 Generate code files in Python, Java, JavaScript, C, etc.

📦 smpldta

smpldta is a Python-based utility that generates sample data files for testing and prototyping. It supports fetching images, videos, code snippets, text, JSON, and PDFs, organized in a clear directory structure.


๐Ÿ› ๏ธ Installation

pip install smpldta

๐Ÿ–ผ๏ธ fetch_images(config)

config = {
    "jpg": {
        "count": 3,
        "min_size": "5kb",
        "max_size": "500kb",
        "height": 400,
        "width": 400
    },
    "jpeg": {
        "count": 2,
        "min_size": "10kb",
        "max_size": "1mb",
        "height": 600,
        "width": 300
    }
}
fetcher.fetch_images(config, subdir="name_of_the_folder")
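
The size limits above are strings such as "5kb" and "1mb". As a rough sketch of how such a config could be sanity-checked before fetching, the helpers below convert size strings to bytes and confirm that each minimum does not exceed its maximum. `parse_size` and `check_image_config` are hypothetical names, not part of smpldta, and the 1 kb = 1024 bytes convention is an assumption; the library may interpret units differently.

```python
import re

# Hypothetical helpers, not part of smpldta.
# Assumption: binary multiples (1 kb = 1024 bytes).
_UNITS = {"b": 1, "kb": 1024, "mb": 1024 ** 2, "gb": 1024 ** 3}


def parse_size(text: str) -> int:
    """Convert a size string such as "5kb" or "1mb" into a byte count."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*(kb|mb|gb|b)\s*", text.lower())
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    value, unit = match.groups()
    return int(float(value) * _UNITS[unit])


def check_image_config(config: dict) -> None:
    """Raise ValueError if any format entry has inconsistent limits."""
    for fmt, opts in config.items():
        if opts["count"] < 1:
            raise ValueError(f"{fmt}: count must be at least 1")
        if parse_size(opts["min_size"]) > parse_size(opts["max_size"]):
            raise ValueError(f"{fmt}: min_size is larger than max_size")
```

Calling `check_image_config(config)` on the config shown above returns without raising, since every `min_size` is below its `max_size`.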

🎥 fetch_videos(config)

config = {
    "mp4": {
        "count": 3,
        "max_size": "20mb"
    },
    "3gp": {
        "count": 2,
        "max_size": "10mb"
    }
}
fetcher.fetch_videos(config)

๐ŸŽž๏ธ fetch_gifs(count)

fetcher.fetch_gifs(count=5)

💻 fetch_code(config)

config = {
    "python": 2,
    "java": 2,
    "c": 1,
    "cpp": 1,
    "javascript": 1,
    "typescript": 1
}
fetcher.fetch_code(config)

๐Ÿ“ fetch_text(config)

config = {
    "count": 4,
    "min_words": 100,
    "max_words": 1000,
    "max_size": "200kb"
}
fetcher.fetch_text(config)
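
To make the interaction between the word bounds and the size cap concrete, here is a minimal standard-library sketch. It is not smpldta's implementation: `make_text` and `VOCABULARY` are hypothetical, and letting the byte limit win over `min_words` when the two conflict is an assumption.

```python
import random

# Small illustrative vocabulary; any word source would do.
VOCABULARY = ["data", "sample", "file", "test", "pipeline", "random", "text", "value"]


def make_text(min_words, max_words, max_bytes, seed=None):
    """Return random text with a word count in [min_words, max_words],
    trimmed so its UTF-8 encoding never exceeds max_bytes."""
    rng = random.Random(seed)
    word_count = rng.randint(min_words, max_words)
    words = [rng.choice(VOCABULARY) for _ in range(word_count)]
    # Drop trailing words until the encoded text fits the byte limit.
    # Assumption: the size cap takes priority over min_words.
    while words and len(" ".join(words).encode("utf-8")) > max_bytes:
        words.pop()
    return " ".join(words)
```

With the limits from the config above (100 to 1000 words, 200 KB), the size cap is never reached for this vocabulary, so only the word bounds matter.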

🧾 fetch_json(config)

config = {
    "schema": {
        "id": "uuid",
        "name": "str",
        "email": "email",
        "age": "int",
        "joined": "date",
        "score": "float"
    },
    "min_data_per_file": 5,
    "max_data_per_file": 15,
    "count": 5
}
fetcher.fetch_json(config)
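
To illustrate what a schema like this describes, the sketch below builds one file's worth of records using only the standard library. It is an illustration under assumptions, not smpldta's generator: `GENERATORS` and `make_records` are hypothetical names, and the value ranges (ages 0 to 100, dates from 2020 onward, `example.com` addresses) are invented for the example.

```python
import random
import uuid
from datetime import date, timedelta

# Hypothetical generators keyed by the type names used in the schema above.
GENERATORS = {
    "uuid": lambda rng: str(uuid.UUID(int=rng.getrandbits(128), version=4)),
    "str": lambda rng: "".join(rng.choices("abcdefghijklmnopqrstuvwxyz", k=8)),
    "email": lambda rng: f"user{rng.randrange(10_000)}@example.com",
    "int": lambda rng: rng.randint(0, 100),
    "date": lambda rng: (date(2020, 1, 1) + timedelta(days=rng.randrange(1500))).isoformat(),
    "float": lambda rng: round(rng.uniform(0, 100), 2),
}


def make_records(config, seed=None):
    """Build a list of records whose length lies between
    min_data_per_file and max_data_per_file (inclusive)."""
    rng = random.Random(seed)
    size = rng.randint(config["min_data_per_file"], config["max_data_per_file"])
    return [
        {field: GENERATORS[kind](rng) for field, kind in config["schema"].items()}
        for _ in range(size)
    ]
```

Each call corresponds to the contents of one JSON file; `"count": 5` in the config would mean five such files. The result can be serialized with `json.dumps(records, indent=2)`.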

📄 fetch_pdfs(config)

config = {
    "count": 3,
    "min_size": "100kb",
    "max_size": "500kb"
}
fetcher.fetch_pdfs(config)

📂 Output Structure

output/
├── images/
├── videos/
├── gifs/
├── code/
├── text/
├── json/
└── pdfs/

Each data type is saved in its own subfolder with unique filenames.
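
After a run, one quick way to confirm that each subfolder received the expected number of files is to count them. The sketch below uses only `pathlib`; `count_outputs` is a hypothetical helper, and the default root name `output` is taken from the tree above.

```python
from pathlib import Path


def count_outputs(root="output"):
    """Map each immediate subfolder of root to the number of files in it."""
    root = Path(root)
    return {
        folder.name: sum(1 for item in folder.iterdir() if item.is_file())
        for folder in sorted(root.iterdir())
        if folder.is_dir()
    }
```

For example, after fetching three JPG and two JPEG images, `count_outputs()["images"]` would be expected to equal 5, assuming every download succeeded.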

To change the name of a subfolder, pass the subdir argument:

subdir="name_of_the_folder"

Example:
fetcher.fetch_images(config, subdir="name_of_the_folder")

💡 Why Use smpldta?

  • Eliminate the need to manually source or generate test data
  • Supports a variety of formats and customization options
  • Ideal for pipelines, automated tests, and demos

📄 License

MIT License

Author: Parteek


Download files

Source Distribution

smpldta-0.1.2.tar.gz (11.3 kB view details)

Uploaded Source

Built Distribution

smpldta-0.1.2-py3-none-any.whl (13.2 kB view details)

Uploaded Python 3

File details

Details for the file smpldta-0.1.2.tar.gz.

File metadata

  • Download URL: smpldta-0.1.2.tar.gz
  • Upload date:
  • Size: 11.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.10.12

File hashes

Hashes for smpldta-0.1.2.tar.gz
Algorithm Hash digest
SHA256 1e97137d42ef5ba0f2b14dd324e94537eda0d88e35d700c75ed0663053f9381b
MD5 7c4d18d2032f880396db8eb871ab5f2e
BLAKE2b-256 48f4d0dd224828e788f2cfee7b96f38659ee315c3cc3406e31498bc0b079c42f
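
A downloaded archive can be checked against the SHA256 digest listed above with Python's `hashlib`. This is a generic sketch; `sha256_of` is a hypothetical helper name, and the file is read in chunks so large downloads do not need to fit in memory.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

If `smpldta-0.1.2.tar.gz` is in the current directory, `sha256_of("smpldta-0.1.2.tar.gz")` should equal the SHA256 value in the table; a mismatch suggests a corrupted or altered download.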

File details

Details for the file smpldta-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: smpldta-0.1.2-py3-none-any.whl
  • Upload date:
  • Size: 13.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.10.12

File hashes

Hashes for smpldta-0.1.2-py3-none-any.whl
Algorithm Hash digest
SHA256 36bbf0438083a2c1c9133001bd6469de738aaf364946828a38420ef59779fba9
MD5 57e0e635451a6f3c93a2e0827afb3525
BLAKE2b-256 0698ca241111f3d5456fda9f64e05fb0519784428f4f6b34b245d50203fcbd45
