Feature discovery and generation utilities
LLM_feature_gen
LLM Feature Gen is a Python library for discovering and generating interpretable features from unstructured data using Large Language Models (LLMs).
The library provides high-level utilities for:
- Discovering human-interpretable features from sets of images,
- Integrating prompts and model outputs into structured JSON representations,
- Generating new feature representations automatically from raw multimodal data, e.g., creating structured tables for downstream models.
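As an illustration of the last point, the sketch below flattens per-image feature values into a CSV table that a downstream model could consume. It is not the library's API: `features_to_csv`, the `records` layout, and the image names are hypothetical and use only the standard library.

```python
import csv
import io


def features_to_csv(records, feature_names):
    """Flatten per-item feature dicts into CSV text, one row per item.

    `records` maps an item id (e.g. a file name) to a dict of feature values.
    Features missing for an item are written as empty cells.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["item"] + list(feature_names), lineterminator="\n"
    )
    writer.writeheader()
    for item_id, values in records.items():
        row = {"item": item_id}
        for name in feature_names:
            row[name] = values.get(name, "")
        writer.writerow(row)
    return buffer.getvalue()


# Hypothetical per-image values, shaped after the README's example features.
records = {
    "img_001.jpg": {"has visible handle": "present", "color tone": "metallic"},
    "img_002.jpg": {"has visible handle": "absent"},
}
table = features_to_csv(records, ["has visible handle", "color tone"])
print(table)
```

Keeping the column order explicit through `feature_names` makes the table stable across runs, even when the model omits a feature for some items.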
Module: discover
The discover module focuses on feature discovery: identifying interpretable, discriminative visual or textual properties using an LLM.
✅ What it does
Given a folder of images and a prompt, the library:
- Converts each image into Base64 format,
- Sends them to an LLM,
- Receives a structured JSON response describing the discovered features,
- Automatically saves the output to a JSON file in outputs/.
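The first and last steps above can be sketched with the standard library alone. This is a minimal sketch under stated assumptions, not the library's implementation: `encode_image_to_base64`, `save_features`, and the file name `discovered_features.json` are hypothetical, and the LLM call itself is left out.

```python
import base64
import json
import tempfile
from pathlib import Path


def encode_image_to_base64(path):
    """Read a file's raw bytes and return them as a Base64 ASCII string."""
    return base64.b64encode(Path(path).read_bytes()).decode("ascii")


def save_features(features, output_dir="outputs", filename="discovered_features.json"):
    """Write a feature dict as indented JSON, creating the folder if needed."""
    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / filename
    out_path.write_text(json.dumps(features, indent=2), encoding="utf-8")
    return out_path


# Demo in a temporary folder so nothing is written to the real project.
with tempfile.TemporaryDirectory() as tmp:
    fake_image = Path(tmp) / "example.png"
    fake_image.write_bytes(b"\x89PNG placeholder bytes")  # stand-in, not a real image
    encoded = encode_image_to_base64(fake_image)
    saved_path = save_features(
        {"proposed_features": []}, output_dir=Path(tmp) / "outputs"
    )
    print(len(encoded), saved_path.name)
```

Base64 turns binary image data into plain text, which is what allows it to be embedded in a JSON request body.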
📂 Project Structure
LLM_feature_gen/
├─ src/
│  └─ LLM_feature_gen/
│     ├─ __init__.py
│     ├─ discover.py               # High-level orchestration for feature discovery
│     ├─ providers/
│     │  ├─ openai_provider.py     # OpenAI API wrapper
│     │  └─ local_provider.py      # Local LLM wrapper
│     ├─ prompts/
│     │  ├─ discovery_prompt.txt   # Default reasoning prompt
│     │  └─ generation_prompt.txt  # Default feature generation prompt
│     ├─ utils/
│     │  └─ image.py               # Image → base64 conversion
│     └─ tests/
│        └─ test_discover.py
├─ outputs/                        # Automatically generated feature JSONs
├─ pyproject.toml
└─ README.md
⚙️ Installation
Clone or download the repository, then install in editable mode:
pip install -e .
🔑 Environment Setup for OpenAI API
Create a .env file in the project root
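This README does not list the variables the library expects, so the sketch below only shows one generic way to read a .env file and check for a key. The variable name `OPENAI_API_KEY` and the helper `read_env_file` are assumptions made for illustration; check the provider code for the names it actually reads.

```python
import os
from pathlib import Path


def read_env_file(path=".env"):
    """Parse simple KEY=VALUE lines; skip blanks, comments, and malformed lines."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


settings = read_env_file()
# OPENAI_API_KEY is an assumed name, not confirmed by this README.
api_key = settings.get("OPENAI_API_KEY") or os.environ.get("OPENAI_API_KEY")
if api_key is None:
    print("No API key found; check your .env file.")
```

Keep the .env file out of version control so credentials are not committed by accident.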
Example: Discover Features from Images
from LLM_feature_gen.discover import discover_features_from_images
# Folder with your example images
image_folder = "discover_images"
# Run feature discovery
result = discover_features_from_images(
image_paths_or_folder=image_folder,
as_set=True, # analyze all images jointly
)
print(result)
This will:
- Read all .jpg/.png images from discover_images/
- Load the default prompt (prompts/image_discovery_prompt.txt)
- Send them to your LLM provider
- Save the results to outputs/discovered_features_.json
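The first step, collecting the image files, might look like the sketch below. `list_images` is a hypothetical helper, and matching extensions case-insensitively is an assumption rather than documented behaviour.

```python
from pathlib import Path

# Extensions named in the README; the lowercase comparison is an assumption.
IMAGE_SUFFIXES = {".jpg", ".png"}


def list_images(folder):
    """Return the .jpg/.png files directly inside `folder`, sorted by path."""
    return sorted(
        path
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    )


folder = Path("discover_images")
if folder.is_dir():  # skip quietly when the example folder is absent
    for image_path in list_images(folder):
        print(image_path.name)
```

Sorting keeps the order of images deterministic, which helps when comparing outputs between runs.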
Example saved JSON:
{
"proposed_features": [
{
"feature": "has visible handle",
"description": "Some objects include handles, others do not.",
"possible_values": ["present", "absent"]
},
{
"feature": "color tone",
"description": "Images vary between metallic and earthy color palettes.",
"possible_values": ["metallic", "matte", "bright", "dark"]
}
]
}
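Before feeding a saved file to downstream code, it can help to confirm it has the shape shown above. The check below is a sketch, not part of the library: `validate_features` is a hypothetical helper, and the required keys are taken from the example JSON rather than from a published schema.

```python
import json

REQUIRED_KEYS = ("feature", "description", "possible_values")


def validate_features(payload):
    """Return the feature list, raising ValueError if the shape is unexpected."""
    features = payload.get("proposed_features")
    if not isinstance(features, list):
        raise ValueError("'proposed_features' must be a list")
    for index, entry in enumerate(features):
        missing = [key for key in REQUIRED_KEYS if key not in entry]
        if missing:
            raise ValueError(f"feature #{index} is missing keys: {missing}")
    return features


raw = """
{
  "proposed_features": [
    {
      "feature": "has visible handle",
      "description": "Some objects include handles, others do not.",
      "possible_values": ["present", "absent"]
    }
  ]
}
"""
for entry in validate_features(json.loads(raw)):
    print(f"{entry['feature']}: {', '.join(entry['possible_values'])}")
```

LLM output is not guaranteed to follow the requested structure, so failing early with a clear message is usually easier to debug than a later KeyError.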
File details
Details for the file llm_feature_gen-0.1.0.tar.gz.
File metadata
- Download URL: llm_feature_gen-0.1.0.tar.gz
- Upload date:
- Size: 14.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c992ed6129716d5719a2b08c86b0e07633f52afcf6bbd2a55258c9d28dba0f48 |
| MD5 | f8df7f8db9cee1984a9b62365a9defb7 |
| BLAKE2b-256 | bd974728bbe280c9356158a510b67730d5d6c9653082e5535c140490aa08fd09 |
File details
Details for the file llm_feature_gen-0.1.0-py3-none-any.whl.
File metadata
- Download URL: llm_feature_gen-0.1.0-py3-none-any.whl
- Upload date:
- Size: 16.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b50b897519d83e4117c93e995f2dd2e564a4ab294c02e0092ad9f48b49261f7b |
| MD5 | 296ab2da6ad1f6699b1910c0a6debeeb |
| BLAKE2b-256 | a360143eeb80a56310b7bb29e45a8638660bc777fa9c0c51f999bd8135f1f3a1 |