A package for discourse-level scene graph parsing and evaluation.
DualTaskSceneGraphParser Usage Guide
Installation
pip install discosg
Quick Start
Basic Import
from discosg import DualTaskSceneGraphParser
Initialize Model
# Create parser instance
model = DualTaskSceneGraphParser(
    model_path="sqlinn/DiscoSG-Refiner-Large-t5-only",  # Model path
    device="cuda",  # Device: "cuda" or "cpu"
    lemmatize=False,  # Whether to lemmatize text
    lowercase=True  # Whether to convert to lowercase
)
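When the same script has to run on machines with and without a GPU, the `device` value can be chosen at runtime instead of being hard-coded. The following is a minimal sketch, not part of discosg: `pick_device` is a hypothetical helper, and it assumes the model runs on PyTorch, so it asks `torch.cuda.is_available()` and falls back to "cpu" when PyTorch is not installed or no GPU is visible.

```python
import importlib.util


def pick_device(prefer="cuda"):
    """Return "cuda" only when PyTorch is installed and reports a usable GPU."""
    if prefer != "cuda":
        return "cpu"
    # Hypothetical helper: discosg itself is not queried here.
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(device)
```

The returned string can then be passed as the `device` argument shown above.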
Prepare Input Data
1. Image Descriptions
descriptions = [
    "The image captures a bustling urban scene, likely in a European city...",
    "In the image, a man is seated at a desk, engrossed in his work on a computer..."
]
2. Scene Graphs to Fix (Optional)
graph_to_fix = {
    "description_text_1": "( subject , predicate , object ) , ( subject2 , predicate2 , object2 ) , ...",
    "description_text_2": "( subject , predicate , object ) , ( subject2 , predicate2 , object2 ) , ..."
}
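The scene graph strings above follow a simple pattern: parenthesized triples, with commas separating both the three fields and the triples themselves. A small sketch of converting between that string form and Python tuples is given below. `parse_graph` and `format_graph` are hypothetical helpers written for this guide, not functions from discosg, and they assume that no subject, predicate, or object contains a comma or a parenthesis.

```python
import re

# Matches the text between one pair of parentheses.
TRIPLE_PATTERN = re.compile(r"\(([^()]*)\)")


def parse_graph(graph_text):
    """Split "( s , p , o ) , ( ... )" into a list of (s, p, o) tuples."""
    triples = []
    for body in TRIPLE_PATTERN.findall(graph_text):
        parts = [part.strip() for part in body.split(",")]
        if len(parts) != 3 or not all(parts):
            raise ValueError(f"malformed triple: ({body})")
        triples.append(tuple(parts))
    return triples


def format_graph(triples):
    """Serialize triples back into the space-padded string form."""
    return " , ".join(f"( {s} , {p} , {o} )" for s, p, o in triples)


text = "( man , sit at , desk ) , ( shirt , is , blue )"
triples = parse_graph(text)
print(triples)
print(format_graph(triples) == text)
```

Working with tuples makes it easier to build or check a graph programmatically before serializing it into a `graph_to_fix` value.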
Execute Parsing
outputs = model.parse(
    descriptions=descriptions,  # List of image descriptions
    graph_to_fix=graph_to_fix,  # Dictionary of scene graphs to fix (optional)
    batch_size=2,  # Batch size for processing
    task="delete_before_insert"  # Task type
)
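To make the `batch_size` argument concrete, the sketch below shows what batching a list of descriptions generally means: the list is cut into consecutive groups of at most `batch_size` items. This is an illustration of the general idea only. `batched` is a hypothetical helper, and how discosg groups its inputs internally is not documented here.

```python
def batched(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


descriptions = ["caption A", "caption B", "caption C"]
for batch in batched(descriptions, batch_size=2):
    print(len(batch), batch)
```

With three descriptions and a batch size of 2, the loop sees one group of two and one group of one.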
Complete Example
from discosg.parser.DualTaskSceneGraphParser import DualTaskSceneGraphParser
def main():
    # Initialize model
    model = DualTaskSceneGraphParser(
        model_path="sqlinn/DiscoSG-Refiner-Large-t5-only",
        device="cuda",
        lemmatize=False,
        lowercase=True
    )
    # Prepare image descriptions
    descriptions = [
        "The image captures a bustling urban scene, likely in a European city. The setting appears to be a pedestrian-friendly square or plaza. There are numerous people of various ages and attire walking around, some carrying bags, suggesting shopping or a day out. A few individuals are seated, possibly enjoying a meal or resting. The square is adorned with a decorative fountain in the center, surrounded by potted plants. Overhead, there are power lines and cables, hinting at an urban environment. The architecture of the surrounding buildings suggests a historic or older part of the city.",
        "In the image, a man is seated at a desk, engrossed in his work on a computer. He's wearing a blue shirt and glasses, and his hand is raised to his forehead in a gesture that suggests deep thought or concentration. The desk, cluttered with various items, houses a computer monitor, keyboard, and mouse. The room around him is dimly lit, creating an atmosphere of focus and seriousness. In the background, a window can be seen, adding depth to the scene. The image captures a moment of intense concentration and productivity."
    ]
    # Prepare scene graphs to fix
    graph_to_fix = {
        descriptions[0]: "( city , is , bustling ) , ( city , is , european ) , ( setting , is , pedestrian-friendly ) , ( setting , is , square ) , ( people , carry , bags ) , ( people , is , walking ) , ( individuals , is , seated ) , ( fountain , in center of , square ) , ( fountain , is , decorative ) , ( plants , is , potted ) , ( plants , surround , fountain ) , ( cables , is , overhead ) , ( power lines , is , overhead ) , ( buildings , surround , city ) , ( city , is , historic ) , ( city , is , older )",
        descriptions[1]: "( man , sit at , desk ) , ( man , work on , computer ) , ( hand , lift to , forehead ) , ( man , have , hand ) , ( man , wear , glasses ) , ( shirt , is , blue ) , ( desk , house , monitor ) , ( desk , house , mouse ) , ( desk , is , cluttered ) , ( monitor , is , computer ) , ( man , in , room ) , ( room , is , dimly lit ) , ( window , in , background ) , ( image , capture , concentration ) , ( image , capture , productivity ) , ( productivity , is , intense )"
    }
    # Execute parsing
    outputs = model.parse(
        descriptions=descriptions,
        graph_to_fix=graph_to_fix,
        batch_size=2,
        task="delete_before_insert"
    )
    # View results
    print("Parsing results:")
    print(outputs)
    print("\nOutput keys:")
    print(outputs.keys())


if __name__ == "__main__":
    main()
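Once a refined graph is available, it can be useful to see which triples were removed and which were added relative to the graph that was passed in. The sketch below compares two graph strings. It is an assumption-based illustration: the structure of `outputs` is not described in this guide, so the code assumes the refined graph has already been extracted as a string in the same "( subject , predicate , object )" format, and the `after` value is written by hand rather than produced by the model. `triple_set` and `diff_graphs` are hypothetical helpers.

```python
import re


def triple_set(graph_text):
    """Parse "( s , p , o ) , ..." into a set of (s, p, o) tuples."""
    triples = set()
    for body in re.findall(r"\(([^()]*)\)", graph_text):
        parts = tuple(part.strip() for part in body.split(","))
        if len(parts) == 3:
            triples.add(parts)
    return triples


def diff_graphs(before, after):
    """Return (deleted, inserted) triples, each sorted for stable output."""
    old, new = triple_set(before), triple_set(after)
    return sorted(old - new), sorted(new - old)


before = "( man , sit at , desk ) , ( monitor , is , computer )"
# Hypothetical refined graph, written by hand for illustration only.
after = "( man , sit at , desk ) , ( desk , house , keyboard )"
deleted, inserted = diff_graphs(before, after)
print("deleted:", deleted)
print("inserted:", inserted)
```

Set difference ignores triple order, so two graphs that list the same triples in a different sequence compare as identical.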
Parameters
DualTaskSceneGraphParser Initialization Parameters
- model_path (str): Path to the pre-trained model
- device (str): Computing device, either "cuda" or "cpu"
- lemmatize (bool): Whether to lemmatize the text
- lowercase (bool): Whether to convert text to lowercase
parse Method Parameters
- descriptions (List[str], required): List of image description texts
- graph_to_fix (Dict[str, str], required): Scene graphs to fix, where keys are description texts and values are scene graph strings
- batch_size (int): Batch size for processing
- task (str): Task type, e.g., "insert_delete", "delete_before_insert", "insert", "delete"
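Because `graph_to_fix` is keyed by the description text itself, a typo in a key or in the task name is easy to miss. The following sketch checks the arguments before they are handed to `parse`. `check_parse_args` is a hypothetical helper, not part of discosg, and the set of task names is an assumption taken from the four examples listed above, which may not be exhaustive.

```python
# Assumption: only the four task names listed in this guide are accepted.
KNOWN_TASKS = {"insert_delete", "delete_before_insert", "insert", "delete"}


def check_parse_args(descriptions, graph_to_fix, task):
    """Validate inputs and return the descriptions that have no graph."""
    if task not in KNOWN_TASKS:
        raise ValueError(f"unknown task: {task!r}")
    unknown_keys = [key for key in graph_to_fix if key not in descriptions]
    if unknown_keys:
        raise ValueError(f"{len(unknown_keys)} graph key(s) match no description")
    # Descriptions without a graph are reported, not rejected.
    return [text for text in descriptions if text not in graph_to_fix]


descriptions = ["a man at a desk", "a busy square"]
graph_to_fix = {"a man at a desk": "( man , sit at , desk )"}
missing = check_parse_args(descriptions, graph_to_fix, "delete_before_insert")
print(missing)
```

Here the second description has no graph, so it is returned in `missing` for the caller to decide how to handle.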