Utility-Lens
dev-version 0.1.0c
Is your language model a rational decision maker? Does it consistently prefer $100 over $50? Will it still prefer a sandwich over an apple when a banana is available? Can we extract meaningful utility values from its choices?
Utility-Lens helps answer these questions by testing language models against the classical axioms of rational choice theory and extracting underlying utility functions from their revealed preferences.
Core Analyses
- Transitivity: Does your model make logically consistent choices?
- Independence of Irrelevant Alternatives (IIA): Does your model maintain its preferences when new alternatives are introduced?
- Utility Estimation: What cardinal utilities best explain your model's choices?
Installation
pip install utility-lens
Demos
A. Testing Transitivity
Check if your model's preferences form a rational ordering:
Code
from utility_lens import OpenAIModel, TransitivityAnalyzer
# Required: OpenAI API key
openai_api_key = ''
# Required: List of items to compare
animals = [ "elephant", "human", "chimpanzee", "komodo", ...]
####################
# Initialize model #
####################
model = OpenAIModel(
    # Required parameters:
    model_name="gpt-3.5-turbo-0125",  # Required: Name of model to use
    api_key=openai_api_key,           # Required: OpenAI API key
    # Optional parameters (with defaults):
    base_url=None,          # Optional (default=None): Base URL for API
    max_tokens=10,          # Optional (default=10): Max tokens in response
    concurrency_limit=100,  # Optional (default=50): Max concurrent calls
)
#######################
# Initialize analyzer #
#######################
analyzer = TransitivityAnalyzer(
    # Required parameters:
    model=model,               # Required: Model instance to use
    items=animals,             # Required: List of items to compare
    # Optional parameters (with defaults):
    n_trial=10,                # Optional (default=10): API calls per pair
                               # (how many times to ask the same question)
    n_triad=200,               # Optional (default=200): Number of triads
                               # Use -1 for all possible triads
    seed=42,                   # Optional (default=42): Random seed
    save_directory="results",  # Optional (default=None): Dir to save results
                               # None means don't save results
)
##############################
# Run transitivity analysis  #
##############################
results = analyzer.run(
    use_async=True  # Optional (default=True): Processing mode
                    # True  = concurrent processing (faster, needs async support)
                    # False = sequential processing (works everywhere)
)
##################################
# Print key transitivity metrics #
##################################
print("\nTransitivity Analysis Results:")
print(f"Overall transitivity score: {results['transitivity_score']:.3f}")
print(f"Weak stochastic transitivity: {results['weak_stochastic_transitivity_satisfied']}")
print(f"Strong stochastic transitivity: {results['strong_stochastic_transitivity_satisfied']}")
# Print top cycles (if any)
print("\nTop preference cycles found:")
for cycle in results['possible_cycles'][:3]:  # Show top 3 cycles
    print(f"\nProbability: {cycle['probability']:.3f}")
    print(f"Path: {cycle['cycle_path']}")
    print(f"Items involved: {cycle['triad']}")
############################################
# Results dictionary structure explanation #
############################################
# Results structure:
# {
# 'transitivity_score': float, # Overall transitivity (0-1)
# # 1 = perfectly transitive
# # 0 = completely cyclic
#
# 'weak_stochastic_transitivity_satisfied': str, # Format: "X/Y"
# # X = number of triads satisfying WST
# # Y = total triads tested
#
# 'strong_stochastic_transitivity_satisfied': str, # Format: "X/Y"
# # X = number of triads satisfying SST
# # Y = total triads tested
#
# 'possible_cycles': List[Dict], # List of detected preference cycles
# # Sorted by probability (highest first)
# # Each dict contains:
# # - 'probability': float
# # - 'cycle_path': str description
# # - 'triad': List[str] items involved
#
# 'triad_results': List[Dict], # Detailed results for each triad
# # Including preference strengths and
# # transitivity violations
#
# 'raw_data': List[Dict] # Raw comparison data from model
# }
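For reference: weak stochastic transitivity (WST) requires that if P(a > b) ≥ 0.5 and P(b > c) ≥ 0.5, then P(a > c) ≥ 0.5, while strong stochastic transitivity (SST) strengthens this to P(a > c) ≥ max(P(a > b), P(b > c)). The helpers below are an illustrative sketch of that check for one ordered triad; they are not part of the library:

def satisfies_wst(p_ab, p_bc, p_ac):
    # WST for the ordering (a, b, c): if a beats b and b beats c by
    # majority, then a must beat c by majority
    if p_ab >= 0.5 and p_bc >= 0.5:
        return p_ac >= 0.5
    return True  # premise not met for this ordering

def satisfies_sst(p_ab, p_bc, p_ac):
    # SST for the ordering (a, b, c): the indirect preference must be at
    # least as strong as the stronger of the two direct preferences
    if p_ab >= 0.5 and p_bc >= 0.5:
        return p_ac >= max(p_ab, p_bc)
    return True

# Example: P(a>b)=0.7, P(b>c)=0.6, P(a>c)=0.65 satisfies WST but violates SST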
B. Testing IIA
Check whether preferences remain stable when new options are introduced:
Code
from utility_lens import OpenAIModel, IIAAnalyzer
# Required: OpenAI API key
openai_api_key = ''
# Required: List of items to compare
animals = [ "elephant", "human", "chimpanzee", "komodo", ...]
####################
# Initialize model #
####################
model = OpenAIModel(
    # Required parameters:
    model_name="gpt-3.5-turbo-0125",  # Required: Name of model to use
    api_key=openai_api_key,           # Required: OpenAI API key
    # Optional parameters (with defaults):
    base_url=None,          # Optional (default=None): Base URL for API
    max_tokens=10,          # Optional (default=10): Max tokens in response
    concurrency_limit=100,  # Optional (default=50): Max concurrent calls
)
#######################
# Initialize analyzer #
#######################
analyzer = IIAAnalyzer(
    # Required parameters:
    model=model,               # Required: Model instance to use
    items=animals,             # Required: List of items to compare
    # Optional parameters (with defaults):
    n_trial=10,                # Optional (default=10): API calls per pair
                               # (how many times to ask the same question)
    n_pairs=-1,                # Optional (default=200): Number of pairs to test
                               # Use -1 for all possible pairs
    seed=42,                   # Optional (default=42): Random seed
    threshold=0.1,             # Optional (default=0.1): IIA violation threshold
    save_directory="results",  # Optional (default=None): Dir to save results
                               # None means don't save results
)
####################
# Run IIA analysis #
####################
results = analyzer.run(
    use_async=True  # Optional (default=True): Use async processing
)
#####################
# Print IIA metrics #
#####################
print("\nIIA Analysis Results:")
print(f"Overall IIA score: {results['iia_score']:.3f}")
############################################
# Results dictionary structure explanation #
############################################
# Results structure:
# {
# 'iia_score': float, # Overall IIA satisfaction score (0-1)
# # 1 = perfect IIA satisfaction
# # 0 = complete IIA violation
#
# 'stable_preferences': int, # Number of pairs with stable preferences
# 'total_pairs': int, # Total pairs tested
#
# 'violations': List[Dict], # List of IIA violations
# # Sorted by magnitude (largest first)
# # Each dict contains:
# # - 'base_pair': (str, str)
# # - 'original_preference': float
# # - 'context_item': str
# # - 'new_preference': float
# # - 'shift': float
#
# 'pair_results': List[Dict],# Detailed results for each pair
# # Including preference strengths
#
# 'raw_data': List[Dict] # Raw comparison data from model
# }
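Assuming the 'violations' structure documented above, the largest preference shifts can be inspected directly. This loop is illustrative, not part of the library API:

# Show the three largest IIA violations (sorted largest first)
for v in results['violations'][:3]:
    a, b = v['base_pair']
    print(f"{a} vs {b}: {v['original_preference']:.2f} -> "
          f"{v['new_preference']:.2f} with '{v['context_item']}' present "
          f"(shift: {v['shift']:.2f})")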
C. Extracting Utilities
Compute the underlying utilities that best explain the observed choices:
Code - Using Bradley-Terry
from utility_lens import OpenAIModel, UtilityAnalyzer
import numpy as np
# Required: OpenAI API key
openai_api_key = ''
# Required: List of items to compare
animals = [ "elephant", "human", "chimpanzee", "komodo", ...]
####################
# Initialize model #
####################
model = OpenAIModel(
    # Required parameters:
    model_name="gpt-3.5-turbo-0125",  # Required: Name of model to use
    api_key=openai_api_key,           # Required: OpenAI API key
    # Optional parameters (with defaults):
    base_url=None,          # Optional (default=None): Base URL for API
    max_tokens=10,          # Optional (default=10): Max tokens in response
    concurrency_limit=100,  # Optional (default=50): Max concurrent calls
                            # Only relevant when using async processing
)
#######################
# Initialize analyzer #
#######################
analyzer = UtilityAnalyzer(
    # Required parameters:
    model=model,               # Required: Model instance to use
    items=animals,             # Required: List of items to compare
    # Optional parameters (with defaults):
    n_trial=10,                # Optional (default=10): Samples per pair
    n_pairs=-1,                # Optional (default=200): Number of pairs to test
                               # Use -1 (or None) for all possible pairs
    seed=42,                   # Optional (default=42): Random seed
    save_directory="results",  # Optional (default=None): Dir to save results
                               # None means don't save results
)
##############################
# Run Bradley-Terry analysis #
##############################
bt_results = analyzer.run(
    # All parameters are optional, with defaults shown:
    method="bradley-terry",  # Optional (default="bradley-terry"): Model type to use
    use_soft_labels=True,    # Optional (default=True): Use ratios vs binary
                             # True  = actual ratios (e.g., 7:3)
                             # False = binary preferences (e.g., 1 or 0)
    num_epochs=1000,         # Optional (default=1000): Number of training epochs
    learning_rate=0.01,      # Optional (default=0.01): Learning rate for optimization
    use_async=True           # Optional (default=True): Processing mode
                             # True  = concurrent (faster, needs async support)
                             # False = sequential (works everywhere)
)
# Print Bradley-Terry rankings
print("\nBradley-Terry Rankings:")
print(f"Model accuracy: {bt_results['accuracy']:.3f}")
print("\nUtility Rankings:")
for item, utility in bt_results['rankings']:
    print(f"{item}: {utility:.3f}")
############################################
# Results dictionary structure explanation #
############################################
# Bradley-Terry results structure:
# {
# 'utilities': Dict, # Maps item index to utility value
# # Example: {0: 1.2, 1: 0.8, 2: -0.5}
# 'rankings': List, # Sorted (item, utility) pairs
# # Example: [("elephant", 1.2), ("human", 0.8)]
# 'accuracy': float, # Model prediction accuracy (0-1)
# 'log_loss': float, # Model log loss
# 'raw_data': Dict # Raw preference data collected from model
# }
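Under the Bradley-Terry model, each item i has a scalar utility u_i, and the predicted probability that i is chosen over j is exp(u_i) / (exp(u_i) + exp(u_j)), i.e. a logistic function of the utility difference. A minimal sketch of that mapping against the 'utilities' dict documented above (the helper is illustrative, not part of the library):

import numpy as np

def bt_choice_prob(utilities, i, j):
    # P(i preferred over j) under Bradley-Terry, in logistic form
    return 1.0 / (1.0 + np.exp(-(utilities[i] - utilities[j])))

# With the example utilities {0: 1.2, 1: 0.8} from above:
print(bt_choice_prob({0: 1.2, 1: 0.8}, 0, 1))  # ~0.599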
Code - Using Thurstonian
from utility_lens import OpenAIModel, UtilityAnalyzer
import numpy as np
# Required: OpenAI API key
openai_api_key = ''
# Required: List of items to compare
animals = [ "elephant", "human", "chimpanzee", "komodo", ...]
####################
# Initialize model #
####################
model = OpenAIModel(
    # Required parameters:
    model_name="gpt-3.5-turbo-0125",  # Required: Name of model to use
    api_key=openai_api_key,           # Required: OpenAI API key
    # Optional parameters (with defaults):
    base_url=None,          # Optional (default=None): Base URL for API
    max_tokens=10,          # Optional (default=10): Max tokens in response
    concurrency_limit=100,  # Optional (default=50): Max concurrent calls
                            # Only relevant when using async processing
)
#######################
# Initialize analyzer #
#######################
analyzer = UtilityAnalyzer(
    # Required parameters:
    model=model,               # Required: Model instance to use
    items=animals,             # Required: List of items to compare
    # Optional parameters (with defaults):
    n_trial=10,                # Optional (default=10): Samples per pair
    n_pairs=-1,                # Optional (default=200): Number of pairs to test
                               # Use -1 for all possible pairs
    seed=42,                   # Optional (default=42): Random seed
    save_directory="results",  # Optional (default=None): Dir to save results
                               # None means don't save results
)
############################
# Run Thurstonian analysis #
############################
thurst_results = analyzer.run(
    # All parameters are optional, with defaults shown:
    method="thurstonian",    # Optional (default="bradley-terry"): Model type to use
    use_soft_labels=True,    # Optional (default=True): Use ratios vs binary
                             # True  = actual ratios (e.g., 7:3)
                             # False = binary preferences (e.g., 1 or 0)
    num_epochs=1000,         # Optional (default=1000): Number of training epochs
    learning_rate=0.01,      # Optional (default=0.01): Learning rate for optimization
    use_async=True           # Optional (default=True): Processing mode
                             # True  = concurrent (faster, needs async support)
                             # False = sequential (works everywhere)
)
# Print Thurstonian rankings with uncertainty
print("\nThurstonian Rankings:")
print(f"Model accuracy: {thurst_results['accuracy']:.3f}")
print("\nUtility Rankings (mean ± std):")
for item, stats in thurst_results['rankings']:
    mean = stats['mean']
    std = np.sqrt(stats['variance'])
    print(f"{item}: {mean:.3f} ± {std:.3f}")
############################################
# Results dictionary structure explanation #
############################################
# Thurstonian results structure:
# {
# 'utilities': Dict, # Maps item index to mean and variance
# # Example: {0: {'mean': 1.2, 'variance': 0.1}}
# 'rankings': List, # Sorted by mean utility
# # Example: [("..", {'mean': 1, 'variance': 1})]
# 'accuracy': float, # Model prediction accuracy (0-1)
# 'log_loss': float, # Model log loss
# 'raw_data': Dict # Raw preference data collected from model
# }
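In the Thurstonian model each item's utility is Gaussian with a mean and variance; assuming independent utilities, the predicted probability that i is preferred over j is Phi((mu_i - mu_j) / sqrt(var_i + var_j)), with Phi the standard normal CDF. A minimal illustrative sketch against the 'utilities' structure documented above (not part of the library):

from math import erf, sqrt

def thurstonian_choice_prob(utilities, i, j):
    # P(i preferred over j), assuming independent Gaussian utilities
    diff = utilities[i]['mean'] - utilities[j]['mean']
    scale = sqrt(utilities[i]['variance'] + utilities[j]['variance'])
    return 0.5 * (1.0 + erf(diff / scale / sqrt(2.0)))

# With {'mean': 1.2, 'variance': 0.1} for item 0 and
# {'mean': 0.8, 'variance': 0.1} for item 1:
example = {0: {'mean': 1.2, 'variance': 0.1}, 1: {'mean': 0.8, 'variance': 0.1}}
print(thurstonian_choice_prob(example, 0, 1))  # ~0.814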
Download files
File details
Details for the file utility_lens-0.1.3.tar.gz.
File metadata
- Download URL: utility_lens-0.1.3.tar.gz
- Upload date:
- Size: 21.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.8.3 CPython/3.13.0 Darwin/23.3.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3a3cb47030a393e584b56925edb4e4429fa362208f07cb468a4b112eee5d2356 |
| MD5 | 7927b163fa0412c291edb2315a585d2d |
| BLAKE2b-256 | 752397cd9e8de9262f85126e0ce0ecf598d6129731d553690c7c65117cd302ab |
File details
Details for the file utility_lens-0.1.3-py3-none-any.whl.
File metadata
- Download URL: utility_lens-0.1.3-py3-none-any.whl
- Upload date:
- Size: 23.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.8.3 CPython/3.13.0 Darwin/23.3.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 32723280251c71a35706ec02c6d96177c841e4883feb25293ad2062231a738b3 |
| MD5 | 34053a4f0d6ed369694bc35a64d943cd |
| BLAKE2b-256 | 930f4c7339d26239294ec4d298ec432125783878463f041e5cb1485560f8882f |