Inspeq AI Python SDK
- Website: Inspeq.ai
- Inspeq App: Inspeq App
- Detailed Documentation: Inspeq Documentation
Quickstart Guide
Installation
Install the Inspeq SDK and python-dotenv using pip:

```bash
pip install inspeqai python-dotenv
```

The python-dotenv package is recommended for securely managing your environment variables, such as API keys.
Obtain SDK API Key and Project Key
Get your API key and project key from the Inspeq App.
Usage
Here's a basic example of how to use the Inspeq SDK with environment variables:
```python
import os

from dotenv import load_dotenv
from inspeq.client import InspeqEval

# Load environment variables
load_dotenv()

# Initialize the client
INSPEQ_API_KEY = os.getenv("INSPEQ_API_KEY")
INSPEQ_PROJECT_ID = os.getenv("INSPEQ_PROJECT_ID")
INSPEQ_API_URL = os.getenv("INSPEQ_API_URL")  # Required only for our on-prem customers

inspeq_eval = InspeqEval(inspeq_api_key=INSPEQ_API_KEY, inspeq_project_id=INSPEQ_PROJECT_ID)

# Prepare input data
input_data = [{
    "prompt": "What is the capital of France?",
    "response": "Paris is the capital of France.",
    "context": "The user is asking about European capitals."
}]

# Define metrics to evaluate
metrics_list = ["RESPONSE_TONE", "FACTUAL_CONSISTENCY", "ANSWER_RELEVANCE"]

try:
    results = inspeq_eval.evaluate_llm_task(
        metrics_list=metrics_list,
        input_data=input_data,
        task_name="capital_question"
    )
    print(results)
except Exception as e:
    print(f"An error occurred: {str(e)}")
```
Make sure to create a .env file in your project root with your Inspeq credentials:

```
INSPEQ_API_KEY=your_inspeq_sdk_key
INSPEQ_PROJECT_ID=your_project_id
INSPEQ_API_URL=your_inspeq_backend_url
```

INSPEQ_API_URL is required only for on-prem deployments.
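A small sketch for failing fast when a credential is missing, before constructing the client (variable names match the .env file above):

```python
import os

from dotenv import load_dotenv
from inspeq.client import InspeqEval

load_dotenv()

# INSPEQ_API_URL is deliberately omitted here: it is only needed on-prem.
required = ["INSPEQ_API_KEY", "INSPEQ_PROJECT_ID"]
missing = [name for name in required if not os.getenv(name)]
if missing:
    raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")

inspeq_eval = InspeqEval(
    inspeq_api_key=os.getenv("INSPEQ_API_KEY"),
    inspeq_project_id=os.getenv("INSPEQ_PROJECT_ID"),
)
```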
Available Metrics

```python
metrics_list = [
    "RESPONSE_TONE",
    "ANSWER_RELEVANCE",
    "FACTUAL_CONSISTENCY",
    "CONCEPTUAL_SIMILARITY",
    "READABILITY",
    "COHERENCE",
    "CLARITY",
    "DIVERSITY",
    "CREATIVITY",
    "NARRATIVE_CONTINUITY",
    "GRAMMATICAL_CORRECTNESS",
    "DATA_LEAKAGE",
    "COMPRESSION_SCORE",
    "FUZZY_SCORE",
    "ROUGE_SCORE",
    "BLEU_SCORE",
    "METEOR_SCORE",
    "COSINE_SIMILARITY_SCORE",
    "INSECURE_OUTPUT",
    "INVISIBLE_TEXT",
    "TOXICITY",
    "PROMPT_INJECTION"
]
```
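Metric names must match these strings exactly. A small hypothetical helper (not part of the SDK) to catch typos before making a request:

```python
AVAILABLE_METRICS = {
    "RESPONSE_TONE", "ANSWER_RELEVANCE", "FACTUAL_CONSISTENCY",
    "CONCEPTUAL_SIMILARITY", "READABILITY", "COHERENCE", "CLARITY",
    "DIVERSITY", "CREATIVITY", "NARRATIVE_CONTINUITY",
    "GRAMMATICAL_CORRECTNESS", "DATA_LEAKAGE", "COMPRESSION_SCORE",
    "FUZZY_SCORE", "ROUGE_SCORE", "BLEU_SCORE", "METEOR_SCORE",
    "COSINE_SIMILARITY_SCORE", "INSECURE_OUTPUT", "INVISIBLE_TEXT",
    "TOXICITY", "PROMPT_INJECTION",
}

def validate_metrics(metrics_list):
    """Raise early if any requested metric name is not recognized."""
    unknown = set(metrics_list) - AVAILABLE_METRICS
    if unknown:
        raise ValueError(f"Unknown metrics: {sorted(unknown)}")

validate_metrics(["RESPONSE_TONE", "FACTUAL_CONSISTENCY"])  # passes silently
```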
Features
The Inspeq SDK provides a range of metrics to evaluate language model outputs:
Response Tone
Assesses the tone and style of the generated response.
Answer Relevance
Measures the degree to which the generated content directly addresses and pertains to the specific question or prompt provided by the user.
Factual Consistency
Measures the extent to which the model hallucinates, i.e., whether the response is fabricated by the model or grounded in the supplied context.
Conceptual Similarity
Measures the extent to which the model response aligns with and reflects the underlying ideas or concepts present in the provided context or prompt.
Readability
Assesses whether the model response can be read and understood by the intended audience, taking into account factors such as vocabulary complexity, sentence structure, and overall clarity.
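The SDK computes this server-side; for intuition, a classic readability formula can be computed locally with the third-party textstat package (not an Inspeq dependency, and not necessarily the formula Inspeq uses):

```python
import textstat  # pip install textstat

text = "Paris is the capital of France. It is known for the Eiffel Tower."

# Flesch Reading Ease: higher scores indicate easier-to-read text.
print(textstat.flesch_reading_ease(text))
```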
Coherence
Evaluates how well the model generates coherent and logical responses that align with the context of the question.
Clarity
Assesses the clarity of the response's language and structure, considering grammar, readability, conciseness of sentences and words, and low redundancy.
Diversity
Assesses the diversity of vocabulary used in a piece of text.
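A common way to quantify lexical diversity is the type-token ratio (unique words divided by total words); a minimal illustration, not necessarily the measure Inspeq uses:

```python
def type_token_ratio(text: str) -> float:
    """Unique words divided by total words; 1.0 means no word repeats."""
    words = text.lower().split()
    return len(set(words)) / len(words) if words else 0.0

print(type_token_ratio("the cat sat on the mat"))  # 5 unique / 6 total ≈ 0.83
```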
Creativity
Assesses the ability of the model to generate imaginative and novel responses that extend beyond standard or expected answers.
Narrative Continuity
Measures the consistency and logical flow of the response throughout the generated text, ensuring that the progression of events remains coherent and connected.
Grammatical Correctness
Checks whether the model response adheres to the rules of syntax, is free from errors, and follows the conventions of the target language.
Prompt Injection
Evaluates the susceptibility of language models or AI systems to adversarial prompts that manipulate or alter the system's intended behavior.
Data Leakage
Measures the extent to which sensitive or unintended information is exposed during model training or inference.
Insecure Output
Detects whether the response contains insecure or dangerous code patterns that could lead to potential security vulnerabilities.
Invisible Text
Evaluates if the input contains invisible or non-printable characters that might be used maliciously to hide information or manipulate the model's behavior.
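For intuition, invisible characters such as zero-width spaces fall into the Unicode "Cf" (format) category; a simple local check, not the SDK's detection logic:

```python
import unicodedata

def find_invisible_chars(text: str):
    """Return (index, codepoint) pairs for format-category (Cf) characters."""
    return [(i, f"U+{ord(ch):04X}") for i, ch in enumerate(text)
            if unicodedata.category(ch) == "Cf"]

print(find_invisible_chars("hello\u200bworld"))  # [(5, 'U+200B')] – zero-width space
```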
Toxicity
Evaluates the level of harmful or toxic language present in a given text.
BLEU Score
Measures the quality of text generated by models by comparing it to one or more reference texts.
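The SDK returns this score directly; for reference, BLEU can also be computed locally with NLTK (not an Inspeq dependency):

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction  # pip install nltk

reference = ["paris", "is", "the", "capital", "of", "france"]
candidate = ["paris", "is", "france's", "capital"]

# Smoothing avoids zero scores when higher-order n-grams have no matches.
score = sentence_bleu([reference], candidate,
                      smoothing_function=SmoothingFunction().method1)
print(score)
```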
Compression Score
Measures the ratio of the length of the generated summary to the length of the original text.
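As a worked example of the ratio (assuming word-level lengths; the SDK may count differently):

```python
original = "Paris is the capital of France and its largest city by population."
summary = "Paris is France's capital."

# 4 summary words / 12 original words = 0.333...
compression = len(summary.split()) / len(original.split())
print(compression)
```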
Cosine Similarity Score
Measures the similarity between the original text and the generated summary by treating both as vectors in a multi-dimensional space.
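A minimal bag-of-words illustration of treating texts as vectors (production systems typically use embeddings; this is not necessarily Inspeq's representation):

```python
import math
from collections import Counter

def cosine_similarity(a: str, b: str) -> float:
    """Cosine of the angle between the word-count vectors of two texts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * \
           math.sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0

print(cosine_similarity("paris is the capital", "paris is a large capital"))
```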
Fuzzy Score
Measures the similarity between two pieces of text based on approximate matching rather than exact matching.
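For intuition, Python's standard library offers an approximate-matching ratio via difflib (the SDK's exact algorithm may differ):

```python
from difflib import SequenceMatcher

a = "The capital of France is Paris."
b = "France's capital is Paris."

# ratio() returns a similarity in [0, 1] based on longest matching blocks.
print(SequenceMatcher(None, a, b).ratio())
```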
METEOR Score
Evaluates the quality of generated summaries by comparing them to reference summaries, considering matches at the level of unigrams and accounting for synonyms and stemming.
ROUGE Score
A set of metrics used to evaluate the quality of generated summaries by comparing them to one or more reference summaries.
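Locally, ROUGE is commonly computed with Google's rouge-score package (not an Inspeq dependency; shown only for intuition):

```python
from rouge_score import rouge_scorer  # pip install rouge-score

scorer = rouge_scorer.RougeScorer(["rouge1", "rougeL"], use_stemmer=True)
scores = scorer.score(
    "Paris is the capital of France.",   # reference summary
    "France's capital city is Paris.",   # generated summary
)
print(scores["rouge1"].fmeasure, scores["rougeL"].fmeasure)
```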
Advanced Usage
Custom Configurations
You can provide custom configurations for metrics:
```python
metrics_config = {
    "response_tone_config": {
        "threshold": 0.5,
        "custom_labels": ["Negative", "Neutral", "Positive"],
        "label_thresholds": [0, 0.5, 0.7, 1]
    }
}

results = inspeq_eval.evaluate_llm_task(
    metrics_list=["RESPONSE_TONE"],
    input_data=input_data,
    task_name="custom_config_task",
    metrics_config=metrics_config
)
```
Error Handling
The SDK uses custom exceptions for different types of errors, as sketched below:
- APIError: for API-related issues
- ConfigError: for invalid configuration
- InputError: for invalid input data
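A minimal handling sketch, assuming these exceptions are importable from inspeq.client (check the package for the exact import path):

```python
from inspeq.client import InspeqEval, APIError, ConfigError, InputError  # import path assumed

inspeq_eval = InspeqEval(inspeq_api_key=INSPEQ_API_KEY, inspeq_project_id=INSPEQ_PROJECT_ID)

try:
    results = inspeq_eval.evaluate_llm_task(
        metrics_list=["RESPONSE_TONE"],
        input_data=input_data,
        task_name="error_handling_demo",
    )
except InputError as e:
    # Malformed input_data (e.g. a missing "prompt" or "response" key)
    print(f"Invalid input: {e}")
except ConfigError as e:
    # Bad metrics_config (e.g. thresholds outside the expected range)
    print(f"Invalid configuration: {e}")
except APIError as e:
    # Network failures, authentication problems, server-side errors
    print(f"API failure: {e}")
```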
Additional Resources
For detailed API documentation, visit Inspeq Documentation. For support or questions, contact our support team through the Inspeq App.
License
This SDK is distributed under the terms of the Apache License 2.0.