Python package for ESCARGOT that enables solving elaborate problems with LLMs and Knowledge Graphs

ESCARGOT

Overview

LLMs like GPT-4, despite their advancements, often produce hallucinations and struggle with integrating external knowledge effectively. While Retrieval-Augmented Generation (RAG) attempts to address this by incorporating external information, it faces significant challenges such as context length limitations and imprecise vector similarity search. ESCARGOT aims to overcome these issues by combining LLMs with a dynamic Graph of Thoughts and knowledge graphs, improving output reliability and reducing hallucinations. ESCARGOT significantly outperforms industry-standard RAG methods, particularly in open-ended questions that demand high precision. ESCARGOT also offers greater transparency in its reasoning process, allowing for the vetting of both code and knowledge requests, in contrast to the black-box nature of LLM-only or RAG-based approaches.

Results

| Dataset                               | GPT-3.5 Turbo | Standard RAG | ESCARGOT |
|---------------------------------------|---------------|--------------|----------|
| Open-ended 1-hop (508 questions)      | 3.3%          | 50.2%        | 81.0%    |
| Open-ended 2-hop (450 questions)      | 3.5%          | 12.8%        | 91.8%    |
| True/False 1-hop (560 questions)      | 55.9%         | 73.0%        | 80.7%    |
| True/False 2-hop (540 questions)      | 26.7%         | 64.4%        | 77.6%    |
| Multiple Choice 1-hop (498 questions) | 42.6%         | 77.7%        | 94.6%    |
| Multiple Choice 2-hop (419 questions) | 49.9%         | 81.9%        | 94.2%    |

Key Features

  1. Dynamic GoT Generation: ESCARGOT dynamically generates a Python-executable Graph of Thoughts (GoT) that integrates with knowledge graphs. This dynamic approach ensures improved accuracy and contextual relevance compared to static GoT frameworks.

  2. Strategic Planning and Execution: ESCARGOT's workflow includes strategy generation, assessment, code generation, XML conversion, and execution. This multi-step process ensures that each strategy is thoroughly evaluated and executed efficiently.

  3. Advanced Knowledge Retrieval:

    • Cypher Queries: ESCARGOT utilizes Cypher queries to extract structured and precise information from knowledge graphs like Memgraph, offering superior accuracy in data retrieval.
    • Vector Database Requests: As a backup, ESCARGOT leverages vector databases for similarity searches, enhancing the system’s ability to handle diverse queries even if Cypher queries fail.

  4. Direct Python Execution: The system supports direct Python execution for tasks requiring high precision. By converting knowledge into executable Python code, ESCARGOT ensures reliable and efficient computation, reducing the risk of errors and hallucinations.

  5. Self-Debugging and Adaptability: ESCARGOT includes built-in self-debugging capabilities. It can autonomously analyze and revise code if errors occur during execution, ensuring resilience and reducing the need for manual intervention.

  6. Enhanced Accuracy and Reduced Hallucinations: By integrating structured knowledge retrieval and direct code execution, ESCARGOT minimizes the risk of hallucinations and improves the overall accuracy of reasoning and computation.
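
Features 3 and 5 both describe a try-then-recover pattern: query the graph first and fall back to similarity search, or run generated code and revise it on error. The sketch below is not ESCARGOT's internal code. It is a minimal illustration of the two patterns, and the callables `run_cypher`, `vector_search`, and `revise_code` are hypothetical placeholders supplied by the caller.

```python
from typing import Callable, List


def retrieve_with_fallback(
    question: str,
    run_cypher: Callable[[str], List[str]],     # hypothetical: structured graph query
    vector_search: Callable[[str], List[str]],  # hypothetical: similarity search
) -> List[str]:
    """Try the structured query first; use similarity search if it fails or is empty."""
    try:
        rows = run_cypher(question)
    except Exception:
        rows = []
    return rows if rows else vector_search(question)


def run_with_self_debug(
    code: str,
    revise_code: Callable[[str, str], str],  # hypothetical: (code, error text) -> new code
    max_attempts: int = 3,
) -> dict:
    """Execute generated code, handing any error back to a reviser before retrying."""
    for _ in range(max_attempts):
        namespace: dict = {}
        try:
            # Illustration only: never exec untrusted code outside a sandbox.
            exec(code, namespace)
            return namespace
        except Exception as err:
            code = revise_code(code, repr(err))
    raise RuntimeError(f"code still failing after {max_attempts} attempts")
```

In a real system the reviser would typically be another LLM call that receives the failing code and the error message; here it is left as a plain function so the control flow is easy to follow.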


Quick Install Through Pip

pip install escargot

Example Usage

Here's how to configure and use the Escargot library with your chosen models and databases:

Configuration

First, set up your configuration with the necessary parameters:

config = {
    "azuregpt35-16k" : {
        "model_id":"gpt-35-turbo-16k", 
        "prompt_token_cost": 0.001,
        "response_token_cost": 0.002,
        "temperature": 0.7,
        "max_tokens": 2000,
        "stop": None,
        "api_version": "API_VERSION",
        "api_base": "API_BASE_HERE",
        "api_key": "API_KEY_HERE",
        "embedding_id":"text-embedding-ada-002"
    },
    "memgraph" : {
        "host": "MEMGRAPH_URL",
        "port": 7687
    },
    "weaviate" : {
        "api_key": "WEAVIATE_API_KEY",
        "url": "WEAVIATE_URL",
        "db": "WEAVIATE_DB",
        "limit": 200
    }
}
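
If you would rather not hardcode credentials, one option is to build the same dictionary from environment variables. This is a sketch under assumptions: the variable names (`AZURE_OPENAI_API_KEY`, `MEMGRAPH_HOST`, and so on) and the helper `build_config` are illustrative, not names that ESCARGOT itself reads.

```python
import os


def build_config(env=os.environ) -> dict:
    """Assemble the configuration dictionary from environment variables.

    All environment variable names below are hypothetical; rename them to
    match your deployment. A missing required variable raises KeyError.
    """
    return {
        "azuregpt35-16k": {
            "model_id": "gpt-35-turbo-16k",
            "prompt_token_cost": 0.001,
            "response_token_cost": 0.002,
            "temperature": 0.7,
            "max_tokens": 2000,
            "stop": None,
            "api_version": env["AZURE_OPENAI_API_VERSION"],
            "api_base": env["AZURE_OPENAI_API_BASE"],
            "api_key": env["AZURE_OPENAI_API_KEY"],
            "embedding_id": "text-embedding-ada-002",
        },
        "memgraph": {
            "host": env["MEMGRAPH_HOST"],
            # Fall back to the port used in the example above.
            "port": int(env.get("MEMGRAPH_PORT", "7687")),
        },
        "weaviate": {
            "api_key": env["WEAVIATE_API_KEY"],
            "url": env["WEAVIATE_URL"],
            "db": env["WEAVIATE_DB"],
            "limit": 200,
        },
    }
```

Calling `config = build_config()` then yields a dictionary with the same keys as the literal example.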

Initializing Escargot

Initialize the Escargot instance with your configuration. Escargot will automatically connect to the Memgraph database and retrieve all the node types and relationships.

from escargot import Escargot

escargot = Escargot(
    config, model_name="azuregpt35-16k"
)

Ask a question

To query the knowledge graph, call the ask method on your Escargot instance. The optional parameters debug_level and answer_type control the verbosity of logging and the format of the response.

Example of extracting information from the knowledge graph

# Basic question
response = escargot.ask("What is the function of the gene APOE?")

Output:

The gene APOE has several functions, including low-density lipoprotein particle receptor binding, protein homodimerization activity, steroid binding, heparin binding, tau protein binding, amide binding, lipoprotein particle receptor binding, and protein-lipid complex assembly.

# Advanced usage with additional parameters
response = escargot.ask(
    "What is the function of the gene APOE?",
    debug_level=0,          # Logging verbosity, 0 to 3 (default: 0)
    answer_type="natural",  # Response format: "natural" (default) or "array"
)
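
Because a misspelled answer_type or an out-of-range debug_level is easy to miss, a thin wrapper can check the arguments before the call is made. This is a sketch, not part of the library: `ask_checked` is a hypothetical helper, and the allowed values are taken from the comments in the example above.

```python
VALID_ANSWER_TYPES = ("natural", "array")


def ask_checked(client, question: str, debug_level: int = 0, answer_type: str = "natural"):
    """Validate the optional arguments, then forward the call to client.ask()."""
    if debug_level not in range(4):
        raise ValueError("debug_level must be an integer from 0 to 3")
    if answer_type not in VALID_ANSWER_TYPES:
        raise ValueError(f"answer_type must be one of {VALID_ANSWER_TYPES}")
    return client.ask(question, debug_level=debug_level, answer_type=answer_type)
```

Usage would look like `ask_checked(escargot, "What is the function of the gene APOE?", answer_type="array")`.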

Example of extracting an array of genes from the knowledge graph

escargot.ask("List the genes that are associated with the Alzheimer's disease", answer_type="array")

{'genes_list': ['GSK3B',
  'CASP3',
  'CHRNB2',
  'IGF2',
  'IQCK',
  'MS4A4A',
  'IL1B',
  'ACE',
  'VEGFA',
  'WWOX',
  'DPYSL2',
  'MIR4467',
...
  'INSR',
  'ABCA7',
  'SORL1',
  'HLA-DRB5',
  'ACHE']}
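
In the output above, the array answer is a dictionary that maps a label ('genes_list') to a list. The label appears to depend on the question, so downstream code may prefer not to hardcode it. The helper below is a hypothetical sketch that assumes every value in the returned dictionary is a list, and merges them into one de-duplicated list.

```python
def flatten_array_answer(answer: dict) -> list:
    """Merge every list in the answer dict into one de-duplicated list, keeping order."""
    seen = set()
    merged = []
    for values in answer.values():
        for item in values:
            if item not in seen:
                seen.add(item)
                merged.append(item)
    return merged
```

For example, `flatten_array_answer({'genes_list': ['GSK3B', 'CASP3', 'GSK3B']})` returns `['GSK3B', 'CASP3']`.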

Manually Configure Knowledge Graph Schema (bypasses automated database schema extraction)

Retrieving the schema takes time, so it is useful to extract it once and then pass it to Escargot at instantiation. When you supply node_types and relationship_types yourself, Escargot skips the automated schema retrieval described above and uses the configuration you provide. The example also assigns the full schema text to escargot.memgraph_client.schema:

from escargot import Escargot

escargot = Escargot(
    config,
    node_types="BiologicalProcess, BodyPart, CellularComponent, Datatype, Disease, Drug, DrugClass, Gene, MolecularFunction, Pathway, Symptom",
    relationship_types="""CHEMICALBINDSGENE
CHEMICALDECREASESEXPRESSION
CHEMICALINCREASESEXPRESSION
DRUGINCLASS
DRUGCAUSESEFFECT
DRUGTREATSDISEASE
GENEPARTICIPATESINBIOLOGICALPROCESS
GENEINPATHWAY
GENEINTERACTSWITHGENE
GENEHASMOLECULARFUNCTION
GENEASSOCIATEDWITHCELLULARCOMPONENT
GENEASSOCIATESWITHDISEASE
SYMPTOMMANIFESTATIONOFDISEASE
BODYPARTUNDEREXPRESSESGENE
BODYPARTOVEREXPRESSESGENE
DISEASELOCALIZESTOANATOMY
DISEASEASSOCIATESWITHDISEASET""",
    model_name="azuregpt35-16k"
)
escargot.memgraph_client.schema = """Node properties are the following:
Node name: 'BiologicalProcess', Node properties: ['commonName']
Node name: 'BodyPart', Node properties: ['commonName']
Node name: 'CellularComponent', Node properties: ['commonName']
Node name: 'Disease', Node properties: ['commonName']
Node name: 'Drug', Node properties: ['commonName']
Node name: 'DrugClass', Node properties: ['commonName']
Node name: 'Gene', Node properties: ['commonName', 'geneSymbol', 'typeOfGene']
Node name: 'MolecularFunction', Node properties: ['commonName']
Node name: 'Pathway', Node properties: ['commonName']
Node name: 'Symptom', Node properties: ['commonName']
Relationship properties are the following:
The relationships are the following:
(:Drug)-[:CHEMICALBINDSGENE]-(:Gene)
(:Drug)-[:CHEMICALDECREASESEXPRESSION]-(:Gene)
(:Drug)-[:CHEMICALINCREASESEXPRESSION]-(:Gene)
(:Drug)-[:DRUGINCLASS]-(:DrugClass)
(:Drug)-[:DRUGCAUSESEFFECT]-(:Disease)
(:Drug)-[:DRUGTREATSDISEASE]-(:Disease)
(:Gene)-[:GENEPARTICIPATESINBIOLOGICALPROCESS]-(:BiologicalProcess)
(:Gene)-[:GENEINPATHWAY]-(:Pathway)
(:Gene)-[:GENEINTERACTSWITHGENE]-(:Gene)
(:Gene)-[:GENEHASMOLECULARFUNCTION]-(:MolecularFunction)
(:Gene)-[:GENEASSOCIATEDWITHCELLULARCOMPONENT]-(:CellularComponent)
(:Gene)-[:GENEASSOCIATESWITHDISEASE]-(:Disease)
(:Symptom)-[:SYMPTOMMANIFESTATIONOFDISEASE]-(:Disease)
(:BodyPart)-[:BODYPARTUNDEREXPRESSESGENE]-(:Gene)
(:BodyPart)-[:BODYPARTOVEREXPRESSESGENE]-(:Gene)
(:Disease)-[:DISEASELOCALIZESTOANATOMY]-(:BodyPart)
(:Disease)-[:DISEASEASSOCIATESWITHDISEASET]-(:Disease)"""
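
To avoid maintaining the same names in three places, the node_types and relationship_types strings can be derived from the schema text. The sketch below assumes the schema follows the layout shown above and that the two arguments use the formats seen in the example (comma-separated node types, newline-separated relationship types). `types_from_schema` is a hypothetical helper, not part of the library.

```python
import re


def types_from_schema(schema: str) -> tuple:
    """Return (node_types, relationship_types) strings derived from schema text."""
    # Node types come from lines such as: Node name: 'Gene', Node properties: [...]
    nodes = re.findall(r"Node name: '(\w+)'", schema)
    # Relationship types come from patterns such as: (:Drug)-[:DRUGINCLASS]-(:DrugClass)
    rels = []
    for rel in re.findall(r"\[:(\w+)\]", schema):
        if rel not in rels:  # keep first-seen order, drop repeats
            rels.append(rel)
    return ", ".join(nodes), "\n".join(rels)
```

The two returned strings could then be passed as node_types and relationship_types. Types that appear only in node_types and not in the schema text, such as Datatype in the example above, would need to be added by hand.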
